1
Lang BH, Kostick-Quenet K, Smith JN, Hurley M, Dexter R, Blumenthal-Barby J. Should Physicians Take the Rap? Normative Analysis of Clinician Perspectives on Responsible Use of 'Black Box' AI Tools. AJOB Empir Bioeth 2025:1-12. [PMID: 40354225] [DOI: 10.1080/23294515.2025.2497755]
Abstract
BACKGROUND Increasing interest in deploying artificial intelligence tools in clinical contexts has raised several ethical questions of both normative and empirical interest. One such question in the literature is whether "responsibility gaps" (r-gaps) are created when clinicians utilize or rely on such tools for providing care, and if so, what to do about them. These gaps are particularly likely to arise when using opaque, "black box" AI tools. Compared to normative and legal analysis of AI-generated responsibility gaps in health care, little is known, empirically, about health care providers' views on this issue. The present study examines clinician perspectives on this issue in the context of black box AI decisional support systems (BBAI-DSS) in advanced heart failure. METHODS Semi-structured interviews were conducted with 20 clinicians (14 cardiologists and 6 LVAD nurse coordinators). Interviews were transcribed, coded, and thematically analyzed for salient themes. All study procedures were approved by the local IRB. RESULTS All clinicians voiced that, if anyone were responsible for the use and outcomes of black box AI, it would be physicians. We compared clinician perspectives on the existence of r-gaps, and on their impact on responsibility for errors or adverse outcomes when BBAI-DSS tools are used, with a taxonomy from the literature, finding that some clinicians acknowledged an r-gap while others denied its existence or its relevance to medical decision-making. CONCLUSION Clinicians varied in their views about the existence of r-gaps but were united in ascribing responsibility for the use of BBAI-DSS in clinical care to physicians. It was at times unclear whether these were descriptive or normative judgments (i.e., is it merely inevitable that physicians will be held responsible, or is it morally appropriate that they be held responsible?) or both. We discuss the likely normative inadequacy of such a conception of physician responsibility for BBAI tool use.
Affiliation(s)
- Ben H Lang
- Department of Philosophy, Oxford University, Oxford, UK
- Kristin Kostick-Quenet
- Center for Medical Ethics and Health Policy, Baylor College of Medicine, Houston, Texas, USA
- Jared N Smith
- Center for Medical Ethics and Health Policy, Baylor College of Medicine, Houston, Texas, USA
- Meghan Hurley
- Center for Medical Ethics and Health Policy, Baylor College of Medicine, Houston, Texas, USA
- Rita Dexter
- Center for Medical Ethics and Health Policy, Baylor College of Medicine, Houston, Texas, USA
2
Silverman WK, Spencer SD. Considering Historical Context, Current Societal Trends, and Implications for Understanding Harm in Youth Mental Health Treatment: A Broader Lens to the Primum Non Nocere Special Issue. Res Child Adolesc Psychopathol 2025;53:817-830. [PMID: 40304872] [DOI: 10.1007/s10802-025-01324-y]
Abstract
Potential for harm or non-beneficence in psychological treatments remains understudied compared to questions of benefit or efficacy, especially in youth populations. Further study is critical for upholding the ethical mandate to both maximize salutary outcomes and minimize harm/non-beneficence. In the present special issue, authors of target articles incisively delineate parameters of harm and associated clinical strategies for measuring and addressing it, along with recommendations for research advancing understanding of harm in youth mental health treatment. In this commentary, we synthesize key points across the articles and offer future directions for advancing knowledge on treatment harm. First, we provide historical context that includes the origins of the concept of Primum non nocere, its initial linkages with clinical psychology, and controversies relating to the scientific study of psychotherapy. Second, we leverage lessons learned from the evidence-based treatment movement to advance the study of harm. We suggest that research aimed at advancing understanding of harm in youth treatment ought to transpire concurrently with research examining putative benefits of such interventions, including a deliberate focus on mechanisms and moderators of both benefit and harm. Third, we identify several contemporary societal trends relevant to our understanding of harm, including current skepticism of science and the proliferation of pseudoscientific interventions. We conclude by discussing implications for furthering knowledge of potential harms in youth mental health treatment across the domains of theory, research, and dissemination. We hope this special issue and commentary stimulate further thought and research in this heretofore understudied area.
Affiliation(s)
- Wendy K Silverman
- Child Study Center, Yale University School of Medicine, 230 South Frontage Road, New Haven, CT 06510, USA
- Samuel D Spencer
- Department of Psychology, University of North Texas, 1155 Union Circle #311280, Denton, TX 76203, USA
3
Massad R, Hertz-Palmor N, Blay Y, Gur S, Schneier FR, Lazarov A. In the eye of the beholder - validating the visual social anxiety scale (VSAS) in social anxiety disorder. Cogn Behav Ther 2025:1-29. [PMID: 40202294] [DOI: 10.1080/16506073.2025.2487777]
Abstract
The Visual Social Anxiety Scale (VSAS) is a novel picture-based self-report measure of social anxiety that has shown promising psychometric properties among non-selected participants. The present study aimed to validate the VSAS among individuals with clinically diagnosed social anxiety disorder (SAD) and to establish clinical cutoff scores. One hundred and three adults with SAD completed the VSAS along with a battery of additional self-report measures of social anxiety, depression, and general anxiety. Internal consistency and convergent and discriminant validity were assessed. Clinical cutoff scores were established via a Receiver Operating Characteristic (ROC) analysis using a control group of individuals without any past or present psychopathology (n = 34). An Exploratory Factor Analysis (EFA) was performed to explore underlying thematic factors. The VSAS exhibited high internal consistency and adequate convergent and discriminant validity. The ROC analysis showed the area under the curve to be 0.95 and yielded an optimal cutoff score of 23.40, providing high accuracy (0.90), sensitivity (0.89), and specificity (0.91) for distinguishing SAD from non-SAD individuals. The EFA revealed a 3-factor structure representing the following themes: social interpersonal situations, formal interpersonal situations, and being the center of attention. The psychometric properties of the VSAS support its utility in assessing and identifying individuals with clinical SAD.
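The cutoff derivation described above can be illustrated with a short, hedged sketch (Python). The simulated scores, group means, and SDs below are invented stand-ins, not the study data, and Youden's J is one common cutoff criterion; the study's exact criterion may differ.

import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(0)
sad_scores = rng.normal(35, 8, 103)     # hypothetical VSAS totals, SAD group (n = 103)
control_scores = rng.normal(15, 8, 34)  # hypothetical totals, non-clinical controls (n = 34)

y_true = np.concatenate([np.ones(103), np.zeros(34)])
scores = np.concatenate([sad_scores, control_scores])

# ROC analysis; the optimal cutoff here is chosen by Youden's J
# (sensitivity + specificity - 1).
fpr, tpr, thresholds = roc_curve(y_true, scores)
best = np.argmax(tpr - fpr)
print(f"AUC = {roc_auc_score(y_true, scores):.2f}")
print(f"cutoff = {thresholds[best]:.2f}, sensitivity = {tpr[best]:.2f}, specificity = {1 - fpr[best]:.2f}")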
Affiliation(s)
- Raz Massad
- School of Psychological Sciences, Tel Aviv University, Tel Aviv, Israel
- Nimrod Hertz-Palmor
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
- Yoav Blay
- Geha Mental Health Center, Petach-Tikva, Israel
- Shay Gur
- Geha Mental Health Center, Petach-Tikva, Israel
- Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Franklin R Schneier
- New York State Psychiatric Institute and Department of Psychiatry, Columbia University Irving Medical Center, New York, NY, USA
- Amit Lazarov
- School of Psychological Sciences, Tel Aviv University, Tel Aviv, Israel
4
Sadowsky SJ. Can ChatGPT be trusted as a resource for a scholarly article on treatment planning implant-supported prostheses? J Prosthet Dent 2025:S0022-3913(25)00258-6. [PMID: 40210509] [DOI: 10.1016/j.prosdent.2025.03.025]
Abstract
STATEMENT OF PROBLEM Access to artificial intelligence is ubiquitous, but its limitations in the preparation of scholarly articles on implant restorative treatment planning have not been established. PURPOSE The purpose of this study was to determine whether ChatGPT can be a reliable resource for synthesizing the best available literature on treatment planning questions for implant-supported prostheses. MATERIAL AND METHODS Six questions were posed to ChatGPT on treatment planning implant-supported prostheses for partially edentulous and completely edentulous scenarios. Question 1: Are crown-to-implant (C/I) ratios greater than 1:1 linked to increased marginal bone loss? Question 2: Do 2-unit posterior cantilevers lead to more bone loss than 2 adjacent implants? Question 3: Should implants be splinted in the posterior maxilla in patients who require no grafting and are not bruxers? Question 4: Do patients prefer a maxillary implant overdenture to a well-made complete denture? Question 5: Do resilient and rigid anchorage systems have the same maintenance when comparing implant overdentures? Question 6: Do denture patients prefer fixed implant prostheses to removable implant prostheses? Follow-up questions were intended to clarify the source and content of the supporting evidence for ChatGPT's responses. Additional higher-quality and timely studies indexed on PubMed were identified for ChatGPT to consider in a revision of its original implant treatment planning answer. A quantitative rating was assessed based on 4 indices: accurate/retrievable source, representative literature, accurate interpretation of evidence, and an original conclusion reflecting the best evidence. RESULTS ChatGPT's responses were as follows. Question 1: "Higher C/I can be associated with an increased risk of marginal bone loss." Revision: "While many clinicians believe that higher C/I ratios lead to bone loss, recent evidence suggests that this concern is less relevant for modern implants." Question 2: "The presence of cantilever extensions with short implants tend to fail at earlier time points and has been associated with a higher incidence of technical complications." Revision: "The use of implant-supported single-unit crowns with cantilever extensions in posterior regions is a viable long-term treatment option with minimal complications." Question 3: "Splinted restorations were associated with a higher implant survival rate, particularly in the posterior region." Revision: "There is no compelling evidence to suggest that splinting all implants in the posterior maxilla is necessary." Question 4: "Patients report higher satisfaction with maxillary implant-supported overdentures compared to conventional complete dentures." Revision: "For patients with adequate maxillary bone support, a conventional denture may be just as satisfactory as an implant overdenture." Question 5: "While resilient attachments may require more frequent replacement of components, rigid attachments might necessitate monitoring for implant-related complications due to increased stress." Revision: "Research indicates that rigid attachment systems, such as bar and telescopic attachments, do not necessarily lead to increased complications due to stress in implant overdentures." Question 6: "Yes, in general, denture patients tend to prefer fixed implant prostheses over removable implant prostheses due to several key advantages. However, preferences can vary based on individual needs, costs, and clinical factors." Revision: "There is no universal patient preference for fixed or removable implant prostheses. Satisfaction is generally high with both options, and preference depends on individual patient factors, including comfort, hygiene, cost, and anatomical considerations." CONCLUSIONS ChatGPT has not demonstrated the ability to accurately cull the literature, stratify the rigor of the evidence, and extract accurate implications from the studies selected to deliver the best evidence-based answers to questions on treatment planning implant-supported prostheses.
Affiliation(s)
- Steven J Sadowsky
- Professor Emeritus, Preventive and Restorative Department, University of the Pacific Arthur A. Dugoni School of Dentistry, San Francisco, Calif.
5
Djulbegovic B, Hozo I, Koletsi D, Price A, Nunan D, Hemkens LG. What is the probability that higher versus lower quality of evidence represents true effects estimates? J Eval Clin Pract 2025;31:e14160. [PMID: 39373266] [PMCID: PMC12020367] [DOI: 10.1111/jep.14160]
Abstract
RATIONALE, AIMS, AND OBJECTIVES Previous studies demonstrated that the Grading of Recommendations Assessment, Development and Evaluation (GRADE) system, a leading method for evaluating the certainty (quality) of scientific evidence (CoE), cannot reliably differentiate between various levels of CoE when the objective is to accurately assess the magnitude of the treatment effect. An estimated effect size is a function of multiple factors, including the true underlying treatment effect, biases, and other nonlinear factors that affect the estimate in different directions. We postulate that non-weighted, simple linear tallying can provide more accurate estimates of the probability of a true estimate of treatment effects as a function of CoE. METHODS We reasoned that stable treatment effect estimates over time indicate truthfulness. We compared odds ratios (ORs) from meta-analyses (MAs) before and after updates, hypothesising that a ratio of odds ratios (ROR) equal to 1 would be more commonly observed in higher versus lower CoE. We used a subset of a previously analysed data set consisting of 82 Cochrane pairs of MAs in which CoE had not changed with the updated MA. If the linear model is valid, we would expect a decrease in the number of ROR = 1 cases as we move from high to moderate, low, and very low CoE. RESULTS We found a linear relationship between the probability of a potentially 'true' estimate of treatment effects and CoE (assuming a 10% ROR error margin) (R² = 1; p = 0.001). The probability of potentially 'true' estimates decreases by 21% (95% CI: 18%-24%) for each drop in the rating of CoE. A linear relationship with a 5% ROR error margin was less clear, likely due to a smaller sample size. Still, higher CoE showed a significantly greater probability of 'true' effects (53%) compared to non-high (i.e., moderate, low, or very low) CoE (25%); p = 0.032. CONCLUSION This study confirmed a linear relationship between CoE and the probability of potentially 'true' estimates. We found that the probability of potentially 'true' estimates decreases by about 20% for each drop in CoE (from about 80% for high, to 55% for moderate, to 35% for low, and 15% for very low CoE).
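The stability criterion described above admits a minimal sketch (Python). The reading of the 10% margin as a symmetric band around ROR = 1 is our assumption, and the example odds ratios are invented.

# Ratio of odds ratios (ROR) between an updated and an original meta-analysis;
# an estimate is treated as stable ('true') if ROR falls within the error margin around 1.
def is_stable(or_original: float, or_updated: float, margin: float = 0.10) -> bool:
    ror = or_updated / or_original
    return 1 - margin <= ror <= 1 + margin

# Example: OR 0.80 originally, 0.84 after the update -> ROR = 1.05, within the 10% margin.
print(is_stable(0.80, 0.84))  # True
print(is_stable(0.80, 1.10))  # False (ROR = 1.375)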
Affiliation(s)
- Benjamin Djulbegovic
- Division of Hematology/Oncology, Department of Medicine, Medical University of South Carolina, Charleston, South Carolina, USA
- Iztok Hozo
- Department of Mathematics, Indiana University Northwest, Gary, Indiana, USA
- Despina Koletsi
- Clinic of Orthodontics and Pediatric Dentistry, Center of Dental Medicine, University of Zurich, Zurich, Switzerland
- Meta-Research Innovation Center at Stanford (METRICS), Stanford University, Stanford, California, USA
- Amy Price
- Nuffield Department of Primary Care Health Sciences, Centre for Evidence-Based Medicine, Oxford, UK
- Dartmouth Institute for Health Policy and Clinical Practice (TDI), Geisel School of Medicine, Dartmouth College, Hanover, New Hampshire, USA
- David Nunan
- Pragmatic Evidence Lab, Research Center for Clinical Neuroimmunology and Neuroscience Basel (RC2NB), University Hospital Basel, University of Basel, Basel, Switzerland
- Lars G. Hemkens
- Meta-Research Innovation Center at Stanford (METRICS), Stanford University, Stanford, California, USA
- Pragmatic Evidence Lab, Research Center for Clinical Neuroimmunology and Neuroscience Basel (RC2NB), University Hospital Basel, University of Basel, Basel, Switzerland
6
Kornowicz J, Thommes K. Algorithm, expert, or both? Evaluating the role of feature selection methods on user preferences and reliance. PLoS One 2025;20:e0318874. [PMID: 40053559] [PMCID: PMC11888136] [DOI: 10.1371/journal.pone.0318874]
Abstract
The integration of users and experts in machine learning is a widely studied topic in the artificial intelligence literature. Similarly, human-computer interaction research extensively explores the factors that influence the acceptance of AI as a decision support system. In this experimental study, we investigate users' preferences regarding the integration of experts in the development of such systems and how this affects their reliance on these systems. Specifically, we focus on the process of feature selection, an element that is gaining importance due to the growing demand for transparency in machine learning models. We differentiate between three feature selection methods: algorithm-based, expert-based, and a combined approach. In the first treatment, we analyze users' preferences for these methods. In the second treatment, we randomly assign users to one of the three methods and analyze whether the method affects advice reliance. Users preferred the combined method, followed by the expert-based and algorithm-based methods. However, users in the second treatment relied equally on all methods. Thus, we find a remarkable difference between stated preferences and actual usage, revealing a significant attitude-behavior gap. Moreover, allowing users to choose their preferred method had no effect, and both the preferences and the extent of reliance were domain-specific. The findings underscore the importance of understanding cognitive processes in AI-supported decisions and the need for behavioral experiments in human-AI interactions.
Affiliation(s)
- Jaroslaw Kornowicz
- Faculty of Business Administration and Economics, Paderborn University, Paderborn, Germany
- Kirsten Thommes
- Faculty of Business Administration and Economics, Paderborn University, Paderborn, Germany
7
Oddleifson C, Kilgus S, Klingbeil DA, Latham AD, Kim JS, Vengurlekar IN. Using a naive Bayesian approach to identify academic risk based on multiple sources: A conceptual replication. J Sch Psychol 2025;108:101397. [PMID: 39710436] [DOI: 10.1016/j.jsp.2024.101397]
Abstract
The purpose of this study was to conduct a conceptual replication of Pendergast et al.'s (2018) study that examined the diagnostic accuracy of a nomogram procedure, also known as a naive Bayesian approach. The specific naive Bayesian approach combined academic and social-emotional and behavioral (SEB) screening data to predict student performance on a state end-of-year achievement test. Study data were collected in a large suburban school district in the Midwest across 2 school years and 19 elementary schools. Participants included 5753 students in Grades 3-5. Academic screening data included aimswebPlus reading and math composite scores. SEB screening data included Academic Behavior subscale scores from the Social, Academic, and Emotional Behavior Risk Screener. Criterion scores were derived from the Missouri Assessment Program (MAP) tests of English Language Arts and Mathematics. The performance of each individual screener was compared to the naive Bayesian approach that integrated pre-test probability information (i.e., district-wide base rates of risk derived from prior year MAP test scores), academic screening scores, and SEB screening scores. Post-test probability scores were then evaluated using a threshold model (VanDerHeyden, 2013) to determine the percentage of students within the sample that could be differentiated in terms of ruling in or ruling out intervention versus those who remained undifferentiated (as indicated by the need for additional assessment to determine risk status). Results indicated that the naive Bayesian approach tended to perform similarly to individual aimswebPlus measures, with all approaches yielding a large percentage (65%-87%) of undifferentiated students when predicting proficient performance. Overall, the results indicated that we likely failed to replicate the findings of the original study. Limitations and future directions for research are discussed.
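The nomogram (naive Bayes) combination at the core of this approach can be sketched as follows (Python). The base rate and likelihood ratios below are illustrative assumptions, not values from the study, and the screeners are assumed conditionally independent given true risk status.

def post_test_probability(pre_test_p: float, *likelihood_ratios: float) -> float:
    # Convert the pre-test probability to odds, multiply by each screener's
    # likelihood ratio, then convert back to a probability.
    odds = pre_test_p / (1 - pre_test_p)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1 + odds)

pre_test = 0.30    # e.g., a district-wide base rate of risk from prior-year test scores
lr_academic = 4.0  # hypothetical positive LR for an at-risk academic screen
lr_seb = 2.5       # hypothetical positive LR for an at-risk SEB screen
print(round(post_test_probability(pre_test, lr_academic, lr_seb), 2))  # 0.81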
Affiliation(s)
- Carly Oddleifson
- Department of Educational Psychology, University of Wisconsin-Madison, United States
- Stephen Kilgus
- Department of Educational Psychology, University of Wisconsin-Madison, United States
- David A Klingbeil
- Department of Educational Psychology, University of Wisconsin-Madison, United States
- Alexander D Latham
- Department of Educational Psychology, University of Wisconsin-Madison, United States
- Jessica S Kim
- Department of Educational Psychology, University of Wisconsin-Madison, United States
- Ishan N Vengurlekar
- Department of Educational Psychology, University of Wisconsin-Madison, United States
8
Espinosa O, Drummond M, Russo E, Williams D, Wix D. How can actuarial science contribute to the field of health technology assessment? An interdisciplinary perspective. Int J Technol Assess Health Care 2025;41:e3. [PMID: 39757736] [DOI: 10.1017/s0266462324004781]
Abstract
A reflective analysis is presented on the potential added value that actuarial science can contribute to the field of health technology assessment. This topic is discussed based on the experience of several experts in health actuarial science and health economics. Different points are addressed, such as the role of actuarial science in health, actuarial judgment, data inputs and their quality, modeling methodologies and the use of decision-analytic models in the age of artificial intelligence, and the development of innovative pricing and payment models.
Affiliation(s)
- Oscar Espinosa
- Economic Models and Quantitative Methods Research Group (IMEMC), Centro de Investigaciones para el Desarrollo, Universidad Nacional de Colombia, Bogotá, D.C., Colombia
9
Viljoen JL, Goossens I, Monjazeb S, Cochrane DM, Vargen LM, Jonnson MR, Blanchard AJE, Li SMY, Jackson JR. Are risk assessment tools more accurate than unstructured judgments in predicting violent, any, and sexual offending? A meta-analysis of direct comparison studies. Behav Sci Law 2025;43:75-113. [PMID: 39363308] [PMCID: PMC11771637] [DOI: 10.1002/bsl.2698]
Abstract
We conducted a pre-registered meta-analysis of studies that directly compared the predictive validity of risk assessment tools to unstructured judgments of risk for violent, any, or sexual offending. A total of 31 studies, containing 169 effect sizes from 45,673 risk judgments, met inclusion criteria. Based on the results of three-level mixed-effects meta-regression models, the predictive validity of total scores on risk assessment tools was significantly higher than that of unstructured judgments for predictions of violent, any, and sexual offending. Tools continued to outperform unstructured judgments after accounting for risk of bias. This finding was also robust to variations in population, assessment context, and outcome measurement. Although this meta-analysis provides support for the use of risk assessment tools, it also highlights limitations and gaps that future research should address.
Affiliation(s)
- Jodi L. Viljoen
- Department of Psychology, Simon Fraser University, Burnaby, British Columbia, Canada
- Ilvy Goossens
- Department of Psychology, Simon Fraser University, Burnaby, British Columbia, Canada
- Sanam Monjazeb
- Department of Psychology, Simon Fraser University, Burnaby, British Columbia, Canada
- Dana M. Cochrane
- Department of Psychology, Simon Fraser University, Burnaby, British Columbia, Canada
- Lee M. Vargen
- Department of Psychology, Simon Fraser University, Burnaby, British Columbia, Canada
- Melissa R. Jonnson
- Department of Psychology, Simon Fraser University, Burnaby, British Columbia, Canada
- Shanna M. Y. Li
- Department of Psychology, Simon Fraser University, Burnaby, British Columbia, Canada
- Jourdan R. Jackson
- Department of Psychology, Simon Fraser University, Burnaby, British Columbia, Canada
10
Beauchaine TP. Developmental psychopathology as a meta-paradigm: From zero-sum science to epistemological pluralism in theory and research. Dev Psychopathol 2024;36:2114-2126. [PMID: 38389490] [DOI: 10.1017/s0954579424000208]
Abstract
In a thoughtful commentary in this journal a decade ago, Michael Rutter reviewed 25 years of progress in the field before concluding that developmental psychopathology (DP) initiated a paradigm shift in clinical science. This deduction requires that DP itself be a paradigm. According to Thomas Kuhn, canonical paradigms in the physical sciences serve unifying functions by consolidating scientists' thinking and scholarship around single, closed sets of discipline-defining epistemological assumptions and methods. Paradigm shifts replace these assumptions and methods with a new field-defining framework. In contrast, the social sciences are multiparadigmatic, with thinking and scholarship unified locally around open sets of epistemological assumptions and methods with varying degrees of inter-, intra-, and subdisciplinary reach. DP challenges few if any of these local paradigms. Instead, DP serves an essential pluralizing function, and is therefore better construed as a metaparadigm. Seen in this way, DP holds tremendous untapped potential to move the field from zero-sum thinking and scholarship to positive-sum science and epistemological pluralism. This integrative vision, which furthers Dante Cicchetti's legacy of interdisciplinarity, requires broad commitment among scientists to reject zero-sum scholarship in which promising theories, useful principles, and effective interventions are jettisoned based on confirmation bias, errors in logic, and ideology.
11
Worden B. A Call for More Careful Use of the Clinical Global Impression (CGI) Rating as a Measure of Psychopathology and Outcome. Psychol Rep 2024:332941241301344. [PMID: 39601331] [DOI: 10.1177/00332941241301344]
Affiliation(s)
- Blaise Worden
- Anxiety Disorders Center, Hartford Hospital/Institute of Living, Hartford, CT, USA
12
Xu M, Wang Y. Explainability increases trust resilience in intelligent agents. Br J Psychol 2024. [PMID: 39431949] [DOI: 10.1111/bjop.12740]
Abstract
Even though artificial intelligence (AI)-based systems typically outperform human decision-makers, they are not immune to errors, leading users to lose trust in them and become less likely to use them again, a phenomenon known as algorithm aversion. The purpose of the present research was to investigate whether explainable AI (XAI) could function as a viable strategy to counter algorithm aversion. We conducted two experiments to examine how XAI influences users' willingness to continue using AI-based systems when these systems exhibit errors. The results showed that, after observing the algorithms err, users' inclination to delegate decisions to or follow advice from intelligent agents significantly decreased compared to the period before the errors were revealed. However, explainability effectively mitigated this decline, with users in the XAI condition being more likely than those in the non-XAI condition to continue utilizing intelligent agents for subsequent tasks after seeing algorithms err. We further found that explainability could reduce users' decision regret, and that the decrease in decision regret mediated the relationship between explainability and re-use behaviour. These findings underscore the adaptive function of XAI in alleviating negative user experiences and maintaining user trust in the context of imperfect AI.
Affiliation(s)
- Min Xu
- School of Economics and Management, Fuzhou University, Fuzhou, China
- Yiwen Wang
- School of Business Administration, Zhejiang Gongshang University, Hangzhou, China
13
Peringa IP, Niessen ASM, Meijer RR, den Hartigh RJR. A uniform approach for advancing athlete assessment: A tutorial on the Lens Model. Psychol Sport Exerc 2024;76:102732. [PMID: 39278579] [DOI: 10.1016/j.psychsport.2024.102732]
Abstract
In athlete assessment, coaches or scouts typically judge athletes by observing and combining information about their attributes. However, how accurate is the expert's eye in combining this information, and can its accuracy be improved? To address these questions, this paper introduces the Lens Model, a framework for studying human judgment that has been widely successful in other performance domains. Since the framework offers both theoretical and practical benefits and is new to sports scientists and practitioners, our paper is presented in the form of a tutorial. First, we discuss the need for the Lens Model in sports; second, we demonstrate its proven value outside of sports. Third, we provide a conceptual explanation of the Lens Model, detailing, among other aspects, how experts' judgmental policies can be modeled and how judgmental accuracy can be determined and evaluated. This is followed by an empirical example: a study on the judgments of soccer scouts, along with suggestions to improve their accuracy. To inspire further Lens Model research across sports, we conclude with prospective research directions.
Affiliation(s)
- Ilse P Peringa
- Department of Psychology, University of Groningen, Grote Kruisstraat 2/1, 9712 TS Groningen, the Netherlands
- A Susan M Niessen
- Department of Psychology, University of Groningen, Grote Kruisstraat 2/1, 9712 TS Groningen, the Netherlands
- Rob R Meijer
- Department of Psychology, University of Groningen, Grote Kruisstraat 2/1, 9712 TS Groningen, the Netherlands
- Ruud J R den Hartigh
- Department of Psychology, University of Groningen, Grote Kruisstraat 2/1, 9712 TS Groningen, the Netherlands
14
Chai KEK, Graham-Schmidt K, Lee CMY, Rock D, Coleman M, Betts KS, Robinson S, McEvoy PM. Predicting anxiety treatment outcome in community mental health services using linked health administrative data. Sci Rep 2024;14:20559. [PMID: 39232215] [PMCID: PMC11375212] [DOI: 10.1038/s41598-024-71557-2]
Abstract
Anxiety disorders are the most common class of mental disorders globally, affecting hundreds of millions of people and significantly impacting daily life. Developing reliable predictive models for anxiety treatment outcomes holds immense potential to help guide the development of personalised care, optimise resource allocation, and improve patient outcomes. This research investigates whether community mental health treatment for anxiety disorder is associated with reliable changes in Kessler psychological distress scale (K10) scores and whether pre-treatment K10 scores and past health service interactions can accurately predict reliable change (improvement). The K10 assessment was administered to 46,938 public patients in a community setting within the Western Australian dataset in 2005-2022; 3794 of these patients, across 4067 episodes of care, were reassessed at least twice for anxiety disorders, obsessive-compulsive disorder, or reaction to severe stress and adjustment disorders (ICD-10 codes F40-F43). Reliable change on the K10 was calculated and used, along with the post-treatment score, as the outcome variables. Machine learning models were developed for prediction using features from a large health service administrative linked dataset that includes the pre-treatment K10 assessment as well as community mental health episodes of care, emergency department presentations, and inpatient admissions. The classification model achieved an area under the receiver operating characteristic curve of 0.76, with an F1 score, precision, and recall of 0.69, and the regression model achieved an R² of 0.37 with a mean absolute error of 5.58 on the test dataset. While the prediction models achieved moderate performance, the results also underscore the necessity for regular patient monitoring and the collection of more clinically relevant and contextual patient data to further improve prediction of treatment outcomes.
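One common way to operationalize 'reliable change' on a scale such as the K10 is the Jacobson-Truax reliable change index; the sketch below (Python) shows that computation. The SD and reliability values are assumptions for illustration, not the study's parameters, and the study's exact criterion may differ.

import math

def reliable_change_index(pre: float, post: float, sd_pre: float, reliability: float) -> float:
    # Standard error of measurement, then of the difference score.
    se_measurement = sd_pre * math.sqrt(1 - reliability)
    se_difference = math.sqrt(2) * se_measurement
    return (post - pre) / se_difference

rci = reliable_change_index(pre=32, post=22, sd_pre=8.0, reliability=0.90)
print(f"RCI = {rci:.2f}; reliable improvement: {rci <= -1.96}")  # RCI = -2.80; True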
Affiliation(s)
- Kevin E K Chai
- School of Population Health, Curtin University, Perth, WA, Australia
- Crystal M Y Lee
- School of Population Health, Curtin University, Perth, WA, Australia
- Daniel Rock
- Western Australia Primary Health Alliance, Perth, WA, Australia
- Discipline of Psychiatry, Medical School, University of Western Australia, Perth, WA, Australia
- Faculty of Health, Health Research Institute, University of Canberra, Canberra, ACT, Australia
- Mathew Coleman
- Western Australia Country Health Service, Albany, WA, Australia
- Kim S Betts
- School of Population Health, Curtin University, Perth, WA, Australia
- Suzanne Robinson
- School of Population Health, Curtin University, Perth, WA, Australia
- Deakin Health Economics, Deakin University, Melbourne, VIC, Australia
- Peter M McEvoy
- School of Population Health, Curtin University, Perth, WA, Australia
- Centre for Clinical Interventions, North Metropolitan Health Service, Perth, WA, Australia
15
Kleinberg J, Ludwig J, Mullainathan S, Raghavan M. The Inversion Problem: Why Algorithms Should Infer Mental State and Not Just Predict Behavior. Perspect Psychol Sci 2024;19:827-838. [PMID: 38085919] [DOI: 10.1177/17456916231212138]
Abstract
More and more machine learning is applied to human behavior. Increasingly, these algorithms suffer from a hidden but serious problem. It arises because they often predict one thing while hoping for another. Take a recommender system: It predicts clicks but hopes to identify preferences. Or take an algorithm that automates a radiologist: It predicts in-the-moment diagnoses while hoping to identify their reflective judgments. Psychology shows us the gaps between the objectives of such prediction tasks and the goals we hope to achieve: People can click mindlessly; experts can get tired and make systematic errors. We argue such situations are ubiquitous and call them "inversion problems": The real goal requires understanding a mental state that is not directly measured in behavioral data but must instead be inverted from the behavior. Identifying and solving these problems require new tools that draw on both behavioral and computational science.
Affiliation(s)
- Jens Ludwig
- Harris School of Public Policy, University of Chicago
- Manish Raghavan
- Sloan School of Management, Massachusetts Institute of Technology
16
Holm S. Ethical trade-offs in AI for mental health. Front Psychiatry 2024;15:1407562. [PMID: 39267699] [PMCID: PMC11390554] [DOI: 10.3389/fpsyt.2024.1407562]
Abstract
It is expected that machine learning algorithms will enable better diagnosis, prognosis, and treatment in psychiatry. A central argument for deploying algorithmic methods in clinical decision-making in psychiatry is that they may enable not only faster and more accurate clinical judgments but also a more objective foundation for clinical decisions. This article argues that the outputs of algorithms are never objective in the sense of being unaffected by human values and possibly biased choices. It suggests that the best way to approach this is to ensure awareness of, and transparency about, the ethical trade-offs that must be made when developing an algorithm for mental health.
Affiliation(s)
- Sune Holm
- Department of Food and Resource Economics, University of Copenhagen, Frederiksberg, Denmark
17
Wade TD, Shafran R, Cooper Z. Developing a protocol to address co-occurring mental health conditions in the treatment of eating disorders. Int J Eat Disord 2024;57:1291-1299. [PMID: 37278186] [DOI: 10.1002/eat.24008]
Abstract
OBJECTIVE While co-occurring mental health conditions are the norm in eating disorders, no testable protocol addresses their management in psychotherapy. METHOD The literature on managing mental health conditions that co-occur with eating disorders is outlined and reviewed. RESULTS In the absence of clear evidence to inform managing co-occurring mental health conditions, we advocate for use of iterative, session-by-session measurement to guide practice and research. We identify three data-driven treatment approaches (focus solely on the eating disorder; multiple sequential interventions either before or after the eating disorder is addressed; integrated interventions) and the indications for their use. Where co-occurring mental health conditions impede effective treatment of the eating disorder and an integrated intervention is required, we outline a four-step protocol for three broad intervention approaches (alternate, modular, transdiagnostic). A research program is suggested to test the usefulness of the protocol. DISCUSSION The current paper offers guidelines that provide a starting point for improving outcomes for people with eating disorders and that can be evaluated and researched. These guidelines require further elaboration with reference to: (1) whether any difference in approach is required where the co-occurring mental health condition is a comorbid symptom or condition; (2) the place of biological treatments within these guidelines; (3) precise guidelines for selecting among the three broad intervention approaches when adapting care for co-occurring conditions; (4) optimal approaches to involving consumer input in identifying the most relevant co-occurring conditions; (5) detailed specification of how to determine which adjuncts to add. PUBLIC SIGNIFICANCE Most people with an eating disorder also have another diagnosis or an underlying trait (e.g., perfectionism). Currently, no clear guidelines exist to guide treatment in this situation, which often results in a drift away from evidence-based techniques. This paper outlines data-driven strategies for treating eating disorders and the accompanying comorbid conditions, along with a research program that can test the usefulness of the different approaches suggested.
Affiliation(s)
- Tracey D Wade
- Blackbird Initiative, Flinders Institute for Mental Health and Wellbeing, Flinders University, Bedford Park, South Australia, Australia
- Roz Shafran
- UCL Great Ormond Street Institute of Child Health, University College London, London, UK
- Zafra Cooper
- Yale School of Medicine, New Haven, Connecticut, USA
- Department of Psychiatry, Oxford University, Oxford, UK
18
Gamoran A, Lieberman L, Gilead M, Dobbins IG, Sadeh T. Detecting recollection: Human evaluators can successfully assess the veracity of others' memories. Proc Natl Acad Sci U S A 2024;121:e2310979121. [PMID: 38781212] [PMCID: PMC11145205] [DOI: 10.1073/pnas.2310979121]
Abstract
Humans have the highly adaptive ability to learn from others' memories. However, because memories are prone to errors, for others' memories to be a valuable source of information, we need to assess their veracity. Previous studies have shown that linguistic information conveyed in self-reported justifications can be used to train a machine learner to distinguish true from false memories. But can humans also perform this task, and if so, do they do so in the same way the machine learner does? Participants were presented with justifications corresponding to Hits and False Alarms and were asked to directly assess whether the witness's recognition was correct or incorrect. In addition, participants assessed the justifications' recollective qualities: their vividness, specificity, and the degree of confidence they conveyed. Results show that human evaluators can discriminate Hits from False Alarms above chance levels based on the justifications provided per item. Their performance was on par with the machine learner. Furthermore, through assessment of the perceived recollective qualities of justifications, participants were able to glean more information from the justifications than they used in their own direct decisions, and more than the machine learner did.
Affiliation(s)
- Avi Gamoran
- Department of Psychology, Ben-Gurion University of the Negev, Beer Sheva 8410501, Israel
- Lilach Lieberman
- Department of Cognitive and Brain Sciences, Ben-Gurion University of the Negev, Beer Sheva 8410501, Israel
- Zelman School of Brain Sciences and Cognition, Ben-Gurion University of the Negev, Beer Sheva 8410501, Israel
- Michael Gilead
- School of Psychological Sciences, Tel Aviv University, Tel Aviv 6997801, Israel
- Sagol School of Neuroscience, Tel Aviv University, Tel Aviv 6997801, Israel
- Ian G. Dobbins
- Department of Psychological and Brain Sciences, Washington University in Saint Louis, St. Louis, MO 63130
- Talya Sadeh
- Department of Psychology, Ben-Gurion University of the Negev, Beer Sheva 8410501, Israel
- Department of Cognitive and Brain Sciences, Ben-Gurion University of the Negev, Beer Sheva 8410501, Israel
- Zelman School of Brain Sciences and Cognition, Ben-Gurion University of the Negev, Beer Sheva 8410501, Israel
19
Doussau A, Kane P, Peppercorn J, Feustel AC, Ganeshamoorthy S, Kekre N, Benjamin DM, Kimmelman J. The impact of feedback training on prediction of cancer clinical trial results. Clin Trials 2024;21:143-151. [PMID: 37873661] [PMCID: PMC11005298] [DOI: 10.1177/17407745231203375]
Abstract
INTRODUCTION Funders must make difficult decisions about which treatments to prioritize for randomized trials. Earlier research suggests that experts have no ability to predict which treatments will vindicate their promise. We tested whether a brief training module could improve experts' trial predictions. METHODS We randomized a sample of breast cancer and hematology-oncology experts to the presence or absence of a feedback training module in which experts predicted outcomes for five recently completed randomized controlled trials and received feedback on their accuracy. Experts then predicted primary outcome attainment for a sample of ongoing randomized controlled trials. Prediction skill was assessed by Brier scores, which measure the average deviation between predictions and actual outcomes. Secondary outcomes were discrimination (ability to distinguish between positive and non-positive trials) and calibration (higher predictions reflecting a higher probability of trials being positive). RESULTS A total of 148 experts (46 for breast cancer, 54 for leukemia, and 48 for lymphoma) were randomized between May and December 2017 and included in the analysis (1217 forecasts for 25 trials). Feedback did not improve prediction skill (mean Brier score for control: 0.22, 95% confidence interval = 0.20-0.24 vs feedback arm: 0.21, 95% confidence interval = 0.20-0.23; p = 0.51). Control and feedback arms showed similar discrimination (area under the curve = 0.70 vs 0.73, p = 0.24) and calibration (calibration index = 0.01 vs 0.01, p = 0.81). However, experts in both arms offered predictions that were significantly more accurate than uninformative forecasts of 50% (Brier score = 0.25). DISCUSSION A short training module did not improve predictions of cancer trial results. However, expert communities showed unexpected ability to anticipate positive trials. Pre-registration record: https://aspredicted.org/4ka6r.pdf
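The Brier score used here as the primary accuracy measure is simply the mean squared deviation between probabilistic forecasts and binary outcomes; a minimal sketch follows (Python, with invented forecasts and outcomes rather than study data).

import numpy as np

forecasts = np.array([0.70, 0.40, 0.90, 0.50, 0.20])  # predicted probability each trial is positive
outcomes = np.array([1, 0, 1, 0, 1])                  # 1 = primary outcome attained

brier = np.mean((forecasts - outcomes) ** 2)
print(f"Brier score = {brier:.3f}")  # 0.230; 0 is perfect, and a flat 50% forecast scores 0.25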
Affiliation(s)
- Adélaïde Doussau
- Studies in Translation, Ethics and Medicine, Department of Equity, Ethics and Policy, McGill University, Montreal, QC, Canada
- Patrick Kane
- Studies in Translation, Ethics and Medicine, Department of Equity, Ethics and Policy, McGill University, Montreal, QC, Canada
- Aden C Feustel
- Studies in Translation, Ethics and Medicine, Department of Equity, Ethics and Policy, McGill University, Montreal, QC, Canada
- Sylviya Ganeshamoorthy
- Studies in Translation, Ethics and Medicine, Department of Equity, Ethics and Policy, McGill University, Montreal, QC, Canada
- Natasha Kekre
- Department of Medicine, The Ottawa Hospital, Ottawa Hospital Research Institute, Ottawa, ON, Canada
- Daniel M Benjamin
- Huizenga College of Business and Entrepreneurship, Nova Southeastern University, Fort Lauderdale, FL, USA
- Jonathan Kimmelman
- Studies in Translation, Ethics and Medicine, Department of Equity, Ethics and Policy, McGill University, Montreal, QC, Canada
20
Etzler S, Schönbrodt FD, Pargent F, Eher R, Rettenberger M. Machine Learning and Risk Assessment: Random Forest Does Not Outperform Logistic Regression in the Prediction of Sexual Recidivism. Assessment 2024;31:460-481. [PMID: 37039529] [DOI: 10.1177/10731911231164624]
Abstract
Although many studies have supported the use of actuarial risk assessment instruments (ARAIs) because they outperform unstructured judgments, improving their predictive performance remains an ongoing challenge. Machine learning (ML) algorithms, like random forests, are able to detect patterns in data that are useful for prediction without being explicitly programmed (e.g., by considering nonlinear effects between risk factors and the criterion). Therefore, the current study aims to compare conventional logistic regression analyses with the random forest algorithm on a sample of N = 511 adult male individuals convicted of sexual offenses. Data were collected at the Federal Evaluation Center for Violent and Sexual Offenders in Austria within a prospective-longitudinal research design, and participants were followed up for an average of M = 8.2 years. The Static-99, containing static risk factors, and the Stable-2007, containing stable dynamic risk factors, were included as predictors. The results demonstrated no superior predictive performance of the random forest compared with logistic regression; furthermore, methods of interpretable ML did not point to any robust nonlinear effects. Altogether, the results supported the statistical use of logistic regression for the development and clinical application of ARAIs.
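The head-to-head design can be sketched as a cross-validated AUC comparison (Python/scikit-learn). Synthetic data stand in for the Static-99 and Stable-2007 predictors, and the class imbalance below is an assumption for illustration, not the study's recidivism base rate.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in: 511 cases, 10 risk-factor features, rare positive outcome.
X, y = make_classification(n_samples=511, n_features=10, weights=[0.9, 0.1], random_state=0)

models = [("logistic regression", LogisticRegression(max_iter=1000)),
          ("random forest", RandomForestClassifier(n_estimators=500, random_state=0))]
for name, model in models:
    aucs = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: mean AUC = {aucs.mean():.2f} (SD {aucs.std():.2f})")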
Affiliation(s)
- Sonja Etzler
- Goethe-University Frankfurt am Main, Germany
- Centre for Criminology (Kriminologische Zentralstelle-KrimZ), Wiesbaden, Germany
- Reinhard Eher
- Federal Evaluation Centre for Violent and Sexual Offenders, Austrian Ministry of Justice, Vienna, Austria
- Martin Rettenberger
- Centre for Criminology (Kriminologische Zentralstelle-KrimZ), Wiesbaden, Germany
- Johannes Gutenberg-University Mainz (JGU), Germany
21
Chu JY, Voelkel JG, Stagnaro MN, Kang S, Druckman JN, Rand DG, Willer R. Academics are more specific, and practitioners more sensitive, in forecasting interventions to strengthen democratic attitudes. Proc Natl Acad Sci U S A 2024;121:e2307008121. [PMID: 38215187] [PMCID: PMC10801850] [DOI: 10.1073/pnas.2307008121]
Abstract
Concern over democratic erosion has led to a proliferation of proposed interventions to strengthen democratic attitudes in the United States. Resource constraints, however, prevent implementing all proposed interventions. One approach to identifying promising interventions entails leveraging domain experts, who have knowledge regarding a given field, to forecast the effectiveness of candidate interventions. We recruited experts who develop general knowledge about a social problem (academics), experts who directly intervene on the problem (practitioners), and nonexperts from the public to forecast the effectiveness of interventions to reduce partisan animosity, support for undemocratic practices, and support for partisan violence. Comparing 14,076 forecasts submitted by 1,181 forecasters against the results of a megaexperiment (n = 32,059) that tested 75 hypothesized effects of interventions, we find that both types of experts outperformed members of the public, though the two expert groups were accurate in different ways. While academics' predictions were more specific (i.e., they identified a larger proportion of ineffective interventions and had fewer false-positive forecasts), practitioners' predictions were more sensitive (i.e., they identified a larger proportion of effective interventions and had fewer false-negative forecasts). Consistent with this, practitioners were better at predicting the best-performing interventions, while academics were superior in predicting which interventions performed worst. Our paper highlights the importance of differentiating types of experts and types of accuracy. We conclude by discussing factors that affect whether sensitive or specific forecasters are preferable, such as the relative cost of false positives and negatives and the expected rate of intervention success.
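The sensitivity/specificity framing applied to forecasters can be made concrete with a short sketch (Python; the truth labels and binary calls below are invented): treating each forecast as a binary call that an intervention works, sensitivity is the share of truly effective interventions called effective, and specificity is the share of ineffective interventions called ineffective.

import numpy as np

truth = np.array([1, 1, 1, 0, 0, 0, 0, 0])  # 1 = intervention actually effective
calls = np.array([1, 1, 0, 0, 0, 0, 1, 0])  # a forecaster's binary predictions

sensitivity = np.sum((calls == 1) & (truth == 1)) / np.sum(truth == 1)
specificity = np.sum((calls == 0) & (truth == 0)) / np.sum(truth == 0)
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")  # 0.67, 0.80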
Affiliation(s)
- James Y. Chu
- Department of Sociology, Columbia University, New York, NY 10027
- Jan G. Voelkel
- Department of Sociology, Stanford University, Stanford, CA 94305
- Michael N. Stagnaro
- Sloan School of Management, Massachusetts Institute of Technology, Cambridge, MA 02139
- Suji Kang
- Perry World House, University of Pennsylvania, Philadelphia, PA 19104
- James N. Druckman
- Department of Political Science, University of Rochester, Rochester, NY 14627
- David G. Rand
- Sloan School of Management, Massachusetts Institute of Technology, Cambridge, MA 02139
- Robb Willer
- Department of Sociology, Stanford University, Stanford, CA 94305
22
Singh S, Nurek M, Mason S, Moore LS, Mughal N, Vizcaychipi MP. WHY STOP? A prospective observational vignette-based study to determine the cognitive-behavioural effects of rapid diagnostic PCR-based point-of-care test results on antibiotic cessation in ICU infections. BMJ Open 2023;13:e073577. [PMID: 37989388] [PMCID: PMC10668237] [DOI: 10.1136/bmjopen-2023-073577]
Abstract
OBJECTIVES Point-of-care tests (POCTs) for infection offer accurate rapid diagnostics but do not consistently improve antibiotic stewardship (ASP) of suspected ventilator-associated pneumonia. We aimed to measure the effect of a negative PCR-POCT result on intensive care unit (ICU) clinicians' antibiotic decisions and the additional effects of patient trajectory and cognitive-behavioural factors (clinician intuition, dis/interest in POCT, risk averseness). DESIGN Observational cohort simulation study. SETTING ICU. PARTICIPANTS 70 ICU consultants/trainees working in UK-based teaching hospitals. METHODS Clinicians saw four case vignettes describing patients who had completed a course of antibiotics for respiratory infection. Vignettes comprised clinical and biological data (i.e., white cell count, C reactive protein), varied to create four trajectories: clinico-biological improvement (the 'improvement' case), clinico-biological worsening ('worsening'), clinical improvement/biological worsening ('discordant clin better'), and clinical worsening/biological improvement ('discordant clin worse'). Based on this, clinicians made an initial antibiotic decision (stop/continue) and rated their confidence (6-point Likert scale). A PCR-based POCT was then offered, which clinicians could accept or decline. All clinicians (including those who declined) were shown the result, which was negative. Clinicians then updated their antibiotic decision and confidence. MEASURES Antibiotic decisions and confidence were compared pre-POCT versus post-POCT, per vignette. RESULTS A negative POCT result increased the proportion of stop decisions (54% pre-POCT vs 70% post-POCT, χ²(1)=25.82, p<0.001, w=0.32) in all vignettes except improvement (already high), most notably in discordant clin worse (49% pre-POCT vs 74% post-POCT). In a linear regression, factors that significantly reduced clinicians' inclination to stop antibiotics were a worsening trajectory (b=-0.73 (-1.33, -0.14), p=0.015), initial confidence in continuing (b=0.66 (0.56, 0.76), p<0.001), and involuntary receipt of POCT results (clinicians who accepted the POCT were more inclined to stop than clinicians who declined it, b=1.30 (0.58, 2.02), p<0.001). Clinician risk averseness was not found to influence antibiotic decisions (b=-0.01 (-0.12, 0.10), p=0.872). CONCLUSIONS A negative PCR-POCT result can encourage antibiotic cessation in the ICU, notably in cases of clinical worsening (where the inclination might otherwise be to continue). This effect may be reduced by high clinician confidence to continue and/or disinterest in the POCT, perhaps due to low trust or perceived utility. Such cognitive-behavioural and trajectorial factors warrant greater consideration in future ASP study design.
Collapse
Affiliation(s)
- Suveer Singh
- Faculty of Medicine, Imperial College London, London, UK
- Respiratory and Intensive Care Medicine, Chelsea and Westminster Hospital NHS Foundation Trust, London, UK
| | - Martine Nurek
- Surgery and Cancer, Imperial College London, London, UK
| | - Sonia Mason
- Guy's and St Thomas' Hospitals NHS Trust, London, UK
| | - Luke SP Moore
- Imperial College London, London, UK
- Chelsea and Westminster Hospital NHS Foundation Trust, London, UK
| | - Nabeela Mughal
- Imperial College London, London, UK
- Chelsea and Westminster Hospital NHS Foundation Trust, London, UK
| | - Marcela P Vizcaychipi
- APMIC, Imperial College London, London, UK
- Magill Department of Anaesthesia and Intensive Care Medicine, Chelsea and Westminster Healthcare NHS Trust, London, UK
| |
Collapse
|
23
|
Colunga-Lozano LE, Foroutan F, Rayner D, De Luca C, Hernández-Wolters B, Couban R, Ibrahim Q, Guyatt G. Clinical judgment shows similar and sometimes superior discrimination compared to prognostic clinical prediction models. A systematic review. J Clin Epidemiol 2023; 165:S0895-4356(23)00276-7. [PMID: 39492557 DOI: 10.1016/j.jclinepi.2023.10.016] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2023] [Revised: 10/09/2023] [Accepted: 10/21/2023] [Indexed: 11/05/2024]
Abstract
OBJECTIVES To systematically review the comparative statistical performance (discrimination and/or calibration) of prognostic clinical prediction models (CPMs) and clinician judgment (CJ). STUDY DESIGN AND SETTING We conducted a systematic review of observational studies in PubMed, Medline, Embase, and CINAHL. Eligible studies reported a direct statistical comparison between prognostic CPMs and CJ. Risk of bias was assessed using the PROBAST tool. RESULTS We identified 41 studies, most with high risk of bias (39 studies). Of these 41 studies, 39 examined discrimination and 12 assessed calibration. Prognostic CPMs had a median AUC of 0.73 (IQR, 0.62-0.81), while CJ had a median AUC of 0.71 (IQR, 0.62-0.81). Twenty-nine studies provided 124 discrimination metrics useful for comparative analysis. Among these, 58 (46.7%) found no significant difference between prognostic CPMs and CJ (p > 0.05); 31 (25%) favored prognostic CPMs, and 35 (28.2%) favored CJ. Four studies compared calibration, showing better performance for prognostic CPMs. CONCLUSIONS CJ frequently demonstrates comparable, and sometimes superior, discrimination relative to prognostic CPMs, although models outperform CJ on calibration. Studies comparing the performance of prognostic CPMs and CJ require large improvements in reporting.
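For readers unfamiliar with the AUC metric used throughout this review, the sketch below scores a model's risk estimates and clinicians' risk ratings against the same outcomes with scikit-learn; all data are simulated placeholders, not values from any included study.

    import numpy as np
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(0)
    outcome = rng.integers(0, 2, size=200)  # observed events (simulated)
    # risk estimates that track the outcome imperfectly, clipped to [0, 1]
    model_risk = np.clip(0.30 * outcome + rng.normal(0.4, 0.20, size=200), 0, 1)
    clinician_risk = np.clip(0.25 * outcome + rng.normal(0.4, 0.25, size=200), 0, 1)

    print("model AUC:    ", round(roc_auc_score(outcome, model_risk), 2))
    print("clinician AUC:", round(roc_auc_score(outcome, clinician_risk), 2))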
Collapse
Affiliation(s)
- Luis Enrique Colunga-Lozano
- Department of Clinical Medicine, Health Science Center, Universidad de Guadalajara, Guadalajara, Jalisco, México; Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, Ontario, Canada.
| | - Farid Foroutan
- Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, Ontario, Canada
| | - Daniel Rayner
- Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, Ontario, Canada
| | - Christopher De Luca
- Faculty of Science, Schulich School of Medicine & Dentistry, University of Western Ontario, London, Canada
| | | | - Rachel Couban
- Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, Ontario, Canada
| | - Quazi Ibrahim
- Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, Ontario, Canada
| | - Gordon Guyatt
- Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, Ontario, Canada
| |
Collapse
|
24
|
Tomko RL, Wolf BJ, McClure EA, Carpenter MJ, Magruder KM, Squeglia LM, Gray KM. Who responds to a multi-component treatment for cannabis use disorder? Using multivariable and machine learning models to classify treatment responders and non-responders. Addiction 2023; 118:1965-1974. [PMID: 37132085 PMCID: PMC10524796 DOI: 10.1111/add.16226] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/28/2022] [Accepted: 04/13/2023] [Indexed: 05/04/2023]
Abstract
BACKGROUND AND AIMS Treatments for cannabis use disorder (CUD) have limited efficacy, and little is known about who responds to existing treatments. Accurately predicting who will respond to treatment can improve clinical decision-making by allowing clinicians to offer the most appropriate level and type of care. This study aimed to determine whether multivariable/machine learning models can be used to classify CUD treatment responders versus non-responders. METHODS This secondary analysis used data from a National Drug Abuse Treatment Clinical Trials Network multi-site outpatient clinical trial in the United States. Adults with CUD (n = 302) received 12 weeks of contingency management and brief cessation counseling and were randomized to receive additionally either (1) N-acetylcysteine or (2) placebo. Multivariable/machine learning models were used to classify treatment responders (i.e., two consecutive negative urine cannabinoid tests or a 50% reduction in days of use) versus non-responders using baseline demographic, medical, psychiatric, and substance use information. RESULTS Prediction performance for the various machine learning and regression prediction models yielded areas under the curve (AUCs) >0.70 for four models (0.72-0.77), with support vector machine models having the highest overall accuracy (73%; 95% CI = 68-78%) and AUC (0.77; 95% CI = 0.72, 0.83). Fourteen variables were retained in at least three of the four top models, including demographic (ethnicity, education), medical (diastolic/systolic blood pressure, overall health, neurological diagnosis), psychiatric (depressive symptoms, generalized anxiety disorder, antisocial personality disorder), and substance use (tobacco smoker, baseline cannabinoid level, amphetamine use, age of experimentation with other substances, cannabis withdrawal intensity) characteristics. CONCLUSIONS Multivariable/machine learning models can improve on chance prediction of treatment response to outpatient cannabis use disorder treatment, although further improvements in prediction performance are likely necessary for decisions about clinical care.
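A minimal sketch of the modeling approach described above (an SVM classifier scored by cross-validated AUC), using scikit-learn on simulated stand-in data; the feature matrix, labels, and settings are illustrative assumptions, not the trial's data or pipeline.

    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    rng = np.random.default_rng(1)
    X = rng.normal(size=(302, 14))  # 302 participants, 14 baseline predictors
    y = (X[:, 0] + rng.normal(size=302) > 0).astype(int)  # responder labels

    clf = make_pipeline(StandardScaler(), SVC(probability=True))
    aucs = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
    print(f"cross-validated AUC: {aucs.mean():.2f} (+/- {aucs.std():.2f})")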
Collapse
Affiliation(s)
- Rachel L. Tomko
- Department of Psychiatry and Behavioral Sciences, Medical University of South Carolina, Charleston, SC, USA
| | - Bethany J. Wolf
- Department of Public Health Sciences, Medical University of South Carolina, Charleston, SC, USA
| | - Erin A. McClure
- Department of Psychiatry and Behavioral Sciences, Medical University of South Carolina, Charleston, SC, USA
| | - Matthew J. Carpenter
- Department of Psychiatry and Behavioral Sciences, Medical University of South Carolina, Charleston, SC, USA
- Department of Public Health Sciences, Medical University of South Carolina, Charleston, SC, USA
- Hollings Cancer Center, Medical University of South Carolina, Charleston, SC, USA
| | - Kathryn M. Magruder
- Department of Psychiatry and Behavioral Sciences, Medical University of South Carolina, Charleston, SC, USA
| | - Lindsay M. Squeglia
- Department of Psychiatry and Behavioral Sciences, Medical University of South Carolina, Charleston, SC, USA
| | - Kevin M. Gray
- Department of Psychiatry and Behavioral Sciences, Medical University of South Carolina, Charleston, SC, USA
| |
Collapse
|
25
|
Lehmann RJB, Schäfer T, Helmus LM, Henniges J, Fleischhauer M. Same Score, Different Audience, Different Message: Perceptions of Sex Offense Risk Depend on Static-99R Risk Level and Personality Factors of the Recipient. SEXUAL ABUSE: A JOURNAL OF RESEARCH AND TREATMENT 2023; 35:863-895. [PMID: 36720719 DOI: 10.1177/10790632221148667] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/18/2023]
Abstract
There are multiple ways to report risk scale results. Varela et al. (2014) found that Static-99R results were interpreted differently by prospective jurors based on risk level (high vs. low) and an interaction between risk level and risk communication format (categorical, absolute estimate, and risk ratio). We adapted and extended Varela et al.'s (2014) study by using updated Static-99R norms, recruiting a population-wide sample (n = 166), and adding variables assessing the personality factors 'cognitive motivation' (i.e., need for cognition) and 'attitudinal affect' (i.e., attitudes toward sex offenders, authoritarianism). We found a main effect of risk level and no effect of either communication format or the interaction between the two. Adding the personality variables increased the explained variance from 9% to 34%, suggesting risk perception may depend more on the personality of the person receiving the information than on the information itself. We also found an interaction between attitudes toward sex offenders and risk level. Our results suggest risk perception might be better understood if personality factors are considered, particularly attitudes toward sex offenders. Because the biases and personality of the person receiving the information are unknown in real-world settings, we argue that sharing multiple methods for communicating risk might be best and most inclusive.
Collapse
Affiliation(s)
| | - Thomas Schäfer
- Department of Psychology, MSB Medical School Berlin, Berlin, Germany
| | - L Maaike Helmus
- Department of Criminology, Simon Fraser University, Vancouver, BC, Canada
| | - Julia Henniges
- Department of Psychology, MSB Medical School Berlin, Berlin, Germany
| | | |
Collapse
|
26
|
Sun S, Wilson CM, Alter S, Ge Y, Hazlett EA, Goodman M, Yehuda R, Galfalvy H, Haghighi F. Association of interleukin-6 with suicidal ideation in veterans: a longitudinal perspective. Front Psychiatry 2023; 14:1231031. [PMID: 37779624 PMCID: PMC10540304 DOI: 10.3389/fpsyt.2023.1231031] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/29/2023] [Accepted: 09/01/2023] [Indexed: 10/03/2023] Open
Abstract
Introduction Studies showing associations between inflammation and suicide are typically cross-sectional. The present study investigated how cytokine levels track with the presence and severity of suicidal ideation longitudinally. Methods Veterans with a diagnosis of major depressive disorder (MDD) with or without a history of suicide attempt (MDD/SA, n = 38; MDD/NS, n = 41) and non-psychiatric non-attempter controls (HC, n = 33) were recruited; the MDD/SA and HC groups were followed longitudinally at 3 months and 6 months. Blood plasma was collected and processed using Luminex Immunology Multiplex technology. Results Significant differences in depression severity (BDI) and suicidal ideation severity (SSI) were observed across all groups at study entry, with the MDD/SA group scoring highest, followed by MDD/NS and HC, respectively. Cytokines IL-1β, IL-4, TNF-α, IFN-γ, and IL-6 were examined at study entry and longitudinally, with IL-6 levels differing significantly across the groups (p = 0.0123) at study entry. Significant differences in changes in cytokine levels between depressed attempters and the control group were detected for IL-6 (interaction F(1, 91.77) = 5.58, p = 0.0203) and TNF-α (F(1, 101.73) = 4.69, p = 0.0327). However, only depressed attempters showed a significant change in IL-6 and TNF-α levels, decreasing over time [IL-6: b = -0.04, 95% CI = (-0.08, -0.01), p = 0.0245; TNF-α: b = -0.02, 95% CI = (-0.04, -0.01), p = 0.0196]. Although IL-6 levels were not predictive of the presence of suicidal ideation [OR = 1.34, 95% CI = (0.77, 2.33), p = 0.3067], IL-6 levels were significantly associated with suicidal ideation severity (b = 0.19, p = 0.0422). Discussion IL-6 was not associated with the presence of suicidal ideation. IL-6 was, however, associated with severity of ideation, suggesting that IL-6 may be useful in clinical practice as an objective marker of heightened suicide risk.
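Group-by-time interactions of the kind reported here are commonly estimated with linear mixed models. A minimal sketch with statsmodels, in which subjects, visits, and effect sizes are all simulated placeholders rather than the study's data:

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(5)
    subject = np.repeat(np.arange(60), 3)              # 60 participants, 3 visits
    months = np.tile([0, 3, 6], 60)                    # baseline, 3 and 6 months
    group = np.repeat(rng.integers(0, 2, size=60), 3)  # 1 = attempter group
    il6 = 2.0 - 0.04 * months * group + rng.normal(0, 0.5, size=180)

    df = pd.DataFrame({"subject": subject, "months": months,
                       "group": group, "il6": il6})
    # random intercept per subject; months:group is the interaction of interest
    fit = smf.mixedlm("il6 ~ months * group", data=df, groups=df["subject"]).fit()
    print(fit.summary())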
Collapse
Affiliation(s)
- Shengnan Sun
- Nash Family Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- James J. Peters VAMC, Bronx, NY, United States
| | - Caroline M. Wilson
- Nash Family Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- James J. Peters VAMC, Bronx, NY, United States
| | | | - Yongchao Ge
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, United States
| | - Erin A. Hazlett
- James J. Peters VAMC, Bronx, NY, United States
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, United States
| | - Marianne Goodman
- James J. Peters VAMC, Bronx, NY, United States
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, United States
| | - Rachel Yehuda
- James J. Peters VAMC, Bronx, NY, United States
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, United States
| | - Hanga Galfalvy
- James J. Peters VAMC, Bronx, NY, United States
- Department of Psychiatry and Department of Biostatistics, Columbia University, New York, NY, United States
| | - Fatemeh Haghighi
- Nash Family Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- James J. Peters VAMC, Bronx, NY, United States
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, United States
| |
Collapse
|
27
|
Kivisto AJ, Guynn A, Jenson H, Knowles E, Magham PS, Miner C, Scelsi K, Staats MP. Intelligence is a poor predictor of nonrestorability of competence to stand trial. APPLIED NEUROPSYCHOLOGY. ADULT 2023:1-10. [PMID: 37672504 DOI: 10.1080/23279095.2023.2253949] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/08/2023]
Abstract
Criminal defendants found incompetent to stand trial (IST) may only be committed for competency restoration if their successful restoration is considered likely and when this aim can be met within a "reasonable" period of time. In this study, we evaluated the predictive validity and test accuracy of standardized intelligence testing for the classification of nonrestorability in a sample of 293 male patients adjudicated IST and committed for inpatient restoration. At 90 days, 17.0% of cases with FSIQ scores within one standard deviation of the mean were unrestored, and nonrestoration rates increased with each additional FSIQ standard-deviation decrement, to 29.5%, 38.8%, and 59.5%. Time-to-event analyses found that whereas half of patients with FSIQ scores of 56 or higher would be predicted to be restored within 64 days, the median restoration timeline was 145 days for patients with FSIQ scores of 55 or below. Positive predictive values associated with the range of possible FSIQ scores were uniformly low at modeled nonrestoration prevalence rates of 5%, 15%, and 25%, rarely exceeding chance levels. We conclude that although FSIQ scores are relevant to predictions of nonrestorability, particularly when scores are at least three standard deviations below average, the accuracy of individual FSIQ-based predictions of nonrestorability is limited.
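The base-rate dependence of positive predictive values noted above follows directly from Bayes' rule, as this small sketch shows; the sensitivity and specificity values are hypothetical, chosen only to illustrate the pattern, not taken from the study.

    def ppv(sensitivity, specificity, prevalence):
        # proportion of positive predictions that are true positives
        true_pos = sensitivity * prevalence
        false_pos = (1 - specificity) * (1 - prevalence)
        return true_pos / (true_pos + false_pos)

    for prev in (0.05, 0.15, 0.25):  # modeled nonrestoration base rates
        print(f"prevalence {prev:.0%}: PPV = {ppv(0.60, 0.80, prev):.2f}")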
Collapse
Affiliation(s)
- Aaron J Kivisto
- Graduate Department of Clinical Psychology, University of Indianapolis, Indianapolis, IN, USA
| | - Alexis Guynn
- Graduate Department of Clinical Psychology, University of Indianapolis, Indianapolis, IN, USA
| | - Hallie Jenson
- Graduate Department of Clinical Psychology, University of Indianapolis, Indianapolis, IN, USA
| | - Emma Knowles
- Graduate Department of Clinical Psychology, University of Indianapolis, Indianapolis, IN, USA
| | - Pragati Sai Magham
- Graduate Department of Clinical Psychology, University of Indianapolis, Indianapolis, IN, USA
| | - Courtney Miner
- Graduate Department of Clinical Psychology, University of Indianapolis, Indianapolis, IN, USA
| | - Keana Scelsi
- Graduate Department of Clinical Psychology, University of Indianapolis, Indianapolis, IN, USA
| | - Megan Porter Staats
- Department of Psychiatry and Health Behavior, Augusta University, Augusta, GA, USA
| |
Collapse
|
28
|
Faust D. Invited Commentary: Advancing but not yet Advanced: Assessment of Effort/Malingering in Forensic and Clinical Settings. Neuropsychol Rev 2023; 33:628-642. [PMID: 37594693 DOI: 10.1007/s11065-023-09605-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2023] [Accepted: 05/02/2023] [Indexed: 08/19/2023]
Abstract
Neuropsychologists' conclusions and courtroom testimony on malingering can have a profound impact. Intensive and ingenious research has advanced our capacity to identify both insufficient and sufficient effort and thus make worthy contributions to just conflict resolution. Nevertheless, given multiple converging factors, such as misleadingly high accuracy rates in many studies, practitioners may well develop inflated confidence in methods for evaluating effort/malingering. Considerable research shows that overconfidence often increases diagnostic and predictive error and may lead to fixed conclusions when caution is better advised. Leonhard's work thus performs an important service by alerting us to methodological considerations and shortcomings that can generate misimpressions about the efficacy of effort/malingering assessment. The present commentary covers various additional complicating factors in malingering assessment, including other factors that also inflate confidence; subtle and perhaps underappreciated methodological flaws that are inversely related to positive study outcomes (i.e., the worse the flaws, the better the methods appear to be); oversimplified classification schemes for studying and evaluating effort that overlook, for example, common mixed presentations (e.g., malingering and genuinely injured); and the need to expand research across a greater range and severity of neuropsychological conditions and more diverse groups. More generally, although endorsing various points that Leonhard raises, a number of questions and concerns are presented, such as those regarding methods for calculating the impact of case exclusions in studies. Ultimately, although Leonhard's conclusions may be more negative than is justified, it seems fair to categorize methods for assessing malingering/effort as advancing, but not yet advanced, with much more needing to be done to approach the latter status.
Collapse
Affiliation(s)
- David Faust
- Department of Psychology, University of Rhode Island, 142 Flagg Rd., Kingston, RI, 02881, USA.
- Department of Psychiatry and Human Behavior, Warren Alpert Medical School of Brown University, Providence, RI, USA.
| |
Collapse
|
29
|
Cohen IG, Babic B, Gerke S, Xia Q, Evgeniou T, Wertenbroch K. How AI can learn from the law: putting humans in the loop only on appeal. NPJ Digit Med 2023; 6:160. [PMID: 37626155 PMCID: PMC10457290 DOI: 10.1038/s41746-023-00906-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2023] [Accepted: 08/10/2023] [Indexed: 08/27/2023] Open
Abstract
While the literature on putting a "human in the loop" in artificial intelligence (AI) and machine learning (ML) has grown significantly, limited attention has been paid to how human expertise ought to be combined with AI/ML judgments. This design question arises because of the ubiquity and quantity of algorithmic decisions being made today in the face of widespread public reluctance to forgo human expert judgment. To resolve this conflict, we propose that human expert judges be included via appeals processes for review of algorithmic decisions. Thus, the human intervenes only in a limited number of cases and only after an initial AI/ML judgment has been made. Based on an analogy with appellate processes in judiciary decision-making, we argue that this is, in many respects, a more efficient way to divide the labor between a human and a machine. Human reviewers can add more nuanced clinical, moral, or legal reasoning, and they can consider case-specific information that is not easily quantified and, as such, not available to the AI/ML at an initial stage. In doing so, the human can serve as a crucial error correction check on the AI/ML, while retaining much of the efficiency of AI/ML's use in the decision-making process. In this paper, we develop these widely applicable arguments while focusing primarily on examples from the use of AI/ML in medicine, including organ allocation, fertility care, and hospital readmission.
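The appeals design the authors propose can be expressed as a simple control flow: the model decides every case, and a human reviews only appealed ones. The sketch below is an illustrative rendering of that idea; the types, names, and threshold are invented, not from the paper.

    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class Decision:
        approved: bool
        source: str  # "model" or "human reviewer"

    def decide(case: dict, model: Callable, appealed: bool,
               review: Callable) -> Decision:
        initial = Decision(approved=model(case), source="model")
        if not appealed:
            return initial  # most cases end here, preserving efficiency
        # human expertise enters only on appeal, as an error-correction check
        return Decision(approved=review(case, initial), source="human reviewer")

    verdict = decide({"score": 0.42}, model=lambda c: c["score"] > 0.5,
                     appealed=True, review=lambda c, d: True)
    print(verdict)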
Collapse
Affiliation(s)
- I Glenn Cohen
- The Petrie-Flom Center for Health Law Policy, Biotechnology, and Bioethics at Harvard Law School, The Project on Precision Medicine, Artificial Intelligence, and the Law (PMAIL), Cambridge, MA, USA.
- Harvard Law School, Cambridge, MA, USA.
| | | | - Sara Gerke
- The Petrie-Flom Center for Health Law Policy, Biotechnology, and Bioethics at Harvard Law School, The Project on Precision Medicine, Artificial Intelligence, and the Law (PMAIL), Cambridge, MA, USA
- Penn State Dickinson Law, Carlisle, PA, USA
| | - Qiong Xia
- INSEAD, Fontainebleau, France
- INSEAD, Singapore, Singapore
| | | | | |
Collapse
|
30
|
Neumann M, Niessen ASM, Meijer RR. Predicting decision-makers’ algorithm use. COMPUTERS IN HUMAN BEHAVIOR 2023. [DOI: 10.1016/j.chb.2023.107759] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/30/2023]
|
31
|
Van der Put CE, Stolwijk IJ, Staal IIE. Early detection of risk for maltreatment within Dutch preventive child health care: A proxy-based evaluation of the long-term predictive validity of the SPARK method. CHILD ABUSE & NEGLECT 2023; 143:106316. [PMID: 37421774 DOI: 10.1016/j.chiabu.2023.106316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/22/2022] [Revised: 05/22/2023] [Accepted: 06/19/2023] [Indexed: 07/10/2023]
Abstract
BACKGROUND For effective prevention of child maltreatment, it is crucial that risk factors for child maltreatment are identified as early as possible. In Dutch preventive child healthcare, the SPARK method is used for this purpose. OBJECTIVE The current study investigated the predictive validity of the SPARK method for predicting child protection activities, as a proxy for child maltreatment, and whether the estimation can be improved with an actuarial module. PARTICIPANTS AND SETTING Participants included a community sample of 1582 children approximately 18 months old for whom the SPARK was administered during well-child visits at home (51%) or at the well-baby clinic (49%). METHODS SPARK measurements were linked to data on child protection orders and residential youth care over a 10-year follow-up period. The predictive validity was evaluated using area under the receiver operating characteristic curve (AUC) values. RESULTS Results showed good predictive validity for the SPARK clinical risk assessment (AUC = 0.723; large effect). The actuarial module led to a significant improvement in predictive validity (AUC = 0.802; large effect), z = 2.05, p = .04. CONCLUSION These results show that the SPARK is suitable for estimating the risk of child protection activities and that the actuarial module is a valuable addition. The SPARK can be used to support professionals in preventive child healthcare in deciding on appropriate follow-up actions.
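One generic way to test whether an added module improves AUC, as reported above, is a bootstrap comparison of the two scores on the same outcomes. The study reports a z-test; the bootstrap here is a stand-in technique, and all data are simulated placeholders.

    import numpy as np
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(2)
    n = 1000
    outcome = rng.integers(0, 2, size=n)            # simulated outcomes
    clinical = 0.6 * outcome + rng.normal(size=n)   # clinical risk score
    actuarial = 0.9 * outcome + rng.normal(size=n)  # actuarial module score

    diffs = []
    for _ in range(2000):
        idx = rng.integers(0, n, size=n)  # resample cases with replacement
        if outcome[idx].min() == outcome[idx].max():
            continue  # skip resamples lacking both classes
        diffs.append(roc_auc_score(outcome[idx], actuarial[idx])
                     - roc_auc_score(outcome[idx], clinical[idx]))
    lo, hi = np.percentile(diffs, [2.5, 97.5])
    print(f"AUC difference, 95% bootstrap CI: ({lo:.3f}, {hi:.3f})")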
Collapse
Affiliation(s)
- C E Van der Put
- Research Institute of Child Development and Education, University of Amsterdam, the Netherlands.
| | | | - I I E Staal
- Department of Preventive Child Health Care, Municipal Health Service Zeeland, Goes, the Netherlands
| |
Collapse
|
32
|
Abstract
The sex difference in the prevalence of autism spectrum disorder (ASD) may be magnified by sex differences on diagnostic measures. The current study compared autistic males and females on items of the gold-standard diagnostic measure, the Autism Diagnostic Observation Schedule, Second Edition (ADOS-2). In a sample of 8- to 17-year-old autistic individuals from research (n = 229) and clinical settings (n = 238), females were less likely to show atypicalities on most items related to social-communication behaviors and on total and subscale scores. When controlling for overall intensity of symptomatology, no sex differences survived statistical corrections. Diagnostic criteria and/or gold-standard assessments may be less sensitive to female presentations of ASD, and/or autistic females may exhibit fewer or less intense behaviors characteristic of ASD.
Collapse
Affiliation(s)
- Hannah M Rea
- Psychiatry & Behavioral Science Department, University of Washington, Seattle, WA, 98195, USA
| | - Roald A Øien
- UiT-The Arctic University of Norway, PB 6060, 9037, Tromsø, Norway
- Child Study Center, Yale School of Medicine, New Haven, CT, 06510, USA
| | - Frederick Shic
- Center On Child Health, Behavior & Development, Seattle Children's Research Institute, Seattle, WA, 98105, USA
- Department of Pediatrics, University of Washington, Seattle, WA, 98195, USA
| | - Sara Jane Webb
- Psychiatry & Behavioral Science Department, University of Washington, Seattle, WA, 98195, USA
- Center On Child Health, Behavior & Development, Seattle Children's Research Institute, Seattle, WA, 98105, USA
| | - Allison B Ratto
- Children's National Medical Center, 15245 Shady Grove Road, Suite 350, Rockville, MD, 20850, USA.
| |
Collapse
|
33
|
Mulligan CA, Ayoub JL. Remote Assessment: Origins, Benefits, and Concerns. J Intell 2023; 11:114. [PMID: 37367516 DOI: 10.3390/jintelligence11060114] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/04/2023] [Revised: 06/07/2023] [Accepted: 06/07/2023] [Indexed: 06/28/2023] Open
Abstract
Although guidelines surrounding COVID-19 have relaxed and school-aged students are no longer required to wear masks and social distance in schools, we have become, as a nation and as a society, more comfortable working from home, learning online, and using technology as a platform to communicate ubiquitously across ecological environments. In the school psychology community, we have also become more familiar with assessing students virtually, but at what cost? While there is research suggesting score equivalency between virtual and in-person assessment, score equivalency alone is not sufficient to validate a measure or an adaptation thereof. Furthermore, the majority of psychological measures on the market are normed for in-person administration. In this paper, we will not only review the pitfalls of reliability and validity but will also unpack the ethics of remote assessment as an equitable practice.
Collapse
Affiliation(s)
- Christy A Mulligan
- Derner School of Psychology, Adelphi University, 1 South Avenue, Garden City, NY 11530, USA
| | - Justin L Ayoub
- Nassau BOCES, 71 Clinton Road, P.O. Box 9195, Garden City, NY 11530, USA
| |
Collapse
|
34
|
Lüdin D, Donath L, Romann M. Disagreement between talent scouts: Implications for improved talent assessment in youth football. J Sports Sci 2023; 41:758-765. [PMID: 37490515 DOI: 10.1080/02640414.2023.2239614] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2022] [Accepted: 07/14/2023] [Indexed: 07/27/2023]
Abstract
Reliable talent identification and selection (TID) processes are prerequisites for accurately selecting young athletes with the most potential for talent development programmes. Knowledge about the agreement between scouts, who play a key role in initial TID in football, is lacking. Therefore, the aim of the present study was to evaluate the agreement within four groups of a total of n = 83 talent scouts during rank assessment of under-11 male youth football players (n = 24, age = 11.0 ± 0.3 years) and to describe scouts' underlying approach to assessing talent. Krippendorff's α estimates indicated disagreement among scouts' rankings within all groups of scouts (αA = 0.09, αB = 0.03, αC = 0.05, αD = 0.02). Scouts reported relying mainly on their overall impression when forming their final prediction about a player. Reports of a consistent, structured approach were less prevalent. Taken together, the results indicated that different approaches to TID may be associated with disagreement on selection decisions. To overcome disagreement in TID, football organisations are encouraged to establish a more structured process. Future research on the elaboration and benefit of ranking guidelines incorporating decomposed and independently evaluated sub-predictors is recommended to improve the reliability of TID.
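Krippendorff's α, used above to quantify scout agreement, can be computed with the third-party krippendorff package (pip install krippendorff); the rating matrix below is a hypothetical example, not the study's data.

    import numpy as np
    import krippendorff

    # rows are scouts, columns are players, values are assigned ranks
    ratings = np.array([
        [1, 2, 3, 4, 5],
        [2, 1, 5, 3, 4],
        [4, 5, 1, 2, 3],
    ])
    alpha = krippendorff.alpha(reliability_data=ratings,
                               level_of_measurement="ordinal")
    print(f"Krippendorff's alpha = {alpha:.2f}")  # values near 0 mean disagreement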
Collapse
Affiliation(s)
- Dennis Lüdin
- Swiss Federal Institute of Sport Magglingen, Department of Elite Sport, Magglingen, Switzerland
| | - Lars Donath
- Department of Intervention Research in Exercise Training, German Sport University Cologne, Cologne, Germany
| | - Michael Romann
- Swiss Federal Institute of Sport Magglingen, Department of Elite Sport, Magglingen, Switzerland
| |
Collapse
|
35
|
Lee J, Bissell K. User agency–based versus machine agency–based misinformation interventions: The effects of commenting and AI fact-checking labeling on attitudes toward the COVID-19 vaccination. NEW MEDIA & SOCIETY 2023; 26:14614448231163228. [PMCID: PMC10113910 DOI: 10.1177/14614448231163228] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Subscribe] [Scholar Register] [Indexed: 05/29/2025]
Abstract
This study aimed to examine the effects of commenting on a Facebook misinformation post by comparing a user agency–based intervention and machine agency–based intervention in the form of artificial intelligence (AI) fact-checking labeling on attitudes toward the COVID-19 vaccination. We found that both interventions were effective at promoting positive attitudes toward vaccination compared to the misinformation-only condition. However, the intervention effects manifested differently depending on participants’ residential locations, such that the commenting intervention emerged as a promising tool for suburban participants. The effectiveness of the AI fact-checking labeling intervention was pronounced for urban populations. Neither of the fact-checking interventions showed salient effects with the rural population. These findings suggest that although user agency- and machine agency–based interventions might have potential against misinformation, these interventions should be developed in a more sophisticated way to address the unequal effects among populations in different geographic locations.
Collapse
|
36
|
Benjamin DM, Morstatter F, Abbas AE, Abeliuk A, Atanasov P, Bennett S, Beger A, Birari S, Budescu DV, Catasta M, Ferrara E, Haravitch L, Himmelstein M, Hossain KSMT, Huang Y, Jin W, Joseph R, Leskovec J, Matsui A, Mirtaheri M, Ren X, Satyukov G, Sethi R, Singh A, Sosic R, Steyvers M, Szekely PA, Ward MD, Galstyan A. Hybrid forecasting of geopolitical events. AI MAG 2023. [DOI: 10.1002/aaai.12085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/31/2023]
|
37
|
Schaap G, Bosse T, Hendriks Vettehen P. The ABC of algorithmic aversion: not agent, but benefits and control determine the acceptance of automated decision-making. AI & SOCIETY 2023. [DOI: 10.1007/s00146-023-01649-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/30/2023]
Abstract
While algorithmic decision-making (ADM) is projected to increase exponentially in the coming decades, the academic debate on whether people are ready to accept, trust, and use ADM as opposed to human decision-making is ongoing. The current research aims at reconciling conflicting findings on ‘algorithmic aversion’ in the literature. It does so by investigating algorithmic aversion while controlling for two important characteristics that are often associated with ADM: increased benefits (monetary and accuracy) and decreased user control. Across three high-powered (total N = 1192), preregistered 2 (agent: algorithm/human) × 2 (benefits: high/low) × 2 (control: user control/no control) between-subjects experiments, and two domains (finance and dating), the results were quite consistent: there is little evidence for a default aversion against algorithms and in favor of human decision makers. Instead, users accept or reject decisions and decisional agents based on their predicted benefits and the ability to exercise control over the decision.
Collapse
|
38
|
Bonder T, Erev I, Ludvig EA, Roth Y. The common origin of both oversimplified and overly complex decision rules. JOURNAL OF BEHAVIORAL DECISION MAKING 2023. [DOI: 10.1002/bdm.2321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/18/2023]
Affiliation(s)
- Taly Bonder
- Data and Decision Sciences, Technion-Israel Institute of Technology, Haifa, Israel
| | - Ido Erev
- Data and Decision Sciences, Technion-Israel Institute of Technology, Haifa, Israel
| | | | - Yefim Roth
- Human Services, University of Haifa, Haifa, Israel
| |
Collapse
|
39
|
Monaghan C, Bizumic B. Dimensional models of personality disorders: Challenges and opportunities. Front Psychiatry 2023; 14:1098452. [PMID: 36960458 PMCID: PMC10028270 DOI: 10.3389/fpsyt.2023.1098452] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/14/2022] [Accepted: 02/03/2023] [Indexed: 03/09/2023] Open
Abstract
Categorical models of personality disorders have been beneficial throughout psychiatric history, providing a mechanism for organizing and communicating research and treatment. However, the view that individuals with personality disorders are qualitatively distinct from the general population is no longer tenable. This perspective has amassed steady criticism, ranging from inconsequential to irreconcilable. In response, stronger evidence has accumulated in support of a dimensional perspective that unifies normal and pathological personality on underlying trait continua. Contemporary nosology has largely shifted toward this dimensional perspective, yet broader adoption within the public lexicon and routine clinical practice appears slow. This review focuses on the challenges, and the related opportunities, of moving toward dimensional models in personality disorder research and practice. First, there is a need for ongoing development of a broader array of measurement methods, ideally facilitating multimethod assessments that reduce the biases associated with any single methodology. These efforts should also include measurement across both poles of each trait, intensive longitudinal studies, and deeper consideration of social desirability. Second, wider communication and training in dimensional approaches are needed for individuals working in mental health. This will require clear demonstrations of incremental treatment efficacy and structured public health rebates. Third, we should embrace cultural and geographic diversity, and investigate how unifying humanity may reduce the stigma and shame currently generated by arbitrarily labeling an individual's personality as normal or abnormal. This review aims to organize ongoing research efforts toward broader and routine usage of dimensional perspectives within research and clinical spaces.
Collapse
Affiliation(s)
- Conal Monaghan
- Research School of Psychology, Australian National University, Canberra, ACT, Australia
| | | |
Collapse
|
40
|
Neumann M, Niessen ASM, Hurks PPM, Meijer RR. Holistic and mechanical combination in psychological assessment: Why algorithms are underutilized and what is needed to increase their use. INTERNATIONAL JOURNAL OF SELECTION AND ASSESSMENT 2023. [DOI: 10.1111/ijsa.12416] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/31/2023]
Affiliation(s)
- Marvin Neumann
- Department of Psychometrics and Statistics, Faculty of Behavioral and Social Sciences, University of Groningen, Groningen, The Netherlands
| | - A. Susan M. Niessen
- Department of Psychometrics and Statistics, Faculty of Behavioral and Social Sciences, University of Groningen, Groningen, The Netherlands
| | - Petra P. M. Hurks
- Department of Neuropsychology and Psychopharmacology, Faculty of Psychology and Neuroscience, University of Maastricht, Maastricht, The Netherlands
| | - Rob R. Meijer
- Department of Psychometrics and Statistics, Faculty of Behavioral and Social Sciences, University of Groningen, Groningen, The Netherlands
| |
Collapse
|
41
|
Woo SE, LeBreton JM, Keith MG, Tay L. Bias, Fairness, and Validity in Graduate-School Admissions: A Psychometric Perspective. PERSPECTIVES ON PSYCHOLOGICAL SCIENCE 2023; 18:3-31. [PMID: 35687736 DOI: 10.1177/17456916211055374] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/31/2023]
Abstract
As many schools and departments are considering the removal of the Graduate Record Examination (GRE) from their graduate-school admission processes to enhance equity and diversity in higher education, controversies arise. From a psychometric perspective, we see a critical need for clarifying the meanings of measurement "bias" and "fairness" to create common ground for constructive discussions within the field of psychology, higher education, and beyond. We critically evaluate six major sources of information that are widely used to help inform graduate-school admissions decisions: grade point average, personal statements, resumes/curriculum vitae, letters of recommendation, interviews, and GRE. We review empirical research evidence available to date on the validity, bias, and fairness issues associated with each of these admission measures and identify potential issues that have been overlooked in the literature. We conclude by suggesting several directions for practical steps to improve the current admissions decisions and highlighting areas in which future research would be beneficial.
Collapse
Affiliation(s)
- Sang Eun Woo
- Department of Psychological Sciences, Purdue University
| | | | | | - Louis Tay
- Department of Psychological Sciences, Purdue University
| |
Collapse
|
42
|
Attema AE, Galizzi MM, Groß M, Hennig-Schmidt H, Karay Y, L'Haridon O, Wiesen D. The formation of physician altruism. JOURNAL OF HEALTH ECONOMICS 2023; 87:102716. [PMID: 36603361 DOI: 10.1016/j.jhealeco.2022.102716] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/05/2022] [Revised: 09/21/2022] [Accepted: 11/29/2022] [Indexed: 06/17/2023]
Abstract
We study how patient-regarding altruism is formed by medical education. We elicit and structurally estimate altruistic preferences using experimental data from a large sample of medical students (N = 733) in Germany at different stages of progress in their studies. The estimates reveal substantial heterogeneity in the altruistic preferences of medical students. Patient-regarding altruism is highest for freshmen, declines significantly over the course of medical studies, and tends to increase again for final-year students, who assist in clinical practice. Patient-regarding altruism is also higher for females and positively associated with general altruism. Altruistic medical students have gained prior practical experience in healthcare, have lower income expectations, and are more likely to choose surgery and pediatrics as their preferred specialties.
Collapse
Affiliation(s)
- Arthur E Attema
- Erasmus School of Health Policy & Management, Erasmus University Rotterdam, The Netherlands.
| | - Matteo M Galizzi
- Department of Psychological and Behavioural Science, London School of Economics and Political Science, UK.
| | - Mona Groß
- Department of Business Administration and Healthcare Management, University of Cologne, Germany.
| | - Heike Hennig-Schmidt
- Laboratory for Experimental Economics, Department of Economics, University of Bonn, Germany.
| | | | - Olivier L'Haridon
- Center for Research in Economics and Management (CREM), University of Rennes 1, France; Institut Universitaire de France, France.
| | - Daniel Wiesen
- Department of Business Administration and Healthcare Management, University of Cologne, Germany.
| |
Collapse
|
43
|
Filiz I, Judek JR, Lorenz M, Spiwoks M. The extent of algorithm aversion in decision-making situations with varying gravity. PLoS One 2023; 18:e0278751. [PMID: 36809526 PMCID: PMC9942970 DOI: 10.1371/journal.pone.0278751] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2021] [Accepted: 11/15/2022] [Indexed: 02/23/2023] Open
Abstract
Algorithms already carry out many tasks more reliably than human experts. Nevertheless, some subjects have an aversion towards algorithms. In some decision-making situations an error can have serious consequences; in others, not. In the context of a framing experiment, we examine the connection between the consequences of a decision-making situation and the frequency of algorithm aversion. The experiment shows that the more serious the consequences of a decision, the more frequently algorithm aversion occurs. Particularly in the case of very important decisions, algorithm aversion thus reduces the probability of success. This can be described as the tragedy of algorithm aversion.
Collapse
Affiliation(s)
- Ibrahim Filiz
- Faculty of Business, Ostfalia University of Applied Sciences, Wolfsburg, Germany
| | - Jan René Judek
- Faculty of Business, Ostfalia University of Applied Sciences, Wolfsburg, Germany
| | - Marco Lorenz
- Faculty of Economic Sciences, Georg August University Göttingen, Göttingen, Germany
| | - Markus Spiwoks
- Faculty of Business, Ostfalia University of Applied Sciences, Wolfsburg, Germany
| |
Collapse
|
44
|
Rost N, Binder EB, Brückl TM. Predicting treatment outcome in depression: an introduction into current concepts and challenges. Eur Arch Psychiatry Clin Neurosci 2023; 273:113-127. [PMID: 35587279 PMCID: PMC9957888 DOI: 10.1007/s00406-022-01418-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/26/2021] [Accepted: 04/11/2022] [Indexed: 12/19/2022]
Abstract
Improving response and remission rates in major depressive disorder (MDD) remains an important challenge. Matching patients to the treatment they will most likely respond to should be the ultimate goal. Even though numerous studies have investigated patient-specific indicators of treatment efficacy, no (bio)markers or empirical tests have yet emerged for use in clinical practice. Therefore, clinical decisions regarding the treatment of MDD still have to be made on the basis of questionnaire- or interview-based assessments and general guidelines, without the support of a (laboratory) test. We conducted a narrative review of current approaches to characterizing and predicting outcomes of pharmacological treatments in MDD. We particularly focused on findings from newer computational studies using machine learning and on the resulting implementation into clinical decision support systems. The main issues seem to rest upon the unavailability of robust predictive variables and the lack of application of empirical findings and predictive models in clinical practice. We outline several challenges that need to be tackled at different stages of the translational process, from current concepts and definitions to generalizable prediction models and their successful implementation into digital support systems. By bridging the addressed gaps in translational psychiatric research, advances in data quantity and new technologies may enable the next steps toward precision psychiatry.
Collapse
Affiliation(s)
- Nicolas Rost
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Kraepelinstraße 2-10, 80804 Munich, Germany; International Max Planck Research School for Translational Psychiatry, Munich, Germany.
| | - Elisabeth B. Binder
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Kraepelinstraße 2-10, 80804 Munich, Germany
| | - Tanja M. Brückl
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Kraepelinstraße 2-10, 80804 Munich, Germany
| |
Collapse
|
45
|
Gibbons S, Sinclair CT. Demystifying Prognosis: Understanding the Science and Art of Prognostication. Cancer Treat Res 2023; 187:53-71. [PMID: 37851219 DOI: 10.1007/978-3-031-29923-0_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2023]
Abstract
The science of prognostication is emerging as a vital part of providing goal-concordant patient care. Historically, modern medicine has tended to shy away from approaching prognostication as a core clinical skill, and from prognosis as something to be shared directly with the patient. In recent years, however, the medical field's shift towards a focus on patient autonomy and more openness in matters regarding the end of life has propelled the study of prognostication into a more essential component of patient-centered care. This calls for more emphasis on teaching the science of prognosis and the skill of prognostication as a core part of modern medical education. The following chapter aims to delve into the science of prognostication, explore the methods of formulating a prognosis, and discuss issues surrounding the communication of prognosis.
Collapse
Affiliation(s)
- Shauna Gibbons
- Division of Palliative Medicine, University of Kansas Health System, 4000 Cambridge St, Kansas City, KS, USA.
| | - Christian T Sinclair
- Division of Palliative Medicine, University of Kansas Health System, 4000 Cambridge St, Kansas City, KS, USA
| |
Collapse
|
46
|
Weigard A, Spencer RJ. Benefits and challenges of using logistic regression to assess neuropsychological performance validity: Evidence from a simulation study. Clin Neuropsychol 2023; 37:34-59. [PMID: 35006042 PMCID: PMC9273108 DOI: 10.1080/13854046.2021.2023650] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2021] [Accepted: 12/22/2021] [Indexed: 02/07/2023]
Abstract
Logistic regression (LR) is recognized as a promising method for making decisions about neuropsychological performance validity by integrating information across multiple measures. However, this method has yet to be widely adopted in clinical practice, likely because several open questions remain about its utility relative to simpler methods, its effectiveness across different clinical contexts, and its feasibility at sample sizes common in the field. The current study addresses these questions by assessing classification performance of logistic regression and alternative methods across an array of simulated data sets. We simulated scores of valid and invalid performers on 6 tests designed to mimic the psychometric and distributional properties of real performance validity measures. Out-of-sample predictive performance of LR and a commonly used alternative ("vote counting") was assessed across different base rates, validity measure properties, and sample sizes. LR improved classification accuracy by 2%-12% across simulation conditions, primarily by improving sensitivity. False positives and negatives can be further reduced when LR predictions are interpreted as continuous, rather than binary. LR made robust predictions at sample sizes feasible for neuropsychology research (N = 307) and when as few as 2 tests with good psychometric properties were used. Although training and test data sets of at least several hundred individuals may be required to develop and evaluate LR models for use in clinical practice, LR promises to be an efficient and powerful tool for improving judgements about performance validity. We offer several recommendations for model development and LR interpretation in a clinical setting.
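A minimal sketch of the comparison the study simulates: logistic regression over several validity-test scores versus the classic vote-counting rule (flag when at least two tests are failed). Sample sizes, cutoffs, and effect sizes are illustrative assumptions, not the study's simulation design.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(3)
    n, n_tests = 600, 6
    invalid = rng.integers(0, 2, size=n)  # 1 = invalid performer
    scores = rng.normal(loc=0.8 * invalid[:, None], size=(n, n_tests))

    X_tr, X_te, y_tr, y_te = train_test_split(scores, invalid, random_state=0)

    lr_pred = LogisticRegression().fit(X_tr, y_tr).predict(X_te)

    cutoffs = np.quantile(X_tr, 0.85, axis=0)  # per-test failure cutoffs
    vote_pred = ((X_te > cutoffs).sum(axis=1) >= 2).astype(int)

    print("logistic regression accuracy:", round(accuracy_score(y_te, lr_pred), 2))
    print("vote-counting accuracy:      ", round(accuracy_score(y_te, vote_pred), 2))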
Collapse
Affiliation(s)
| | - Robert J. Spencer
- Department of Psychiatry, University of Michigan
- VA Ann Arbor Healthcare System
| |
Collapse
|
47
|
Connors MH, Large MM. Calibrating violence risk assessments for uncertainty. Gen Psychiatr 2023; 36:e100921. [PMID: 37144159 PMCID: PMC10151861 DOI: 10.1136/gpsych-2022-100921] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/29/2022] [Accepted: 03/26/2023] [Indexed: 05/06/2023] Open
Abstract
Psychiatrists and other mental health clinicians are often tasked with assessing patients' risk of violence. Approaches to this vary and include both unstructured (based on individual clinicians' judgement) and structured methods (based on formalised scoring and algorithms with varying scope for clinicians' judgement). The end result is usually a categorisation of risk, which may, in turn, reference a probability estimate of violence over a certain time period. Research over recent decades has made considerable improvements in refining structured approaches and categorising patients' risk classifications at a group level. The ability, however, to apply these findings clinically to predict the outcomes of individual patients remains contested. In this article, we review methods of assessing violence risk and empirical findings on their predictive validity. We note, in particular, limitations in calibration (accuracy at predicting absolute risk) as distinct from discrimination (accuracy at separating patients by outcome). We also consider clinical applications of these findings, including challenges applying statistics to individual patients, and broader conceptual issues in distinguishing risk and uncertainty. Based on this, we argue that there remain significant limits to assessing violence risk for individuals and that this requires careful consideration in clinical and legal contexts.
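The calibration/discrimination distinction drawn here is easy to demonstrate in code: a model can rank patients well (high AUC) while its absolute probabilities run systematically high. A sketch with simulated data (all numbers are illustrative placeholders):

    import numpy as np
    from sklearn.calibration import calibration_curve
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(4)
    true_p = rng.uniform(0.01, 0.30, size=5000)  # underlying event probabilities
    outcome = rng.binomial(1, true_p)
    predicted = np.clip(2.0 * true_p, 0.0, 1.0)  # same ranking, inflated 2x

    print("AUC (discrimination):", round(roc_auc_score(outcome, predicted), 2))
    frac_observed, mean_predicted = calibration_curve(outcome, predicted, n_bins=5)
    for mp, fo in zip(mean_predicted, frac_observed):
        print(f"predicted {mp:.2f} -> observed {fo:.2f}")  # overestimation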
Collapse
Affiliation(s)
- Michael H Connors
- Centre for Healthy Brain Ageing, University of New South Wales, Sydney, New South Wales, Australia
- Discipline of Psychiatry and Mental Health, University of New South Wales, Sydney, New South Wales, Australia
- Department of Psychiatry, University of Melbourne, Melbourne, Victoria, Australia
| | - Matthew M Large
- Discipline of Psychiatry and Mental Health, University of New South Wales, Sydney, New South Wales, Australia
| |
Collapse
|
48
|
Uher J. Rating scales institutionalise a network of logical errors and conceptual problems in research practices: A rigorous analysis showing ways to tackle psychology's crises. Front Psychol 2022; 13:1009893. [PMID: 36643697 PMCID: PMC9833395 DOI: 10.3389/fpsyg.2022.1009893] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/02/2022] [Accepted: 10/14/2022] [Indexed: 12/29/2022] Open
Abstract
This article explores in depth the metatheoretical and methodological foundations on which rating scales, by their very conception, design, and application, are built, and traces their historical origins. It brings together independent lines of critique from different scholars and disciplines to map out the problem landscape, which centres on the failed distinction between psychology's study phenomena (e.g., experiences, everyday constructs) and the means of their exploration (e.g., terms, data, scientific constructs): psychologists' cardinal error. Rigorous analyses reveal a dense network of 12 complexes of problematic concepts, misconceived assumptions, and fallacies that support each other, making them difficult to identify and recognise by those (unwittingly) relying on them (e.g., various forms of reductionism, logical errors of operationalism, constructification, naïve use of language, quantificationism, statisticism, result-based data generation, misconceived nomotheticism). Through the popularity of rating scales for efficient quantitative data generation, uncritically interpreted as psychological measurement, these problems have become institutionalised in a wide range of research practices and perpetuate psychology's crises (e.g., replication, confidence, validation, generalizability). The article provides the in-depth understanding needed to get to the root of these problems, which preclude not just measurement but also the scientific exploration of psychology's study phenomena and thus its development as a science. From each of the 12 problem complexes, specific theoretical concepts, methodologies, and methods are derived, as well as key directions of development. The analyses, based on three central axioms for transdisciplinary research on individuals ((1) complexity, (2) complementarity, and (3) anthropogenicity), highlight that psychologists must (further) develop an explicit metatheory and unambiguous terminology, as well as concepts and theories that conceive of individuals as living beings: open, self-organising systems with complementary phenomena and dynamic interrelations across their multi-layered systemic contexts. That is, theories not simply of elemental properties and structures but of processes, relations, dynamicity, subjectivity, emergence, catalysis, and transformation. Philosophical and theoretical foundations of approaches suited to exploring these phenomena must be developed together with methods of data generation and methods of data analysis that are appropriately adapted to the peculiarities of psychologists' study phenomena (e.g., intra-individual variation, momentariness, contextuality). Psychology can profit greatly from its unique position at the intersection of many other disciplines and can learn from their advancements to develop research practices that are suited to tackle its crises holistically.
Collapse
Affiliation(s)
- Jana Uher
- School of Human Sciences, University of Greenwich, London, United Kingdom
- London School of Economics, London, United Kingdom
| |
Collapse
|
49
|
Harper CA, Hicks RA. The Effect of Attitudes Towards Individuals with Sexual Convictions on Professional and Student Risk Judgments. SEXUAL ABUSE: A JOURNAL OF RESEARCH AND TREATMENT 2022; 34:948-972. [PMID: 35220820 PMCID: PMC9643808 DOI: 10.1177/10790632211070799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Attitudes towards individuals with sexual convictions are an area of growing research interest, but the effects of such attitudes on professional judgments are largely unexplored. What is known from the existing literature is that attitudes guide the interpretation of sexual crime-related information, which can cascade into biased or heuristically driven judgments. In this study we recruited samples of both students (n = 341) and forensic professionals (n = 186) to explore whether attitudes towards individuals with sexual convictions predicted risk judgments of hypothetical sexual offense scenarios, and whether this relationship is moderated by professional status or perpetrator characteristics. Forensic professionals expressed more positive attitudes overall, but the significant effect of attitudes on risk judgments was consistent between participant groups and was not moderated by perpetrator age or sex. We suggest that relying on attitudes as a basis for risk judgments opens the door to incorrect (and potentially dangerous) decision-making, and we discuss our data in terms of their potential clinical implications. An open-access preprint of this work is available at https://psyarxiv.com/rjt5h/.
Collapse
Affiliation(s)
| | - Rachel A. Hicks
- Nottingham Trent University (UK), Nottingham, UK
- Nottinghamshire Healthcare NHS Foundation Trust, Nottingham, UK
| |
Collapse
|
50
|
Schuringa E, Spreen M, Bogaerts S. Treatment Evaluation in Forensic Psychiatry. Which One Should Be Used: The Clinical Judgment or the Instrument-based Assessment of Change? INTERNATIONAL JOURNAL OF OFFENDER THERAPY AND COMPARATIVE CRIMINOLOGY 2022; 66:1821-1836. [PMID: 34114499 DOI: 10.1177/0306624x211023921] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
In forensic psychiatry, it is common practice to use an unstructured clinical judgment for treatment evaluation. From risk assessment studies, it is known that unstructured clinical judgment is unreliable, and the use of instruments is recommended. This paper explores the clinical judgment of change compared to the change calculated using the Instrument for Forensic Treatment Evaluation (IFTE), in relation to changes in inpatient violence. This study shows that the clinical judgment is much more positive about patients' behavioral changes than the calculated change, and that the calculated change is more in accordance with changes in the occurrence of inpatient violence, suggesting that the calculated change reflects reality more closely than unstructured clinical judgment. Therefore, it is advisable to use the IFTE as a basis for a structured professional judgment in the treatment evaluation of forensic psychiatric patients.
Collapse
Affiliation(s)
| | - Marinus Spreen
- NHL Stenden University of Applied Sciences, Leeuwarden, The Netherlands
| | - Stefan Bogaerts
- Department of Developmental Psychology, Tilburg University, The Netherlands
- Fivoor Research & Treatment Innovation, Poortugaal, The Netherlands
| |
Collapse
|