1. Majdik ZP, Graham SS, Shiva Edward JC, Rodriguez SN, Karnes MS, Jensen JT, Barbour JB, Rousseau JF. Sample Size Considerations for Fine-Tuning Large Language Models for Named Entity Recognition Tasks: Methodological Study. JMIR AI 2024; 3:e52095. [PMID: 38875593 PMCID: PMC11140272 DOI: 10.2196/52095] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 12/13/2023] [Accepted: 03/30/2024] [Indexed: 06/16/2024]
Abstract
BACKGROUND: Large language models (LLMs) have the potential to support promising new applications in health informatics. However, practical data on sample size considerations for fine-tuning LLMs to perform specific tasks in biomedical and health policy contexts are lacking. OBJECTIVE: This study aims to evaluate sample size and sample selection techniques for fine-tuning LLMs to support improved named entity recognition (NER) for a custom data set of conflicts of interest disclosure statements. METHODS: A random sample of 200 disclosure statements was prepared for annotation. All "PERSON" and "ORG" entities were identified by each of the 2 raters, and once appropriate agreement was established, the annotators independently annotated an additional 290 disclosure statements. From the 490 annotated documents, 2500 stratified random samples in different size ranges were drawn. The 2500 training set subsamples were used to fine-tune a selection of language models across 2 model architectures (Bidirectional Encoder Representations from Transformers [BERT] and Generative Pre-trained Transformer [GPT]) for improved NER, and multiple regression was used to assess the relationship between sample size (sentences), entity density (entities per sentence [EPS]), and trained model performance (F1-score). Additionally, single-predictor threshold regression models were used to evaluate the possibility of diminishing marginal returns from increased sample size or entity density. RESULTS: Fine-tuned models ranged in topline NER performance from F1-score=0.79 to F1-score=0.96 across architectures. Two-predictor multiple linear regression models were statistically significant with multiple R² ranging from 0.6057 to 0.7896 (all P<.001). EPS and the number of sentences were significant predictors of F1-scores in all cases (P<.001), except for the GPT-2_large model, where EPS was not a significant predictor (P=.184). Model thresholds indicate points of diminishing marginal return from increased training data set sample size measured by the number of sentences, with point estimates ranging from 439 sentences for RoBERTa_large to 527 sentences for GPT-2_large. Likewise, the threshold regression models indicate a diminishing marginal return for EPS with point estimates between 1.36 and 1.38. CONCLUSIONS: Relatively modest sample sizes can be used to fine-tune LLMs for NER tasks applied to biomedical text, and training data entity density should representatively approximate entity density in production data. Training data quality and a model architecture's intended use (text generation vs text processing or classification) may be as important as, or more important than, training data volume and model parameter size.
Affiliation(s)
- Zoltan P Majdik
  - Department of Communication, North Dakota State University, Fargo, ND, United States
- S Scott Graham
  - Department of Rhetoric & Writing, The University of Texas at Austin, Austin, TX, United States
- Jade C Shiva Edward
  - Department of Rhetoric & Writing, The University of Texas at Austin, Austin, TX, United States
- Sabrina N Rodriguez
  - Department of Neurology, The Dell Medical School, The University of Texas at Austin, Austin, TX, United States
- Martha S Karnes
  - Department of Rhetoric & Writing, University of Arkansas Little Rock, Little Rock, AR, United States
- Jared T Jensen
  - Department of Rhetoric & Writing, The University of Texas at Austin, Austin, TX, United States
- Joshua B Barbour
  - Department of Communication, The University of Illinois at Urbana-Champaign, Urbana, IL, United States
- Justin F Rousseau
  - Statistical Planning and Analysis Section, Department of Neurology, The University of Texas Southwestern Medical Center, Dallas, TX, United States
  - Peter O'Donnell Jr. Brain Institute, The University of Texas Southwestern Medical Center, Dallas, TX, United States
2. Graham SS, Sharma N, Karnes MS, Majdik ZP, Barbour JB, Rousseau JF. A Content Analysis of Self-Reported Financial Relationships in Biomedical Research. AJOB Empir Bioeth 2023; 14:91-98. [PMID: 36576202 PMCID: PMC10182247 DOI: 10.1080/23294515.2022.2160509] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/18/2023]
Abstract
INTRODUCTION: Financial conflicts of interest (fCOI) present well-documented risks to the integrity of biomedical research. However, few studies differentiate among fCOI types in their analyses, and those that do tend to use preexisting taxonomies for fCOI identification. Research on fCOI would benefit from an empirically derived taxonomy of self-reported fCOI and data on fCOI type and payor prevalence. METHODS: We conducted a content analysis of 6,165 individual self-reported relationships from COI statements distributed across 378 articles indexed with PubMed. Two coders used an iterative coding process to identify and classify individual fCOI types and payors. Inter-rater reliability was κ = 0.935 for fCOI type and κ = 0.884 for payor identification. RESULTS: Our analysis identified 21 fCOI types, 9 of which occurred at prevalences greater than 1%. These included research funding (24.8%), speaking fees (20.8%), consulting fees (18.8%), advisory relationships (11%), industry employment (7.6%), unspecified fees (4.8%), travel fees (3.2%), stock holdings (3.1%), and patent ownership (1%). Reported fCOI were held with 1,077 unique payors, 22 of which were present in more than 1% of financial relationships. The ten most common payors included Pfizer (4%), Novartis (3.9%), MSD (3.8%), Bristol Myers Squibb (3.2%), AstraZeneca (3.1%), GSK (3%), Boehringer Ingelheim (2.9%), Roche (2.8%), Eli Lilly (2.5%), and AbbVie (2.4%). CONCLUSIONS: These results provide novel multi-domain prevalence data on self-reported fCOI and payors in biomedical research. As such, they have the potential to catalyze future research that can assess the differential effects of various types of fCOI. Specifically, the data suggest that comparative analyses of the effects of different fCOI types are needed and that special attention should be paid to the diversity of payor types for research relationships.
Affiliation(s)
- S Scott Graham
  - Department of Rhetoric & Writing, Center for Health Communication, The University of Texas at Austin, Austin, TX, USA
- Nandini Sharma
  - Moody College of Communication, The University of Texas at Austin, Austin, TX, USA
- Martha S Karnes
  - Department of English, The University of Texas at Austin, Austin, TX, USA
- Zoltan P Majdik
  - Department of Communication, North Dakota State University, Fargo, ND, USA
- Joshua B Barbour
  - Moody College of Communication, The University of Texas at Austin, Austin, TX, USA
- Justin F Rousseau
  - Departments of Population Health and Neurology, The Dell Medical School at The University of Texas at Austin, Austin, TX, USA
3. Cannabis companies and the sponsorship of scientific research: A cross-sectional Canadian case study. PLoS One 2023; 18:e0280110. [PMID: 36626363 PMCID: PMC9831296 DOI: 10.1371/journal.pone.0280110] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/28/2022] [Accepted: 12/20/2022] [Indexed: 01/11/2023] Open
Abstract
Corporations across sectors engage in the conduct, sponsorship, and dissemination of scientific research. Industry sponsorship of research, however, is associated with research agendas, outcomes, and conclusions that are favourable to the sponsor. The legalization of cannabis in Canada provides a useful case study to understand the nature and extent of the nascent cannabis industry's involvement in the production of scientific evidence as well as broader impacts on equity-oriented research agendas. We conducted a cross-sectional, descriptive, meta-research study to describe the characteristics of research that reports funding from, or author conflicts of interest with, Canadian cannabis companies. From May to August 2021, we sampled licensed, prominent Canadian cannabis companies, identified their subsidiaries, and searched each company name in the PubMed conflict of interest statement search interface. Authors of included articles disclosed research support from, or conflicts of interest with, Canadian cannabis companies. We included 156 articles: 82% included at least one author with a conflict of interest and one-third reported study support from a Canadian cannabis company. More than half of the sampled articles were not cannabis focused; however, a cannabis company was listed among other biomedical companies in the author disclosure statement. For articles with a cannabis focus, prevalent topics included cannabis as a treatment for a range of conditions (15/72, 21%), particularly chronic pain (6/72, 8%); as a tool in harm reduction related to other substance use (10/72, 14%); product safety (10/72, 14%); and preclinical animal studies (6/72, 8%). Demographics were underreported in empirical studies with human participants, but most included adults (76/84, 90%) and, where reported, predominantly white (32/39, 82%) and male (49/83, 59%) participants. The cannabis company-funded studies included people who used drugs (37%) and people prescribed medical cannabis (22%). Canadian cannabis companies may be analogous to peer industries such as pharmaceuticals, alcohol, tobacco, and food in the following three ways: sponsoring research related to product development, expanding indications of use, and supporting key opinion leaders. Given the recent legalization of cannabis in Canada, there is ample opportunity to create a policy climate that can mitigate the harms of criminalization as well as impacts of the "funding effect" on research integrity, research agendas, and the evidence base available for decision-making, while promoting high-priority and equity-oriented independent research.
4. Graham SS, Karnes MS, Jensen JT, Sharma N, Barbour JB, Majdik ZP, Rousseau JF. Evidence for stratified conflicts of interest policies in research contexts: a methodological review. BMJ Open 2022; 12:e063501. [PMID: 36123074 PMCID: PMC9486359 DOI: 10.1136/bmjopen-2022-063501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open
Abstract
OBJECTIVES: The purpose of this study was to conduct a methodological review of research on the effects of conflicts of interest (COIs) in research contexts. DESIGN: Methodological review. DATA SOURCES: Ovid. ELIGIBILITY CRITERIA: Studies published between 1986 and 2021 conducting quantitative assessments of relationships between industry funding or COI and four target outcomes: positive study results, methodological biases, reporting quality and results-conclusions concordance. DATA EXTRACTION AND SYNTHESIS: We assessed key facets of study design: our primary analysis identified whether studies stratified industry funding or COI variables by magnitude (ie, number of COI or disbursement amount), type (employment, travel fees, speaking fees) or if they assessed dichotomous variables (ie, conflict present or absent). Secondary analyses focused on target outcomes and available effects measures. RESULTS: Of the 167 articles included in this study, a substantial majority (98.2%) evaluated the effects of industry sponsorship. None evaluated associations between funding magnitude and outcomes of interest. Seven studies (4.3%) stratified industry funding based on the mechanism of disbursement or funder relationship to product (manufacturer or competitor). A fifth of the articles (19.8%) assessed the effects of author COI on target outcomes. None evaluated COI magnitude, and three studies (9.1%) stratified COI by disbursement type and/or reporting practices. Participation of an industry-employed author showed the most consistent effect on favourability of results across studies. CONCLUSIONS: Substantial evidence demonstrates that industry funding and COI can bias biomedical research. Evidence-based policies are essential for mitigating the risks associated with COI. Although most policies stratify guidelines for managing COI, differentiating COIs based on the type of relationship or monetary value, this review shows that the available research has generally not been designed to assess the differential risks of COI types or magnitudes. Targeted research is necessary to establish an evidence base that can effectively inform policy to manage COI.
Affiliation(s)
- S Scott Graham
  - Department of Rhetoric & Writing, University of Texas at Austin, Austin, Texas, USA
- Martha S Karnes
  - Department of English, The University of Texas at Austin, Austin, Texas, USA
- Jared T Jensen
  - Department of Communication Studies, The University of Texas at Austin, Austin, TX, USA
- Nandini Sharma
  - Department of Communication Studies, The University of Texas at Austin, Austin, TX, USA
- Joshua B Barbour
  - Department of Communication Studies, The University of Texas at Austin, Austin, TX, USA
- Zoltan P Majdik
  - Department of Communication, North Dakota State University, Fargo, North Dakota, USA
- Justin F Rousseau
  - Department of Population Health and Neurology, The University of Texas at Austin Dell Medical School, Austin, Texas, USA
5. Graham SS, Majdik ZP, Barbour JB, Rousseau JF. Associations Between Aggregate NLP-Extracted Conflicts of Interest and Adverse Events by Drug Product. Stud Health Technol Inform 2022; 290:405-409. [PMID: 35673045 PMCID: PMC9186043 DOI: 10.3233/shti220106] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 04/27/2023]
Abstract
This study evaluates associations between aggregate conflicts of interest (COI) and drug safety. We used a machine-learning system to extract and classify COI from PubMed-indexed disclosure statements. Individual conflicts were classified as Type 1 (personal fees, travel, board memberships, and non-financial support), Type 2 (grants and research support), or Type 3 (stock ownership and industry employment). COI were aggregated by type and compared with adverse events by product. Type 1 COI are associated with a 1.1-1.8% increase in the number of adverse events, serious events, hospitalizations, and deaths. Type 2 COI are associated with a 1.7-2% decrease in adverse events across severity levels. Type 3 COI are associated with an approximately 1% increase in adverse events, serious events, and hospitalizations, but have no significant association with adverse events resulting in death. The findings suggest that COI policies might be adapted to account for the relative risks of different types of financial relationships.
Affiliation(s)
- S. Scott Graham
  - Department of Rhetoric & Writing, The University of Texas at Austin, Austin, TX, USA
- Zoltan P. Majdik
  - Department of Communication, North Dakota State University, Fargo, ND, USA
- Joshua B. Barbour
  - Moody College of Communication, The University of Texas at Austin, Austin, TX, USA
- Justin F. Rousseau
  - Departments of Neurology & Population Health, The Dell Medical School at The University of Texas at Austin, Austin, TX, USA
6. Nagendrababu V, Murray PE, Faggion CM, Dummer PMH. Promoting integrity in scholarly research and its publication: International Endodontic Journal policy on reporting conflicts of interest, funding and acknowledgments within manuscripts submitted for publication. Int Endod J 2021; 54:1969-1973. [PMID: 34633660 DOI: 10.1111/iej.13599] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Affiliation(s)
- Venkateshbabu Nagendrababu
  - Department of Preventive and Restorative Dentistry, College of Dental Medicine, University of Sharjah, Sharjah, UAE
- Clovis M Faggion
  - Department of Periodontology and Operative Dentistry, Faculty of Dentistry, University Hospital Münster, Münster, Germany
- Paul M H Dummer
  - School of Dentistry, College of Biomedical and Life Sciences, Cardiff University, Cardiff, UK