1
Mayo-Wilson E. Complete reporting of clinical trials requires more than journal articles. BMJ 2025;389:r494. PMID: 40228826; DOI: 10.1136/bmj.r494.
Affiliation(s)
- Evan Mayo-Wilson: Department of Epidemiology, University of North Carolina Gillings School of Global Public Health, Chapel Hill, NC, USA
2
Dudda L, Kormann E, Kozula M, DeVito NJ, Klebel T, Dewi APM, Spijker R, Stegeman I, Van den Eynden V, Ross-Hellauer T, Leeflang MMG. Open science interventions to improve reproducibility and replicability of research: a scoping review. R Soc Open Sci 2025;12:242057. PMID: 40206851; PMCID: PMC11979971; DOI: 10.1098/rsos.242057.
Abstract
Various open science practices have been proposed to improve the reproducibility and replicability of scientific research, but evidence that they are effective does not exist for every practice. We therefore conducted a scoping review of the literature on interventions to improve reproducibility. We systematically searched MEDLINE, Embase, Web of Science, PsycINFO, Scopus, and ERIC on 18 August 2023. Any study empirically evaluating the effectiveness of interventions aimed at improving the reproducibility or replicability of scientific methods and findings was included. We summarized the retrieved evidence narratively and in evidence gap maps. Of the 105 distinct studies we included, 15 directly measured the effect of an intervention on reproducibility or replicability, while the remainder addressed a proxy outcome that might be expected to increase reproducibility or replicability, such as data sharing, methods transparency, or preregistration. Thirty studies were non-comparative, and 27 used comparative but cross-sectional observational designs, precluding causal inference. Although the included studies investigated a range of interventions and addressed various outcomes, our findings indicate that the evidence base for interventions to improve the reproducibility of research remains remarkably limited in many respects.
Affiliation(s)
- Leonie Dudda: Department of Otorhinolaryngology and Head & Neck Surgery, University Medical Center Utrecht, Utrecht, The Netherlands; Brain Center, University Medical Center Utrecht, Utrecht, The Netherlands
- Eva Kormann: Open and Reproducible Research Group, Know Center GmbH, Graz, Austria
- Nicholas J. DeVito: Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK
- Thomas Klebel: Open and Reproducible Research Group, Know Center GmbH, Graz, Austria
- Ayu P. M. Dewi: Epidemiology and Data Science, Amsterdam UMC Locatie AMC, Amsterdam, Noord-Holland, The Netherlands
- René Spijker: Cochrane Netherlands, University Medical Center Utrecht, Utrecht, The Netherlands; Medical Library, Amsterdam UMC Locatie AMC, Amsterdam, Noord-Holland, The Netherlands
- Inge Stegeman: Department of Otorhinolaryngology and Head & Neck Surgery, University Medical Center Utrecht, Utrecht, The Netherlands; Brain Center, University Medical Center Utrecht, Utrecht, The Netherlands
- Mariska M. G. Leeflang: Epidemiology and Data Science, Amsterdam UMC Locatie AMC, Amsterdam, Noord-Holland, The Netherlands
3
Wrightson JG, Blazey P, Moher D, Khan KM, Ardern CL. GPT for RCTs? Using AI to determine adherence to clinical trial reporting guidelines. BMJ Open 2025;15:e088735. PMID: 40107689; PMCID: PMC11927406; DOI: 10.1136/bmjopen-2024-088735.
Abstract
OBJECTIVES Adherence to established reporting guidelines can improve clinical trial reporting standards, but attempts to improve adherence have produced mixed results. This exploratory study aimed to determine how accurate a large language model generative artificial intelligence system (AI-LLM) was at determining reporting guideline compliance in a sample of sports medicine clinical trial reports. DESIGN This study was an exploratory retrospective data analysis. The OpenAI GPT-4 and Meta Llama 2 AI-LLMs were evaluated for their ability to determine reporting guideline adherence in a sample of sports medicine and exercise science clinical trial reports. SETTING Academic research institution. PARTICIPANTS The study sample included 113 published sports medicine and exercise science clinical trial papers. For each paper, the GPT-4 Turbo and Llama 2 70B models were prompted to answer a series of nine reporting guideline questions about the text of the article. The GPT-4 Vision model was prompted to answer two additional reporting guideline questions about the participant flow diagram in a subset of articles. The dataset was randomly split (80/20) into TRAIN and TEST datasets. Hyperparameter tuning and fine-tuning were performed using the TRAIN dataset. The Llama 2 model was fine-tuned using the data from the GPT-4 Turbo analysis of the TRAIN dataset. PRIMARY AND SECONDARY OUTCOME MEASURES The primary outcome was the F1-score, a measure of model performance on the TEST dataset. The secondary outcome was the model's classification accuracy (%). RESULTS Across all questions about the article text, the GPT-4 Turbo AI-LLM demonstrated acceptable performance (F1-score = 0.89; accuracy (95% CI) = 90% (85% to 94%)). Accuracy was >80% for all reporting guideline questions. The Llama 2 model's accuracy was initially poor (F1-score = 0.63; accuracy (95% CI) = 64% (57% to 71%)) and improved with fine-tuning (F1-score = 0.84; accuracy (95% CI) = 83% (77% to 88%)). The GPT-4 Vision model accurately identified all participant flow diagrams (accuracy (95% CI) = 100% (89% to 100%)) but was less accurate at identifying when details were missing from the flow diagram (accuracy (95% CI) = 57% (39% to 73%)). CONCLUSIONS Both the GPT-4 and fine-tuned Llama 2 AI-LLMs showed promise as tools for assessing reporting guideline compliance. Next steps should include developing an efficient, open-source AI-LLM and exploring methods to improve model accuracy.
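The evaluation loop described here, prompting a model with one question per guideline item and scoring its binary answers against human labels with F1 and accuracy, can be sketched as follows. This is a minimal illustration rather than the authors' code: the question texts and the query_llm helper are hypothetical placeholders, and only the scikit-learn metric calls are assumed real APIs.

```python
# Minimal sketch of the evaluation loop described above. GUIDELINE_QUESTIONS
# and query_llm are hypothetical; only the scikit-learn metrics are real APIs.
from sklearn.metrics import accuracy_score, f1_score

GUIDELINE_QUESTIONS = [  # illustrative CONSORT-style items, not the study's exact nine
    "Does the article report how the random allocation sequence was generated?",
    "Does the article report whether outcome assessors were blinded?",
]

def query_llm(article_text: str, question: str) -> bool:
    """Placeholder for a call to an LLM (e.g., GPT-4 Turbo or Llama 2 70B).
    Should return True if the model judges the item adequately reported."""
    raise NotImplementedError("wire up an LLM client here")

def evaluate(articles: list[str], human_labels: list[list[bool]]) -> tuple[float, float]:
    predictions = [[query_llm(text, q) for q in GUIDELINE_QUESTIONS] for text in articles]
    y_true = [lab for row in human_labels for lab in row]  # flatten: one label per item
    y_pred = [lab for row in predictions for lab in row]
    return f1_score(y_true, y_pred), accuracy_score(y_true, y_pred)
```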
Affiliation(s)
- James G Wrightson: Department of Physical Therapy, The University of British Columbia Faculty of Medicine, Vancouver, British Columbia, Canada
- Paul Blazey: Centre for Aging SMART, The University of British Columbia, Vancouver, British Columbia, Canada
- David Moher: Ottawa Methods Centre, Ottawa Hospital Research Institute, Ottawa, Ontario, Canada
- Karim M Khan: Department of Family Practice, The University of British Columbia, Vancouver, British Columbia, Canada
- Clare L Ardern: Department of Physical Therapy, The University of British Columbia Faculty of Medicine, Vancouver, British Columbia, Canada; Centre for Aging SMART, The University of British Columbia, Vancouver, British Columbia, Canada; Sport and Exercise Medicine Research Centre, La Trobe University, Melbourne, Victoria, Australia
4
Struthers C, Harwood J, de Beyer JA, Logullo P, Collins GS. There is no reliable evidence that providing authors with customized article templates including items from reporting guidelines improves completeness of reporting: the GoodReports randomized trial (GRReaT). BMC Med Res Methodol 2025;25:71. PMID: 40087548; PMCID: PMC11907807; DOI: 10.1186/s12874-025-02518-0.
Abstract
BACKGROUND Although medical journals endorse reporting guidelines, authors often struggle to find and use the right one for their study type and topic. The UK EQUATOR Centre developed the GoodReports website to direct authors to appropriate guidance. Pilot data suggested that authors did not improve their manuscripts when GoodReports.org advised them to use a particular reporting guideline at the journal submission stage. User feedback suggested that the checklist format of most reporting guidelines does not encourage use during manuscript writing. We tested whether providing customized reporting guidance within writing templates for use throughout the writing process resulted in clearer and more complete reporting than only giving advice on which reporting guideline to use. DESIGN AND METHODS GRReaT was a two-group parallel 1:1 randomized trial with a target sample size of 206. Participants were lead authors at an early stage of writing up a health-related study. Eligible study designs were cohort, cross-sectional, or case-control study, randomized trial, and systematic review. After randomization, the intervention group received an article template including items from the appropriate reporting guideline and links to explanations and examples. The control group received a reporting guideline recommendation and general advice on reporting. Participants sent their completed manuscripts to the GRReaT team before submission for publication, and the manuscripts were assessed for completeness of each item in the title, methods, and results sections of the corresponding reporting guideline. The primary outcome was reporting completeness against the corresponding reporting guideline. Participants were not blinded to allocation. Assessors were blinded to group allocation. As a recruitment incentive, all participants received a feedback report identifying missing or inadequately reported items in these three sections. RESULTS Between 9 June 2021 and 30 June 2023, we randomized 130 participants, 65 to the intervention group and 65 to the control group. We present findings from the assessment of reporting completeness for the 37 completed manuscripts we received, 18 in the intervention group and 19 in the control group. The mean (standard deviation) proportion of completely reported items from the title, methods, and results sections of the manuscripts (primary outcome) was 0.57 (0.18) in the intervention group and 0.50 (0.17) in the control group. The mean difference between the two groups was 0.069 (95% CI -0.046 to 0.184; p = 0.231). In the sensitivity analysis, when partially reported items were counted as completely reported, the mean (standard deviation) proportion of completely reported items was 0.75 (0.15) in the intervention group and 0.71 (0.11) in the control group. The mean difference between the two groups was 0.036 (95% CI -0.127 to 0.055; p = 0.423). CONCLUSION As the dropout rate was higher than expected, we did not reach the recruitment target, and the difference between groups was not statistically significant. We therefore found no evidence that providing authors with customized article templates that include items from reporting guidelines increases reporting completeness. We discuss the challenges faced when conducting the trial and suggest how future research testing innovative ways of improving reporting could be designed to improve recruitment and reduce dropout.
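For readers who want to see where interval estimates like the primary outcome's 0.069 (95% CI -0.046 to 0.184) come from, the sketch below computes an unadjusted Welch comparison from the summary statistics in the abstract (means, SDs, group sizes). The published estimate may be adjusted, so this result is close to, but not identical with, the reported one.

```python
# Unadjusted Welch two-sample comparison from the summary statistics reported
# in the abstract; illustrative only, since the trial's own analysis may have
# been adjusted.
import math
from scipy import stats

m1, sd1, n1 = 0.57, 0.18, 18  # intervention group (customized template)
m2, sd2, n2 = 0.50, 0.17, 19  # control group (guideline recommendation only)

diff = m1 - m2
se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
# Welch-Satterthwaite degrees of freedom
df = (sd1**2 / n1 + sd2**2 / n2) ** 2 / (
    (sd1**2 / n1) ** 2 / (n1 - 1) + (sd2**2 / n2) ** 2 / (n2 - 1)
)
t_crit = stats.t.ppf(0.975, df)
print(f"difference = {diff:.3f}, "
      f"95% CI {diff - t_crit * se:.3f} to {diff + t_crit * se:.3f}")
```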
Affiliation(s)
- Caroline Struthers: UK EQUATOR Centre, Centre for Statistics in Medicine, NDORMS, University of Oxford, Oxford, UK
- James Harwood: UK EQUATOR Centre, Centre for Statistics in Medicine, NDORMS, University of Oxford, Oxford, UK
- Jennifer Anne de Beyer: UK EQUATOR Centre, Centre for Statistics in Medicine, NDORMS, University of Oxford, Oxford, UK
- Patricia Logullo: UK EQUATOR Centre, Centre for Statistics in Medicine, NDORMS, University of Oxford, Oxford, UK
- Gary S Collins: UK EQUATOR Centre, Centre for Statistics in Medicine, NDORMS, University of Oxford, Oxford, UK
5
Aczel B, Barwich AS, Diekman AB, Fishbach A, Goldstone RL, Gomez P, Gundersen OE, von Hippel PT, Holcombe AO, Lewandowsky S, Nozari N, Pestilli F, Ioannidis JPA. The present and future of peer review: Ideas, interventions, and evidence. Proc Natl Acad Sci U S A 2025;122:e2401232121. PMID: 39869808; PMCID: PMC11804526; DOI: 10.1073/pnas.2401232121.
Abstract
What is wrong with the peer review system? Is peer review sustainable? Useful? What other models exist? These are central yet contentious questions in today's academic discourse. This perspective critically discusses alternative models and revisions to the peer review system. The authors highlight possible changes to the peer review system, with the goal of fostering further dialog among the main stakeholders, including producers and consumers of scientific research. Neither our list of identified issues with the peer review system nor our discussed resolutions are complete. A point of agreement is that fair assessment and efficient change would require more comprehensive and rigorous data on the various aspects of the peer review system.
Affiliation(s)
- Balazs Aczel: Department of Affective Psychology, Institute of Psychology, Eotvos Lorand University, Budapest 1063, Hungary
- Ann-Sophie Barwich: Department of History and Philosophy of Science and Medicine, Indiana University, Bloomington, IN 47405; Cognitive Science Program, Indiana University, Bloomington, IN 47405
- Amanda B. Diekman: Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405
- Ayelet Fishbach: Booth School of Business, University of Chicago, Chicago, IL 60637
- Robert L. Goldstone: Cognitive Science Program, Indiana University, Bloomington, IN 47405; Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405
- Pablo Gomez: Psychology Department, Skidmore College, Saratoga Springs, NY 12866
- Odd Erik Gundersen: Department of Computer Science, Norwegian University of Science and Technology, Trondheim 7491, Norway; Aneo AI Research, Trondheim 7031, Norway
- Alex O. Holcombe: School of Psychology, University of Sydney, Sydney, NSW 2006, Australia
- Stephan Lewandowsky: School of Psychological Science, University of Bristol, Bristol BS8 1TU, United Kingdom; Department of Psychology, University of Potsdam, Potsdam 14469, Germany
- Nazbanou Nozari: Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405
- Franco Pestilli: Department of Psychology, College of Liberal Arts, The University of Texas, Austin, TX 78712; Department of Neuroscience, College of Natural Sciences, The University of Texas, Austin, TX 78712
- John P. A. Ioannidis: Meta-Research Innovation Center at Stanford, Stanford University, Stanford, CA 94305; Department of Medicine, Stanford University, Stanford, CA 94305; Department of Epidemiology and Population Health, Stanford University, Stanford, CA 94305; Department of Biomedical Data Science, Stanford, CA 94305
6
Murigu A, Wong KHF, Mercer RT, Hinchliffe RJ, Twine CP. Reporting and Methodological Quality of Systematic Reviews Underpinning Clinical Practice Guidelines for Vascular Surgery: A Systematic Review. Eur J Vasc Endovasc Surg 2024:S1078-5884(24)00966-3. PMID: 39547389; DOI: 10.1016/j.ejvs.2024.11.010.
Abstract
OBJECTIVE Clinical practice guideline recommendations are often informed by systematic reviews. This review aimed to appraise the reporting and methodological quality of systematic reviews informing clinical practice recommendations relevant to vascular surgery. DATA SOURCES MEDLINE and Embase. METHODS MEDLINE and Embase were searched from 1 January 2021 to 5 May 2023 for clinical practice guidelines relevant to vascular surgery. Guidelines were then screened for systematic reviews informing recommendations. The reporting and methodological quality of these systematic reviews were assessed using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 statement and the Assessment of Multiple Systematic Reviews 2 (AMSTAR 2) 2017 tool. Pearson correlation and multiple regression analyses were performed to determine associations between these scores and extracted study characteristics. RESULTS Eleven clinical practice guidelines were obtained, containing 1,783 references informing guideline recommendations. From these, 215 systematic reviews were included for synthesis. PRISMA item completeness ranged from 14% to 100%, with a mean of 63% across reviews. AMSTAR 2 item completeness ranged from 2% to 95%, with a mean of 50%. Pearson correlation highlighted a statistically significant association between a review's PRISMA and AMSTAR 2 scores (r = 0.85, p < .001). A more recent publication year was associated with a statistically significant increase in both scores (PRISMA coefficient 1.28, p < .001; AMSTAR 2 coefficient 1.31, p < .001). Similarly, the presence of funding in a systematic review was statistically significantly associated with an increase in both PRISMA and AMSTAR 2 scores (coefficients 4.93, p = .024, and 6.07, p = .019, respectively). CONCLUSION Systematic reviews informing clinical practice guidelines relevant to vascular surgery were of moderate quality at best. Organisations producing clinical practice guidelines should consider funding systematic reviews to improve the quality of their recommendations.
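The two analyses named in the results, a Pearson correlation between PRISMA and AMSTAR 2 scores and a regression of scores on study characteristics, follow a standard pattern; the sketch below runs them on synthetic data with scipy and statsmodels. All data-generating values are invented, and the predictor set is reduced to the two characteristics highlighted in the abstract (publication year and funding).

```python
# Synthetic-data sketch of the analyses named above: Pearson correlation
# between PRISMA and AMSTAR 2 scores, and regression of the PRISMA score on
# publication year and funding status. All values are invented.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n = 215  # number of systematic reviews in the synthesis
year = rng.integers(2000, 2023, n)
funded = rng.integers(0, 2, n)
prisma = 1.2 * (year - 2000) + 5 * funded + rng.normal(40, 10, n)
amstar = 0.8 * prisma + rng.normal(0, 8, n)
df = pd.DataFrame({"prisma": prisma, "amstar": amstar, "year": year, "funded": funded})

r, p = pearsonr(df["prisma"], df["amstar"])
print(f"Pearson r = {r:.2f} (p = {p:.3g})")
print(smf.ols("prisma ~ year + funded", data=df).fit().params)
```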
Affiliation(s)
- Alex Murigu: Bristol Medical School, University of Bristol, Bristol, UK
- Kitty H F Wong: Bristol Medical School, University of Bristol, Bristol, UK; North Bristol NHS Trust, Bristol, UK
- Ross T Mercer: University Hospitals Bristol and Weston NHS Foundation Trust, Bristol, UK
- Robert J Hinchliffe: North Bristol NHS Trust, Bristol, UK; University Hospitals Bristol and Weston NHS Foundation Trust, Bristol, UK
- Christopher P Twine: North Bristol NHS Trust, Bristol, UK; University Hospitals Bristol and Weston NHS Foundation Trust, Bristol, UK
7
Shaikh H, Lyle ANJ, Oslin E, Gray MM, Weiss EM. Eligible Infants Included in Neonatal Clinical Trials and Reasons for Noninclusion: A Systematic Review. JAMA Netw Open 2024;7:e2441372. PMID: 39453652; PMCID: PMC11581680; DOI: 10.1001/jamanetworkopen.2024.41372.
Abstract
Importance Results of clinical trials can only represent included participants, and many neonatal trials fail due to insufficient participation. Infants not included in research may differ from those included in meaningful ways, biasing the sample and limiting the generalizability of findings. Objective To describe the proportion of eligible infants included in neonatal clinical trials and the reasons for noninclusion. Evidence Review A systematic search of Cochrane CENTRAL was performed by retrieving articles meeting the following inclusion criteria: full-length, peer-reviewed articles describing clinical trial results in at least 20 human infants from US neonatal intensive care units, published in English, and added to Cochrane CENTRAL between 2017 and 2022. Retrieved articles were screened for inclusion by 2 independent researchers. Findings In total, 120 articles met the inclusion criteria, and 91 of these (75.8%) reported the number of infants eligible for participation, which totaled 26,854 in aggregate. Drawing from these, an aggregate of 11,924 eligible infants (44.4%) were included in reported results. Among all eligible infants, most reasons for noninclusion in results were classified as modifiable or potentially modifiable by the research team. Parents declining to participate (8,004 infants [29.8%]) or never being approached (2,507 infants [9.3%]) were the 2 predominant reasons for noninclusion. Other modifiable reasons included factors related to study logistics, such as failure to appropriately collect data on enrolled infants (859 of 26,854 infants [3.2%]) and other reasons (1,907 of 26,854 infants [7.1%]), such as loss to follow-up or eligible participants who were unaccounted for. Nonmodifiable reasons, including clinical change or death, accounted for a small proportion of eligible infants who were not included (858 of 26,854 infants [3.2%]). Conclusions and Relevance This systematic review of reporting on eligible infants included and not included in neonatal clinical trials highlights the need for improved documentation of the flow of eligible infants through neonatal clinical trials and may also inform recruitment expectations for trialists designing future protocols. Improved adherence to standardized reporting may clarify which potential participants are being missed, improving understanding of the generalizability of research findings. Furthermore, these findings suggest that future work to understand why parents decline to participate in neonatal research trials and why some are never approached about research may help increase overall participation.
Affiliation(s)
- Henna Shaikh: Department of Pediatrics, University of Washington School of Medicine, Seattle
- Allison N J Lyle: Department of Pediatrics, University of Louisville School of Medicine, Norton Children's Medical Group-Neonatology, Louisville, Kentucky
- Ellie Oslin: Department of Pediatrics, University of Washington School of Medicine, Seattle; Department of Pediatrics, University of Louisville School of Medicine, Norton Children's Medical Group-Neonatology, Louisville, Kentucky
- Megan M Gray: Department of Pediatrics, University of Washington School of Medicine, Seattle
- Elliott Mark Weiss: Department of Pediatrics, University of Washington School of Medicine, Seattle; Treuman Katz Center for Pediatric Bioethics & Palliative Care, Seattle Children's Research Institute, Seattle, Washington
8
Blanco D, Cadellans-Arróniz A, Donadio MVF, Sharp MK, Casals M, Edouard P. Using reporting guidelines in sports and exercise medicine research: why and how to raise the bar? Br J Sports Med 2024;58:891-893. PMID: 38844077; DOI: 10.1136/bjsports-2024-108101.
Affiliation(s)
- David Blanco: Department of Physiotherapy, Universitat Internacional de Catalunya, Barcelona, Spain
- Márcio Vinícius Fagundes Donadio: Department of Physiotherapy, Universitat Internacional de Catalunya, Barcelona, Spain; Pontificia Universidade Catolica do Rio Grande do Sul, Porto Alegre, Brazil
- Melissa K Sharp: Department of Public Health and Epidemiology, RCSI University of Medicine and Health Sciences, Dublin, Ireland
- Martí Casals: National Institute of Physical Education of Catalonia (INEFC), University of Barcelona, Barcelona, Spain; Sport and Physical Activity Studies Centre (CEEAF), Faculty of Medicine, University of Vic-Central University of Catalonia (UVic-UCC), Barcelona, Spain; Sport Performance Analysis Research Group, University of Vic-Central University of Catalonia (UVic-UCC), Barcelona, Spain
- Pascal Edouard: Inter-university Laboratory of Human Movement Biology (EA 7424), Université Jean Monnet, Lyon 1, Université Savoie Mont-Blanc, Saint-Etienne, France; Department of Clinical and Exercise Physiology, Sports Medicine Unit, University Hospital of Saint-Etienne, Faculty of Medicine, Saint-Etienne, France
9
Blanco D, Donadio MVF, Cadellans-Arróniz A. Enhancing reporting through structure: a before and after study on the effectiveness of SPIRIT-based templates to improve the completeness of reporting of randomized controlled trial protocols. Res Integr Peer Rev 2024;9:6. PMID: 38816752; PMCID: PMC11140857; DOI: 10.1186/s41073-024-00147-7.
Abstract
BACKGROUND Despite improvements in the completeness of reporting of randomized trial protocols after the publication of the Standard Protocol Items: Recommendations for Interventional Trials (SPIRIT) guidelines, many items remain poorly reported. This study aimed to assess the effectiveness of using SPIRIT-tailored templates for trial protocols to improve the completeness of reporting of the protocols that master's students write as part of their master's theses. METHODS A before-and-after experimental study performed in the University Master's Degree in Orthopaedic Manual Physiotherapy at the Universitat Internacional de Catalunya (Barcelona, Spain). Students in the post-intervention period were instructed to use a trial protocol template tailored to SPIRIT, whereas students in the pre-intervention period did not use the template. PRIMARY OUTCOME Difference between the pre- and post-intervention periods in the mean number of adequately reported items (0-10 scale). The outcomes were evaluated independently and in duplicate by two blinded assessors. Students and their supervisors were not aware that they were part of a research project. For the statistical analysis, we used a generalized linear regression model (dependent variable: number of adequately reported items in the protocol; independent variables: intervention period, call, language). RESULTS Thirty-four trial protocols were included (17 pre-intervention; 17 post-intervention). Protocols produced during the post-intervention period (mean: 8.24; SD: 1.52) were more completely reported than those produced during the pre-intervention period (mean: 6.35; SD: 1.80); adjusted difference: 1.79 (95% CI: 0.58 to 3.00). CONCLUSIONS SPIRIT-based templates could be used to improve the completeness of reporting of randomized trial protocols.
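A minimal sketch of the described analysis on synthetic data follows. The abstract specifies a generalized linear regression with the number of adequately reported items (out of 10) as the dependent variable and period, call, and language as predictors, but not the model family; a binomial GLM treating each protocol as 10 item-level successes and failures is one natural choice and is assumed here, not taken from the paper.

```python
# Illustrative GLM fit on synthetic data (the model family is an assumption;
# the paper only says "generalized linear regression model").
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 34
period = np.repeat([0, 1], n // 2)              # 0 = pre-intervention, 1 = post (template)
call = rng.integers(0, 2, n)
language = rng.integers(0, 2, n)
items = rng.binomial(10, 0.64 + 0.18 * period)  # roughly 6.4 vs 8.2 items out of 10

df = pd.DataFrame({"items": items, "misses": 10 - items,
                   "period": period, "call": call, "language": language})
endog = df[["items", "misses"]]                 # successes, failures per protocol
exog = sm.add_constant(df[["period", "call", "language"]])
print(sm.GLM(endog, exog, family=sm.families.Binomial()).fit().summary())
```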
Affiliation(s)
- David Blanco: Department of Physiotherapy, Universitat Internacional de Catalunya, C/Josep Trueta S/N, Sant Cugat del Vallès, 08195, Barcelona, Spain
- Márcio Vinícius Fagundes Donadio: Department of Physiotherapy, Universitat Internacional de Catalunya, C/Josep Trueta S/N, Sant Cugat del Vallès, 08195, Barcelona, Spain; Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS), Porto Alegre, Brazil
- Aïda Cadellans-Arróniz: Department of Physiotherapy, Universitat Internacional de Catalunya, C/Josep Trueta S/N, Sant Cugat del Vallès, 08195, Barcelona, Spain
10
El Emam K, Leung TI, Malin B, Klement W, Eysenbach G. Consolidated Reporting Guidelines for Prognostic and Diagnostic Machine Learning Models (CREMLS). J Med Internet Res 2024;26:e52508. PMID: 38696776; PMCID: PMC11107416; DOI: 10.2196/52508.
Abstract
The number of papers presenting machine learning (ML) models that are being submitted to and published in the Journal of Medical Internet Research and other JMIR Publications journals has steadily increased. Editors and peer reviewers involved in the review process for such manuscripts often go through multiple review cycles to enhance the quality and completeness of reporting. The use of reporting guidelines or checklists can help ensure consistency in the quality of submitted (and published) scientific manuscripts and, for example, avoid instances of missing information. In this Editorial, the editors of JMIR Publications journals discuss the general JMIR Publications policy regarding authors' application of reporting guidelines and specifically focus on the reporting of ML studies in JMIR Publications journals, using the Consolidated Reporting of Machine Learning Studies (CREMLS) guidelines, with an example of how authors and other journals could use the CREMLS checklist to ensure transparency and rigor in reporting.
Affiliation(s)
- Khaled El Emam: School of Epidemiology and Public Health, University of Ottawa, Ottawa, ON, Canada; Children's Hospital of Eastern Ontario Research Institute, Ottawa, ON, Canada
- Tiffany I Leung: JMIR Publications, Inc, Toronto, ON, Canada; Department of Internal Medicine (adjunct), Southern Illinois University School of Medicine, Springfield, IL, United States
- Bradley Malin: Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, United States
- William Klement: Children's Hospital of Eastern Ontario Research Institute, Ottawa, ON, Canada
- Gunther Eysenbach: JMIR Publications, Inc, Toronto, ON, Canada; School of Health Information Science, University of Victoria, Victoria, BC, Canada
11
Wang P, Wolfram D, Gilbert E. Endorsements of five reporting guidelines for biomedical research by journals of prominent publishers. PLoS One 2024;19:e0299806. PMID: 38421981; PMCID: PMC10903802; DOI: 10.1371/journal.pone.0299806.
Abstract
Biomedical research reporting guidelines provide a framework by which journal editors and the researchers who conduct studies can ensure that the reported research is both complete and transparent. With more than 16 different guidelines for the 11 major study types of medical and health research, authors need to be familiar with journal reporting standards. To assess the current endorsements of reporting guidelines for biomedical and health research, this study examined the instructions for authors (IFAs) of 559 biomedical journals from 11 prominent publishers that publish original research or systematic reviews/meta-analyses. Data from these sources were cleaned, restructured, and analyzed using a database and a text-mining tool. Each journal's instructions or information for authors were examined to code whether any of five prominent reporting guidelines were mentioned and how adherence to the guideline was to be demonstrated. Seventeen journals had published the reporting guidelines. Four of the five reporting guidelines listed journals as endorsers. For journals with open peer review reports, a sample of journals and peer reviews was analyzed for mention of adherence to reporting guidelines. The endorsement of reporting guidelines by publishers and their associated journals is inconsistent for some publishers, with only a small number of journals endorsing relevant guidelines. Based on the analysis of open peer reviews, there is evidence that some reviewers check adherence to the endorsed reporting guidelines. Currently, there is no universal endorsement of reporting guidelines by publishers, nor a standard way of demonstrating adherence to guidelines. Journals may not directly inform authors of their guideline endorsements, making it more difficult for authors to adhere to endorsed guidelines. Suggestions derived from the findings are provided for authors, journals, and reporting guideline developers to increase the adequate use of endorsed reporting guidelines.
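The screening step, checking each journal's instructions for authors for mentions of prominent reporting guidelines, reduces to a whole-word text search. The sketch below shows one way to do it; the five acronyms are illustrative stand-ins, since the abstract does not list which five guidelines were studied.

```python
# Whole-word search of instructions-for-authors (IFA) text for guideline
# acronyms. The five names below are illustrative placeholders.
import re

GUIDELINES = ["CONSORT", "PRISMA", "STROBE", "STARD", "CARE"]

def guideline_mentions(ifa_text: str) -> dict[str, bool]:
    """Return which guideline acronyms appear as whole words (case-sensitive,
    so 'CARE' does not match the common word 'care')."""
    return {g: bool(re.search(rf"\b{g}\b", ifa_text)) for g in GUIDELINES}

sample = "Reports of randomized trials must follow the CONSORT checklist."
print(guideline_mentions(sample))
```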
Affiliation(s)
- Peiling Wang: School of Information Sciences, University of Tennessee-Knoxville, Knoxville, Tennessee, United States of America
- Dietmar Wolfram: School of Information Studies, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin, United States of America
- Emrie Gilbert: School of Information Sciences, University of Tennessee-Knoxville, Knoxville, Tennessee, United States of America
12
Rehlicki D, Plenkovic M, Delac L, Pieper D, Marušić A, Puljak L. Author instructions in biomedical journals infrequently address systematic review reporting and methodology: a cross-sectional study. J Clin Epidemiol 2024;166:111218. PMID: 37993073; DOI: 10.1016/j.jclinepi.2023.11.008.
Abstract
OBJECTIVES We aimed to analyze how instructions for authors in journals indexed in MEDLINE address systematic review (SR) reporting and methodology. STUDY DESIGN AND SETTING We analyzed instructions for authors in 20% of MEDLINE-indexed journals listed in the online catalog of the National Library of Medicine on July 27, 2021. We extracted data only from instructions published in English, recording whether they addressed the reporting and methodology of SRs. RESULTS Of the 1,237 journals whose instructions we analyzed, 45% (n = 560) mentioned SRs. SR registration was mentioned in 104/1,237 (8%) of the instructions. Guidelines for reporting SR protocols were found in 155/1,237 (13%) of the instructions. Guidelines for reporting SRs were explicitly mentioned in 461/1,237 (37%), whereas the EQUATOR (Enhancing the Quality and Transparency of Health Research) network was referred to in 474/1,237 (38%) of the instructions. Less than 2% (n = 20) of the instructions mentioned risk of bias and meta-analyses; less than 1% mentioned certainty of evidence assessment, methodological expectations, updating of SRs, overviews of SRs, or scoping reviews. CONCLUSION Journals indexed in MEDLINE rarely provide instructions for authors regarding SR reporting and methodology. Such instructions could potentially raise authors' awareness and improve how SRs are prepared and reported.
Affiliation(s)
- Daniel Rehlicki: Centre for Evidence-Based Medicine and Health Care, Catholic University of Croatia, Zagreb, Croatia
- Mia Plenkovic: Department of Psychiatry, University of Split School of Medicine, Split, Croatia
- Ljerka Delac: Division of Neurogeriatrics, Department of Neurobiology, Care Sciences and Society, Karolinska Institutet, Solna, Sweden
- Dawid Pieper: Faculty of Health Sciences Brandenburg, Brandenburg Medical School Theodor Fontane, Institute for Health Services and Health System Research, Rüdersdorf, Germany; Centre for Health Services Research, Brandenburg Medical School Theodor Fontane, Rüdersdorf, Germany
- Ana Marušić: Department of Research in Biomedicine and Health, Centre for Evidence-based Medicine, University of Split School of Medicine, Split, Croatia
- Livia Puljak: Centre for Evidence-Based Medicine and Health Care, Catholic University of Croatia, Zagreb, Croatia
13
Hesselberg JO, Dalsbø TK, Stromme H, Svege I, Fretheim A. Reviewer training for improving grant and journal peer review. Cochrane Database Syst Rev 2023;11:MR000056. PMID: 38014743; PMCID: PMC10683016; DOI: 10.1002/14651858.mr000056.pub2.
Abstract
BACKGROUND Funders and scientific journals use peer review to decide which projects to fund or articles to publish. Reviewer training is an intervention to improve the quality of peer review. However, studies on the effects of such training yield inconsistent results, and there are no up-to-date systematic reviews addressing this question. OBJECTIVES To evaluate the effect of peer reviewer training on the quality of grant and journal peer review. SEARCH METHODS We used standard, extensive Cochrane search methods. The latest search date was 27 April 2022. SELECTION CRITERIA We included randomized controlled trials (RCTs; including cluster-RCTs) that evaluated peer review with training interventions versus usual processes, no training interventions, or other interventions to improve the quality of peer review. DATA COLLECTION AND ANALYSIS We used standard Cochrane methods. Our primary outcomes were 1. completeness of reporting and 2. peer review detection of errors. Our secondary outcomes were 1. bibliometric scores, 2. stakeholders' assessment of peer review quality, 3. inter-reviewer agreement, 4. process-centred outcomes, 5. peer reviewer satisfaction, and 6. completion rate and speed of funded projects. We used the first version of the Cochrane risk of bias tool to assess the risk of bias, and we used GRADE to assess the certainty of evidence. MAIN RESULTS We included 10 RCTs with a total of 1,213 units of analysis. The unit of analysis was the individual reviewer in seven studies (722 reviewers in total) and the reviewed manuscript in three studies (491 manuscripts in total). In eight RCTs, participants were journal peer reviewers. In two studies, the participants were grant peer reviewers. The training interventions can be broadly divided into dialogue-based interventions (interactive workshop, face-to-face training, mentoring) and one-way communication (written information, video course, checklist, written feedback). Most studies were small. We found moderate-certainty evidence that emails reminding peer reviewers to check items of reporting checklists, compared with standard journal practice, have little or no effect on the completeness of reporting, measured as the proportion of items (from 0.00 to 1.00) that were adequately reported (mean difference (MD) 0.02, 95% confidence interval (CI) -0.02 to 0.06; 2 RCTs, 421 manuscripts). There was low-certainty evidence that reviewer training, compared with standard journal practice, slightly improves peer reviewer ability to detect errors (MD 0.55, 95% CI 0.20 to 0.90; 1 RCT, 418 reviewers). We found low-certainty evidence that reviewer training, compared with standard journal practice, has little or no effect on stakeholders' assessment of review quality in journal peer review (standardized mean difference (SMD) 0.13 standard deviations (SDs), 95% CI -0.07 to 0.33; 1 RCT, 418 reviewers), or on the change in stakeholders' assessment of review quality in journal peer review (SMD -0.15 SDs, 95% CI -0.39 to 0.10; 5 RCTs, 258 reviewers). We found very low-certainty evidence that a video course, compared with no video course, has little or no effect on inter-reviewer agreement in grant peer review (MD 0.14 points, 95% CI -0.07 to 0.35; 1 RCT, 75 reviewers). There was low-certainty evidence that structured individual feedback on scoring, compared with general information on scoring, has little or no effect on the change in inter-reviewer agreement in grant peer review (MD 0.18 points, 95% CI -0.14 to 0.50; 1 RCT, 41 reviewers).
AUTHORS' CONCLUSIONS Evidence from 10 RCTs suggests that training peer reviewers may lead to little or no improvement in the quality of peer review. There is a need for studies with more participants and a broader spectrum of valid and reliable outcome measures. Studies evaluating stakeholders' assessments of the quality of peer review should ensure that these instruments have sufficient levels of validity and reliability.
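Summary estimates such as "MD 0.02, 95% CI -0.02 to 0.06; 2 RCTs" come from pooling per-study effects. The sketch below shows generic inverse-variance fixed-effect pooling of mean differences on made-up inputs; it illustrates the arithmetic only and is not the review's actual computation, which followed standard Cochrane methods.

```python
# Generic inverse-variance fixed-effect pooling of mean differences.
# The input numbers are hypothetical.
import math

studies = [(0.01, 0.030),   # (mean difference, standard error) per RCT
           (0.04, 0.035)]

weights = [1 / se**2 for _, se in studies]
pooled = sum(w * md for (md, _), w in zip(studies, weights)) / sum(weights)
se_pooled = math.sqrt(1 / sum(weights))
lo, hi = pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled
print(f"pooled MD = {pooled:.3f}, 95% CI {lo:.3f} to {hi:.3f}")
```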
Affiliation(s)
- Jan-Ole Hesselberg: Department of Psychology, University of Oslo, Oslo, Norway; Stiftelsen Dam, Oslo, Norway
- Ida Svege: Stiftelsen Dam, Oslo, Norway; Faculty of Health Sciences, Oslo Metropolitan University, Oslo, Norway
- Atle Fretheim: Faculty of Health Sciences, Oslo Metropolitan University, Oslo, Norway; Centre of Epidemic Interventions Research, Norwegian Institute of Public Health, Oslo, Norway
14
Ioannidis JPA, Berkwits M, Flanagin A, Bloom T. Peer Review and Scientific Publication at a Crossroads: Call for Research for the 10th International Congress on Peer Review and Scientific Publication. JAMA 2023;330:1232-1235. PMID: 37738041; DOI: 10.1001/jama.2023.17607.
Affiliation(s)
- John P A Ioannidis: Meta-Research Innovation Center at Stanford (METRICS), Stanford University, Stanford, California; Department of Medicine, Stanford University School of Medicine, Stanford, California
15
Thibault RT, Amaral OB, Argolo F, Bandrowski AE, Davidson AR, Drude NI. Open Science 2.0: Towards a truly collaborative research ecosystem. PLoS Biol 2023;21:e3002362. PMID: 37856538; PMCID: PMC10617723; DOI: 10.1371/journal.pbio.3002362.
Abstract
Conversations about open science have reached the mainstream, yet many open science practices such as data sharing remain uncommon. Our efforts towards openness therefore need to increase in scale and aim for a more ambitious target. We need an ecosystem not only where research outputs are openly shared but also in which transparency permeates the research process from the start and lends itself to more rigorous and collaborative research. To support this vision, this Essay provides an overview of a selection of open science initiatives from the past 2 decades, focusing on methods transparency, scholarly communication, team science, and research culture, and speculates about what the future of open science could look like. It then draws on these examples to provide recommendations for how funders, institutions, journals, regulators, and other stakeholders can create an environment that is ripe for improvement.
Affiliation(s)
- Robert T. Thibault: Meta-Research Innovation Center at Stanford (METRICS), Stanford University, Stanford, California, United States of America
- Olavo B. Amaral: Institute of Medical Biochemistry Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
- Anita E. Bandrowski: FAIR Data Informatics Lab, Department of Neuroscience, UCSD, San Diego, California, United States of America; SciCrunch Inc., San Diego, California, United States of America
- Alexandra R. Davidson: Institute for Evidence-Based Health Care, Bond University, Robina, Australia; Faculty of Health Science and Medicine, Bond University, Robina, Australia
- Natascha I. Drude: Berlin Institute of Health (BIH) at Charité, BIH QUEST Center for Responsible Research, Berlin, Germany
16
Kilicoglu H, Jiang L, Hoang L, Mayo-Wilson E, Vinkers CH, Otte WM. Methodology reporting improved over time in 176,469 randomized controlled trials. J Clin Epidemiol 2023;162:19-28. PMID: 37562729; PMCID: PMC10829891; DOI: 10.1016/j.jclinepi.2023.08.004.
Abstract
OBJECTIVES To describe randomized controlled trial (RCT) methodology reporting over time. STUDY DESIGN AND SETTING We used a deep learning-based sentence classification model built around the Consolidated Standards of Reporting Trials (CONSORT) statement, which defines minimum requirements for reporting RCTs. We included 176,469 RCT reports published between 1966 and 2018. We analyzed reporting trends over 5-year time periods, grouping trials from 1966 to 1990 in a single stratum. We also explored the effect of journal impact factor (JIF) and medical discipline. RESULTS Population, Intervention, Comparator, Outcome (PICO) items were commonly reported during each period, and reporting increased over time (e.g., interventions: 79.1% during 1966-1990 to 87.5% during 2010-2018). Reporting of some methods information has increased, although there is room for improvement (e.g., sequence generation: 10.8-41.8%). Some items are reported infrequently (e.g., allocation concealment: 5.1-19.3%). The number of items reported and JIF are weakly correlated (Pearson's r(162,702) = 0.16, P < 0.001). The differences in the proportion of items reported between disciplines are small (<10%). CONCLUSION Our analysis provides large-scale quantitative support for the hypothesis that RCT methodology reporting has improved over time. Extending these models to all CONSORT items could facilitate compliance checking during manuscript authoring and peer review, and support metaresearch.
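The task framing is sentence-level classification: label each sentence of an RCT report with the CONSORT item it addresses, then mark an item as reported if any sentence carries its label. The study used a deep learning model; the stand-in below uses TF-IDF features with logistic regression and made-up training sentences purely to show the input/output shape, not to reproduce the authors' system.

```python
# Simplified stand-in for CONSORT sentence classification; the study's actual
# model was deep learning-based, and these training sentences are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_sentences = [
    "Participants were randomized using a computer-generated sequence.",
    "Allocation was concealed with sealed opaque envelopes.",
    "The primary outcome was pain intensity at 12 weeks.",
    "Mean age was 54 years in both groups.",
]
train_labels = ["sequence_generation", "allocation_concealment",
                "outcomes", "baseline_data"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(train_sentences, train_labels)
print(clf.predict(["Randomization lists were produced with a random number generator."]))
```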
Affiliation(s)
- Halil Kilicoglu: School of Information Sciences, University of Illinois Urbana-Champaign, Champaign, IL, USA
- Lan Jiang: School of Information Sciences, University of Illinois Urbana-Champaign, Champaign, IL, USA
- Linh Hoang: School of Information Sciences, University of Illinois Urbana-Champaign, Champaign, IL, USA
- Evan Mayo-Wilson: Department of Epidemiology, University of North Carolina School of Global Public Health, Chapel Hill, NC, USA
- Christiaan H Vinkers: Department of Psychiatry and Anatomy & Neurosciences, Amsterdam University Medical Center Location Vrije Universiteit Amsterdam, 1081 HV, Amsterdam, The Netherlands; Amsterdam Public Health, Mental Health Program and Amsterdam Neuroscience, Mood, Anxiety, Psychosis, Sleep & Stress Program, Amsterdam, The Netherlands; GGZ inGeest Mental Health Care, 1081 HJ, Amsterdam, The Netherlands
- Willem M Otte: Department of Child Neurology, UMC Utrecht Brain Center, University Medical Center Utrecht, and Utrecht University, Utrecht, The Netherlands
17
Speich B, Schroter S, Briel M. Guidance needed: where should randomized studies which do not assess a health outcome be registered? J Clin Epidemiol 2023;161:183-184. PMID: 37532109; DOI: 10.1016/j.jclinepi.2023.07.016.
Affiliation(s)
- Benjamin Speich: CLEAR Methods Center, Division of Clinical Epidemiology, Department of Clinical Research, University Hospital Basel, University of Basel, Basel, Switzerland
- Sara Schroter: The BMJ, London, United Kingdom; Faculty of Public Health and Policy, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Matthias Briel: CLEAR Methods Center, Division of Clinical Epidemiology, Department of Clinical Research, University Hospital Basel, University of Basel, Basel, Switzerland; Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
18
DeVito NJ. Increasing the Reporting Quality of Clinical Trials-No Easy Solutions? JAMA Netw Open 2023;6:e2317665. PMID: 37294573; DOI: 10.1001/jamanetworkopen.2023.17665.
Affiliation(s)
- Nicholas J DeVito: Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, United Kingdom