1
Paul M, Leeflang MM. Living systematic reviews: aims and standards. Clin Microbiol Infect 2024; 30:265-266. [PMID: 37572829 DOI: 10.1016/j.cmi.2023.08.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Accepted: 08/06/2023] [Indexed: 08/14/2023]
2
Bossuyt PM, Deeks JJ, Leeflang MM, Takwoingi Y, Flemyng E. Evaluating medical tests: introducing the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy. Cochrane Database Syst Rev 2023; 7:ED000163. [PMID: 37470764 PMCID: PMC10408284 DOI: 10.1002/14651858.ed000163] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 07/21/2023]
3
Cohen JF, Deeks JJ, Hooft L, Salameh JP, Korevaar DA, Gatsonis C, Hopewell S, Hunt HA, Hyde CJ, Leeflang MM, Macaskill P, McGrath TA, Moher D, Reitsma JB, Rutjes AWS, Takwoingi Y, Tonelli M, Whiting P, Willis BH, Thombs B, Bossuyt PM, McInnes MDF. Preferred reporting items for journal and conference abstracts of systematic reviews and meta-analyses of diagnostic test accuracy studies (PRISMA-DTA for Abstracts): checklist, explanation, and elaboration. BMJ 2021; 372:n265. [PMID: 33722791 PMCID: PMC7957862 DOI: 10.1136/bmj.n265] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Indexed: 12/16/2022]
Abstract
For many users of the biomedical literature, abstracts may be the only source of information about a study. Hence, abstracts should allow readers to evaluate the objectives, key design features, and main results of the study. Several evaluations have shown deficiencies in the reporting of journal and conference abstracts across study designs and research fields, including systematic reviews of diagnostic test accuracy studies. Incomplete reporting compromises the value of research to key stakeholders. The authors of this article have developed a 12 item checklist of preferred reporting items for journal and conference abstracts of systematic reviews and meta-analyses of diagnostic test accuracy studies (PRISMA-DTA for Abstracts). This article presents the checklist, examples of complete reporting, and explanations for each item of PRISMA-DTA for Abstracts.
4
Mustafa Hellou M, Górska A, Mazzaferri F, Cremonini E, Gentilotti E, De Nardo P, Poran I, Leeflang MM, Tacconelli E, Paul M. Nucleic acid amplification tests on respiratory samples for the diagnosis of coronavirus infections: a systematic review and meta-analysis. Clin Microbiol Infect 2021; 27:341-351. [PMID: 33188933 PMCID: PMC7657614 DOI: 10.1016/j.cmi.2020.11.002] [Citation(s) in RCA: 55] [Impact Index Per Article: 18.3] [Received: 07/20/2020] [Revised: 10/10/2020] [Accepted: 11/04/2020] [Indexed: 12/14/2022]
Abstract
BACKGROUND Management and control of coronavirus disease 2019 (COVID-19) relies on reliable diagnostic testing. OBJECTIVES To evaluate the diagnostic test accuracy (DTA) of nucleic acid amplification tests (NAATs) for the diagnosis of coronavirus infections. DATA SOURCES PubMed, Web of Science, the Cochrane Library, Embase, Open Grey and conference proceedings until May 2019. PubMed and medRxiv were updated for COVID-19 on 31st August 2020. STUDY ELIGIBILITY Studies were eligible if they reported on agreement rates between different NAATs using clinical samples. PARTICIPANTS Symptomatic patients with suspected upper or lower respiratory tract coronavirus infection. METHODS The new NAAT was defined as the index test and the existing NAAT as reference standard. Data were extracted independently in duplicate. Risk of bias was assessed using the Quality Assessment of Diagnostic Accuracy Studies 2 tool. Confidence regions (CRs) surrounding summary sensitivity/specificity pooled by bivariate meta-analysis are reported. Heterogeneity was assessed using meta-regression. RESULTS Fifty-one studies were included, 22 of which included 10 181 persons before COVID-19 and 29 including 8742 persons diagnosed with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The overall summary sensitivity was 89.1% (95%CR 84.0-92.7%) and specificity 98.9% (95%CR 98.0-99.4%). Nearly all the studies evaluated different PCRs as both index and reference standards. Real-time RT-PCR assays resulted in significantly higher sensitivity than other tests. Reference standards at high risk of bias possibly exaggerated specificity. The pooled sensitivity and specificity of studies evaluating SARS-CoV-2 were 90.4% (95%CR 83.7-94.5%) and 98.1% (95%CR 95.9-99.2%), respectively. SARS-CoV-2 studies using samples from the lower respiratory tract, real-time RT-PCR, and tests targeting the N or S gene or more than one gene showed higher sensitivity, and assays based on reverse transcriptase loop-mediated isothermal amplification (RT-LAMP), especially when targeting only the RNA-dependent RNA polymerase (RdRp) gene, showed significantly lower sensitivity compared to other studies. CONCLUSIONS Pooling all studies to date shows that on average 10% of patients with coronavirus infections might be missed with PCR tests. Variables affecting sensitivity and specificity can be used for test selection and development.
5
Paul M, Leeflang MM. Reporting of systematic reviews and meta-analysis of observational studies. Clin Microbiol Infect 2020; 27:311-314. [PMID: 33217559 PMCID: PMC8885144 DOI: 10.1016/j.cmi.2020.11.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/06/2020] [Accepted: 11/07/2020] [Indexed: 01/24/2023]
6
Dean CR, Bruin CM, O’Hara ME, Roseboom TJ, Leeflang MM, Spijker R, Painter RC. The chance of recurrence of hyperemesis gravidarum: A systematic review. Eur J Obstet Gynecol Reprod Biol X 2020; 5:100105. [PMID: 32021976 PMCID: PMC6994404 DOI: 10.1016/j.eurox.2019.100105] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/16/2019] [Revised: 12/03/2019] [Accepted: 12/06/2019] [Indexed: 01/19/2023] Open
Abstract
Around 1% of pregnancies develop Hyperemesis Gravidarum (HG), causing high physical and psychological morbidity. Reports on HG recurrence rate in subsequent pregnancies vary widely. An accurate rate of recurrence is needed for informed reproductive decision making. Our objective is to systematically review and aggregate reported rates for HG subsequent to index pregnancies affected by HG. We searched databases from inception as per the protocol registered on PROSPERO. No language restrictions were applied. Inclusion was not restricted based on how HG was defined; reports of severe NVP were included where authors defined the condition as HG. We included descriptive epidemiological, case control and cohort study designs. Eligibility screening was performed in duplo. We extracted data on populations, study methods and outcomes of significance. A panel of patients reviewed the results and provided discussion and feedback. Quality was assessed with the JBI (2017) critical appraisal tool independently by two reviewers. We performed the searches on 1st November 2019. Our search yielded 4454 unique studies, of which five (n = 40,350 HG cases) matched eligibility criteria: one longitudinal and four population-based cohort studies from five countries. Follow-up ranged from 2 to 31 years. Definition of HG and data collection methods in all the studies created heterogeneity. Quality was low; studies lacked valid and reliable exposure, and/or follow-up was insufficient. Meta-analysis was not possible due to clinical and statistical heterogeneity. This systematic review found five heterogeneous studies reporting recurrence rates from 15 to 81%. Defining HG as hospital cases may have introduced detection bias and contributed to clinical heterogeneity. A prospective longitudinal cohort study using an internationally agreed definition of HG and outcomes meaningful to patients is required to establish the true recurrence rate of HG.
7
Holtman GA, Berger MY, Burger H, Deeks JJ, Donner-Banzhoff N, Fanshawe TR, Koshiaris C, Leeflang MM, Oke JL, Perera R, Reitsma JB, Van den Bruel A. Development of practical recommendations for diagnostic accuracy studies in low-prevalence situations. J Clin Epidemiol 2019; 114:38-48. [PMID: 31150837 DOI: 10.1016/j.jclinepi.2019.05.018] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 07/19/2018] [Revised: 04/04/2019] [Accepted: 05/22/2019] [Indexed: 01/01/2023]
Abstract
OBJECTIVE Low disease prevalence poses challenges for diagnostic accuracy studies because of the large sample sizes that are required to obtain sufficient precision. The aim is to collate and discuss designs of diagnostic accuracy studies suited for use in low-prevalence situations. STUDY DESIGN AND SETTING We conducted a literature search including backward citation tracking and expert consultation. Two reviewers independently selected studies on designs for estimating diagnostic accuracy in a low-prevalence situation. During a 1-day expert meeting, all designs were discussed and recommendations were formulated. RESULTS We identified six designs for diagnostic accuracy studies that are suitable in low-prevalence situations because they reduced the total sample size or the number of patients undergoing the index test or reference standard depending on which poses the highest burden. We described the advantages and limitations of these designs and evaluated efficiencies in sample sizes, risk of bias, and alignment with the clinical pathway for applicability in routine care. CONCLUSION Choosing a study design for diagnostic accuracy studies in low-prevalence situations should depend on whether the aim is to limit the number of patients undergoing the index test or reference standard, and the risk of bias associated with a particular design type.
8
de Vries EM, Wang J, Williamson KD, Leeflang MM, Boonstra K, Weersma RK, Beuers U, Chapman RW, Geskus RB, Ponsioen CY. A novel prognostic model for transplant-free survival in primary sclerosing cholangitis. Gut 2018; 67:1864-1869. [PMID: 28739581 PMCID: PMC6145288 DOI: 10.1136/gutjnl-2016-313681] [Citation(s) in RCA: 74] [Impact Index Per Article: 12.3] [Received: 12/29/2016] [Revised: 06/09/2017] [Accepted: 06/15/2017] [Indexed: 12/17/2022]
Abstract
OBJECTIVE Most prognostic models for primary sclerosing cholangitis (PSC) are based on patients referred to tertiary care and may not be applicable for the majority of patients with PSC. The aim of this study was to construct and externally validate a novel, broadly applicable prognostic model for transplant-free survival in PSC, based on a large, predominantly population-based cohort using readily available variables. DESIGN The derivation cohort consisted of 692 patients with PSC from the Netherlands, the validation cohort of 264 patients with PSC from the UK. Retrospectively, clinical and biochemical variables were collected. We derived the prognostic index from a multivariable Cox regression model in which predictors were selected and parameters were estimated using the least absolute shrinkage and selection operator. The composite end point of PSC-related death and liver transplantation was used. To quantify the models' predictive value, we calculated the C-statistic as discrimination index and established its calibration accuracy by comparing predicted curves with Kaplan-Meier estimates. RESULTS The final model included the variables: PSC subtype, age at PSC diagnosis, albumin, platelets, aspartate aminotransferase, alkaline phosphatase and bilirubin. The C-statistic was 0.68 (95% CI 0.51 to 0.85). Calibration was satisfactory. The model was robust in the sense that the C-statistic did not change when prediction was based on biochemical variables collected at follow-up. CONCLUSION The Amsterdam-Oxford model for PSC showed adequate performance in estimating PSC-related death and/or liver transplant in a predominantly population-based setting. The transplant-free survival probability can be recalculated when updated biochemical values are available.
9
Hooijmans CR, de Vries RBM, Ritskes-Hoitinga M, Rovers MM, Leeflang MM, IntHout J, Wever KE, Hooft L, de Beer H, Kuijpers T, Macleod MR, Sena ES, ter Riet G, Morgan RL, Thayer KA, Rooney AA, Guyatt GH, Schünemann HJ, Langendam MW. Facilitating healthcare decisions by assessing the certainty in the evidence from preclinical animal studies. PLoS One 2018; 13:e0187271. [PMID: 29324741 PMCID: PMC5764235 DOI: 10.1371/journal.pone.0187271] [Citation(s) in RCA: 70] [Impact Index Per Article: 11.7] [Received: 03/02/2017] [Accepted: 10/17/2017] [Indexed: 12/23/2022] Open
Abstract
Laboratory animal studies are used in a wide range of human health related research areas, such as basic biomedical research, drug research, experimental surgery and environmental health. The results of these studies can be used to inform decisions regarding clinical research in humans, for example the decision to proceed to clinical trials. If the research question relates to potential harms with no expectation of benefit (e.g., toxicology), studies in experimental animals may provide the only relevant or controlled data and directly inform clinical management decisions. Systematic reviews and meta-analyses are important tools to provide robust and informative evidence summaries of these animal studies. Rating how certain we are about the evidence could provide important information about the translational probability of findings in experimental animal studies to clinical practice and probably improve it. Evidence summaries and certainty in the evidence ratings could also be used (1) to support selection of interventions with best therapeutic potential to be tested in clinical trials, (2) to justify a regulatory decision limiting human exposure (to drug or toxin), or to (3) support decisions on the utility of further animal experiments. The Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) approach is the most widely used framework to rate the certainty in the evidence and strength of health care recommendations. Here we present how the GRADE approach could be used to rate the certainty in the evidence of preclinical animal studies in the context of therapeutic interventions. We also discuss the methodological challenges that we identified, and for which further work is needed. Examples are defining the importance of consistency within and across animal species and using GRADE's indirectness domain as a tool to predict translation from animal models to humans.
10
Kadouch DJ, Leeflang MM, Elshot YS, Longo C, Ulrich M, van der Wal AC, Wolkerstorfer A, Bekkenk MW, de Rie MA. Diagnostic accuracy of confocal microscopy imaging vs. punch biopsy for diagnosing and subtyping basal cell carcinoma. J Eur Acad Dermatol Venereol 2017; 31:1641-1648. [PMID: 28370434 PMCID: PMC5697654 DOI: 10.1111/jdv.14253] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 01/12/2017] [Accepted: 02/27/2017] [Indexed: 12/11/2022]
Abstract
Background In vivo reflectance confocal microscopy (RCM) is a promising non‐invasive skin imaging technique that could facilitate early diagnosis of basal cell carcinoma (BCC) instead of routine punch biopsies. However, the clinical value and utility of RCM vs. a punch biopsy in diagnosing and subtyping BCC is unknown. Objective To assess diagnostic accuracy of RCM vs. punch biopsy for diagnosing and subtyping clinically suspected primary BCC. Methods A prospective, consecutive cohort of 100 patients with clinically suspected BCC was included at two tertiary hospitals in Amsterdam, the Netherlands, between 3 February 2015 and 2 October 2015. Patients were randomized between two test‐treatment pathways: diagnosing and subtyping using RCM imaging followed by direct surgical excision (RCM one‐stop‐shop) or planned excision based upon the histological diagnosis and subtype of punch biopsy (standard care). The primary outcome was the agreement between the index tests (RCM vs. punch biopsy) and reference standard (excision specimen) in correctly diagnosing BCC. The secondary outcome was the agreement between the index tests and reference standard in correctly identifying the most aggressive BCC subtypes. Results Sensitivity to detect BCC was similar for RCM and punch biopsy (100% vs. 93.94%), but a punch biopsy was more specific than RCM (79% vs. 38%). RCM expert evaluation for diagnosing BCC had a sensitivity of 100% and a specificity of 75%. The agreement between RCM and excision specimen in identifying the most aggressive BCC subtype ranged from 50% to 85% vs. 77% by a punch biopsy. Conclusion Reflectance confocal microscopy and punch biopsy have comparable diagnostic accuracy to diagnose and subtype BCC depending on RCM experience. Although experienced RCM users could accurately diagnose BCC at a distance, we found an important difference in subtyping BCC. Future RCM studies need to focus on diagnostic accuracy, reliability and specific criteria to improve BCC subtype differentiation.
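The sensitivity and specificity figures reported in this abstract come from comparing each index test with the reference standard in a 2x2 table. As a minimal sketch of that arithmetic (the function name is hypothetical, and the counts below are illustrative placeholders, not data from the study):

```python
def accuracy_from_2x2(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from the four cells of a 2x2 table.

    tp/fn: reference-standard positives that the index test called
           positive/negative; fp/tn: reference-standard negatives that the
           index test called positive/negative.
    """
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one diseased and one non-diseased case")
    sensitivity = tp / (tp + fn)  # proportion of diseased correctly detected
    specificity = tn / (fp + tn)  # proportion of non-diseased correctly cleared
    return sensitivity, specificity


# Illustrative counts only (assumed for the example, not from the abstract).
sens, spec = accuracy_from_2x2(tp=45, fp=10, fn=5, tn=40)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Keeping the four cells explicit makes it easy to see that sensitivity depends only on the diseased column and specificity only on the non-diseased column.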
11
Cohen JF, Korevaar DA, Wang J, Leeflang MM, Bossuyt PM. Meta-epidemiologic study showed frequent time trends in summary estimates from meta-analyses of diagnostic accuracy studies. J Clin Epidemiol 2016; 77:60-67. [PMID: 27212137 DOI: 10.1016/j.jclinepi.2016.04.013] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 09/10/2015] [Revised: 04/12/2016] [Accepted: 04/29/2016] [Indexed: 12/24/2022]
Abstract
OBJECTIVES To evaluate changes over time in summary estimates from meta-analyses of diagnostic accuracy studies. STUDY DESIGN AND SETTING We included 48 meta-analyses from 35 MEDLINE-indexed systematic reviews published between September 2011 and January 2012 (743 diagnostic accuracy studies; 344,015 participants). Within each meta-analysis, we ranked studies by publication date. We applied random-effects cumulative meta-analysis to follow how summary estimates of sensitivity and specificity evolved over time. Time trends were assessed by fitting a weighted linear regression model of the summary accuracy estimate against rank of publication. RESULTS The median of the 48 slopes was -0.02 (-0.08 to 0.03) for sensitivity and -0.01 (-0.03 to 0.03) for specificity. Twelve of 96 (12.5%) time trends in sensitivity or specificity were statistically significant. We found a significant time trend in at least one accuracy measure for 11 of the 48 (23%) meta-analyses. CONCLUSION Time trends in summary estimates are relatively frequent in meta-analyses of diagnostic accuracy studies. Results from early meta-analyses of diagnostic accuracy studies should be considered with caution.
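The cumulative approach described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it assumes sensitivities are pooled on the logit scale with a DerSimonian-Laird random-effects estimator and a 0.5 continuity correction, and that the trend regression is weighted by the inverse variance of each cumulative estimate. The abstract specifies none of these details, and all function names and counts are hypothetical.

```python
import math


def logit_and_var(events, total):
    """Logit of a proportion and its approximate variance (0.5 correction)."""
    a = events + 0.5
    b = total - events + 0.5
    return math.log(a / b), 1.0 / a + 1.0 / b


def dl_pool(effects, variances):
    """DerSimonian-Laird random-effects pooled estimate and its variance."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    k = len(effects)
    tau2 = 0.0
    if k > 1:
        q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
        c = sw - sum(wi * wi for wi in w) / sw
        tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return pooled, 1.0 / sum(w_re)


def cumulative_meta_analysis(studies):
    """studies: (true_positives, diseased) pairs sorted by publication date.

    Returns one (pooled_logit, variance) pair per publication rank.
    """
    effects, variances, out = [], [], []
    for tp, n in studies:
        y, v = logit_and_var(tp, n)
        effects.append(y)
        variances.append(v)
        out.append(dl_pool(effects, variances))
    return out


def weighted_slope(x, y, w):
    """Slope of a weighted least-squares line of y on x."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    return sxy / sxx


def expit(z):
    """Back-transform a logit to a proportion."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical studies in order of publication (not data from the paper).
cum = cumulative_meta_analysis([(9, 10), (8, 10), (7, 10)])
ranks = list(range(1, len(cum) + 1))
slope = weighted_slope(ranks, [p for p, _ in cum], [1.0 / v for _, v in cum])
print([round(expit(p), 3) for p, _ in cum], slope)
```

A negative slope here means the cumulative summary sensitivity drifts downward as later studies are added, which is the kind of time trend the study looked for.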
12
Kadouch DJ, Schram ME, Leeflang MM, Limpens J, Spuls PI, de Rie MA. In vivo confocal microscopy of basal cell carcinoma: a systematic review of diagnostic accuracy. J Eur Acad Dermatol Venereol 2015; 29:1890-7. [PMID: 26290493 DOI: 10.1111/jdv.13224] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Received: 02/11/2015] [Accepted: 05/22/2015] [Indexed: 12/22/2022]
Abstract
Basal cell carcinoma (BCC) is the most prevalent type of skin cancer. Histologic analysis of punch biopsy or direct excision specimen is used to confirm clinical diagnosis. In vivo reflectance confocal microscopy (RCM) is a non-invasive imaging modality that could facilitate early diagnosis and minimize unnecessary invasive procedures. We systematically reviewed diagnostic accuracy (sensitivity and specificity) of RCM in diagnosing primary BCCs to judge its usefulness. Eligible studies were reviewed for methodological quality using the QUADAS-2 tool. We used the bivariate random-effects model to calculate summary estimates of sensitivity and specificity. Six studies met the selection criteria and were included for analysis. The meta-analysis showed a summary estimate of sensitivity 0.97 (95% CI, 0.90-0.99) and specificity 0.93 (95% CI, 0.88-0.96). All but one of the QUADAS-2 items showed a high or unclear risk of bias with regards to patient selection. RCM may be a promising diagnostic tool, but the limited number of available studies and potential risk of bias of included studies do not allow us to draw firm conclusions. Future accuracy studies should take these limitations into account.
13
Samara MT, Leucht C, Leeflang MM, Anghelescu IG, Chung YC, Crespo-Facorro B, Elkis H, Hatta K, Giegling I, Kane JM, Kayo M, Lambert M, Lin CH, Möller HJ, Pelayo-Terán JM, Riedel M, Rujescu D, Schimmelmann BG, Serretti A, Correll CU, Leucht S. Early Improvement As a Predictor of Later Response to Antipsychotics in Schizophrenia: A Diagnostic Test Review. Am J Psychiatry 2015; 172:617-29. [PMID: 26046338 DOI: 10.1176/appi.ajp.2015.14101329] [Citation(s) in RCA: 128] [Impact Index Per Article: 14.2] [Indexed: 11/30/2022]
Abstract
OBJECTIVE How long clinicians should wait before considering an antipsychotic ineffective and changing treatment in schizophrenia is an unresolved clinical question. Guidelines differ substantially in this regard. The authors conducted a diagnostic test meta-analysis using mostly individual patient data to assess whether lack of improvement at week 2 predicts later nonresponse. METHOD The search included EMBASE, MEDLINE, BIOSIS, PsycINFO, Cochrane Library, CINAHL, and reference lists of relevant articles, supplemented by requests to authors of all relevant studies. The main outcome was prediction of nonresponse, defined as <50% reduction in total score on either the Positive and Negative Syndrome Scale (PANSS) or Brief Psychiatric Rating Scale (BPRS) (corresponding to at least much improved) from baseline to endpoint (4-12 weeks), by <20% PANSS or BPRS improvement (corresponding to less than minimally improved) at week 2. Secondary outcomes were absent cross-sectional symptomatic remission and <20% PANSS or BPRS reduction at endpoint. Potential moderator variables were examined by meta-regression. RESULTS In 34 studies (N=9,460) a <20% PANSS or BPRS reduction at week 2 predicted nonresponse at endpoint with a specificity of 86% and a positive predictive value (PPV) of 90%. Using data for observed cases (specificity=86%, PPV=85%) or lack of remission (specificity=77%, PPV=88%) yielded similar results. Conversely, using the definition of <20% reduction at endpoint yielded worse results (specificity=70%, PPV=55%). The test specificity was significantly moderated by a trial duration of <6 weeks, higher baseline illness severity, and shorter illness duration. CONCLUSIONS Patients not even minimally improved by week 2 of antipsychotic treatment are unlikely to respond later and may benefit from a treatment change.
14
van Enst WA, Naaktgeboren CA, Ochodo EA, de Groot JAH, Leeflang MM, Reitsma JB, Scholten RJPM, Moons KGM, Zwinderman AH, Bossuyt PMM, Hooft L. Small-study effects and time trends in diagnostic test accuracy meta-analyses: a meta-epidemiological study. Syst Rev 2015; 4:66. [PMID: 25956716 PMCID: PMC4450491 DOI: 10.1186/s13643-015-0049-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 06/04/2014] [Accepted: 04/16/2015] [Indexed: 01/23/2023] Open
Abstract
BACKGROUND Small-study effects and time trends have been identified in meta-analyses of randomized trials. We evaluated whether these effects are also present in meta-analyses of diagnostic test accuracy studies. METHODS A systematic search identified test accuracy meta-analyses published between May and September 2012. In each meta-analysis, the strength of the associations between estimated accuracy of the test (diagnostic odds ratio (DOR), sensitivity, and specificity) and sample size and between accuracy estimates and time since first publication were evaluated using meta-regression models. The regression coefficients over all meta-analyses were summarized using random effects meta-analysis. RESULTS Forty-six meta-analyses and their corresponding primary studies (N = 859) were included. There was a non-significant relative change in the DOR of 1.01 per 100 additional participants (95% CI 1.00 to 1.03; P = 0.07). In the subgroup of imaging studies, there was a relative increase in sensitivity of 1.13 per 100 additional diseased subjects (95% CI 1.05 to 1.22; P = 0.002). The relative change in DOR with time since first publication was 0.94 per 5 years (95% CI 0.80 to 1.10; P = 0.42). Sensitivity was lower in studies published later (relative change 0.89, 95% CI 0.80 to 0.99; P = 0.04). CONCLUSIONS Small-study effects and time trends do not seem to be as pronounced in meta-analyses of test accuracy studies as they are in meta-analyses of randomized trials. Small-study effects seem to be reversed in imaging, where larger studies tend to report higher sensitivity.
15
Korevaar DA, Wang J, van Enst WA, Leeflang MM, Hooft L, Smidt N, Bossuyt PMM. Reporting Diagnostic Accuracy Studies: Some Improvements after 10 Years of STARD. Radiology 2015; 274:781-9. [DOI: 10.1148/radiol.14141160] [Citation(s) in RCA: 71] [Impact Index Per Article: 7.9] [Indexed: 11/11/2022]
16
Naaktgeboren CA, van Enst WA, Ochodo EA, de Groot JAH, Hooft L, Leeflang MM, Bossuyt PM, Moons KGM, Reitsma JB. Systematic overview finds variation in approaches to investigating and reporting on sources of heterogeneity in systematic reviews of diagnostic studies. J Clin Epidemiol 2014; 67:1200-9. [PMID: 25063558 DOI: 10.1016/j.jclinepi.2014.05.018] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 09/27/2013] [Revised: 05/08/2014] [Accepted: 05/12/2014] [Indexed: 01/22/2023]
Abstract
OBJECTIVES To examine how authors explore and report on sources of heterogeneity in systematic reviews of diagnostic accuracy studies. STUDY DESIGN AND SETTING A cohort of systematic reviews of diagnostic tests was systematically identified. Data were extracted on whether an exploration of the sources of heterogeneity was undertaken, how this was done, the number and type of potential sources explored, and how results and conclusions were reported. RESULTS Of the 65 systematic reviews, 12 did not perform a meta-analysis and eight of these gave heterogeneity between studies as a reason. Of the 53 reviews containing a meta-analysis, 40 explored potential sources of heterogeneity in a formal manner and 27 identified at least one source of heterogeneity. The reviews not investigating heterogeneity were smaller than those that did (median [interquartile range {IQR}], 8 [5-15] vs. 14 [11-19] primary studies). Twelve reviews performed a sensitivity analysis, 25 stratified analyses, and 19 metaregression. Many sources of heterogeneity were explored compared with the number of primary studies in a meta-analysis (median ratio, 1:5). Review authors placed importance on the exploration of sources of heterogeneity; 37 mentioned the exploration or the findings thereof in the abstract or conclusion of the main text. CONCLUSION Methods for investigating sources of heterogeneity varied widely between reviews. Based on our findings of the review, we made suggestions on what to consider and report on when exploring sources of heterogeneity in systematic reviews of diagnostic studies.
17
van Enst WA, Ochodo E, Scholten RJPM, Hooft L, Leeflang MM. Investigation of publication bias in meta-analyses of diagnostic test accuracy: a meta-epidemiological study. BMC Med Res Methodol 2014; 14:70. [PMID: 24884381 PMCID: PMC4035673 DOI: 10.1186/1471-2288-14-70] [Citation(s) in RCA: 234] [Impact Index Per Article: 23.4] [Received: 02/10/2014] [Accepted: 05/06/2014] [Indexed: 12/13/2022] Open
Abstract
Background The validity of a meta-analysis can be understood better in light of the possible impact of publication bias. The majority of the methods to investigate publication bias in terms of small study-effects are developed for meta-analyses of intervention studies, leaving authors of diagnostic test accuracy (DTA) systematic reviews with limited guidance. The aim of this study was to evaluate if and how publication bias was assessed in meta-analyses of DTA, and to compare the results of various statistical methods used to assess publication bias. Methods A systematic search was initiated to identify DTA reviews with a meta-analysis published between September 2011 and January 2012. We extracted all information about publication bias from the reviews and the two-by-two tables. Existing statistical methods for the detection of publication bias were applied on data from the included studies. Results Out of 1,335 references, 114 reviews could be included. Publication bias was explicitly mentioned in 75 reviews (65.8%) and 47 of these had performed statistical methods to investigate publication bias in terms of small study-effects: 6 by drawing funnel plots, 16 by statistical testing and 25 by applying both methods. The applied tests were Egger’s test (n = 18), Deeks’ test (n = 12), Begg’s test (n = 5), both the Egger and Begg tests (n = 4), and other tests (n = 2). Our own comparison of the results of Begg’s, Egger’s and Deeks’ test for 92 meta-analyses indicated that up to 34% of the results did not correspond with one another. Conclusions The majority of DTA review authors mention or investigate publication bias. They mainly use suboptimal methods like the Begg and Egger tests that are not developed for DTA meta-analyses. Our comparison of the Begg, Egger and Deeks tests indicated that these tests do give different results and thus are not interchangeable. Deeks’ test is recommended for DTA meta-analyses and should be preferred.
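Deeks' test, which this abstract recommends for DTA meta-analyses, is commonly described as a regression of the log diagnostic odds ratio on the inverse square root of the effective sample size (ESS), weighted by ESS. The sketch below follows that description; it is an assumption-labelled illustration, not the code used in the study. The function name, the rule of adding 0.5 only to tables containing a zero cell, and the example counts are all hypothetical, and the significance test on the slope is left out.

```python
import math


def deeks_regression(tables):
    """Weighted regression of ln(DOR) on 1/sqrt(ESS), weights = ESS.

    tables: iterable of (tp, fp, fn, tn) counts, one tuple per primary study.
    Returns (intercept, slope); a slope far from zero suggests funnel-plot
    asymmetry (small-study effects).
    """
    xs, ys, ws = [], [], []
    for tp, fp, fn, tn in tables:
        if 0 in (tp, fp, fn, tn):
            # Assumed continuity correction so that ln(DOR) is defined.
            tp, fp, fn, tn = (c + 0.5 for c in (tp, fp, fn, tn))
        n_dis = tp + fn  # diseased participants
        n_non = fp + tn  # non-diseased participants
        ess = 4.0 * n_dis * n_non / (n_dis + n_non)
        xs.append(1.0 / math.sqrt(ess))
        ys.append(math.log((tp * tn) / (fp * fn)))
        ws.append(ess)

    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    ybar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    if sxx == 0:
        raise ValueError("all studies have the same effective sample size")
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Hypothetical studies (not from the paper): the smallest study reports the
# largest DOR, so the fitted slope is positive.
intercept, slope = deeks_regression(
    [(90, 10, 10, 90), (40, 10, 10, 40), (19, 2, 1, 18)]
)
print(intercept, slope)
```

In practice the slope would be tested against zero with a t statistic on k - 2 degrees of freedom; that step is omitted here to keep the example dependency-free.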
18
Mugasa CM, Deborggraeve S, Schoone GJ, Laurent T, Leeflang MM, Ekangu RA, El Safi S, Saad AFA, Basiye FL, De Doncker S, Lubega GW, Kager PA, Büscher P, Schallig HDFH. Accordance and concordance of PCR and NASBA followed by oligochromatography for the molecular diagnosis of Trypanosoma brucei and Leishmania. Trop Med Int Health 2010; 15:800-5. [PMID: 20487429 DOI: 10.1111/j.1365-3156.2010.02547.x] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Indexed: 11/28/2022]
Abstract
OBJECTIVE To evaluate the repeatability and reproducibility of four simplified molecular assays for the diagnosis of Trypanosoma brucei spp. or Leishmania spp. in a multicentre ring trial with seven participating laboratories. METHODS The tests are based on PCR or NASBA amplification of the parasites' nucleic acids followed by rapid read-out by oligochromatographic dipstick (PCR-OC and NASBA-OC). RESULTS On purified nucleic acid specimens, the repeatability and reproducibility of the tests were: Tryp-PCR-OC, 91.7% and 95.5%; Tryp-NASBA-OC, 95.8% and 100%; Leish-PCR-OC, 95.9% and 98.1%; Leish-NASBA-OC, 92.3% and 98.2%. On blood specimens spiked with parasites, the repeatability and reproducibility of the tests were: Tryp-PCR-OC, 78.4% and 86.6%; Tryp-NASBA-OC, 81.5% and 89.0%; Leish-PCR-OC, 87.1% and 91.7%; Leish-NASBA-OC, 74.8% and 86.2%. CONCLUSION As repeatability and reproducibility of the tests were satisfactory, further phase II and III evaluations in clinical and population specimens from disease endemic countries are justified.
|
19
|
Cnossen JS, ter Riet G, Mol BW, van der Post JA, Leeflang MM, Meads CA, Hyde C, Khan KS. Are tests for predicting pre-eclampsia good enough to make screening viable? A review of reviews and critical appraisal. Acta Obstet Gynecol Scand 2009; 88:758-65. [DOI: 10.1080/00016340903008953] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.4] [Indexed: 10/20/2022]
|
20
|
Leeflang MM, Debets-Ossenkopp YJ, Visser CE, Scholten RJPM, Hooft L, Bijlmer HA, Reitsma JB, Bossuyt PM, Vandenbroucke-Grauls CM. Galactomannan detection for invasive aspergillosis in immunocompromized patients. Cochrane Database Syst Rev 2008:CD007394. [PMID: 18843747 DOI: 10.1002/14651858.cd007394] [Citation(s) in RCA: 115] [Impact Index Per Article: 7.2] [Indexed: 11/07/2022]
Abstract
BACKGROUND Invasive aspergillosis (IA) is the most common life-threatening opportunistic invasive mycosis in immunocompromized patients. A test for IA needs to be not too invasive and not too big a burden for the already weakened patient. The serum galactomannan ELISA seems to have potential for both requirements. OBJECTIVES To obtain summary estimates of the diagnostic accuracy of galactomannan detection in serum for the diagnosis of IA. SEARCH STRATEGY We searched MEDLINE, EMBASE and Web of Science with both Medical Headings and text words for both aspergillosis and the sandwich ELISA. We checked reference lists of included studies and review articles for additional studies. SELECTION CRITERIA Cross-sectional studies, case-control designs and consecutive series of patients assessing the diagnostic accuracy of galactomannan detection for the diagnosis of IA in patients with neutropenia or patients whose neutrophils are functionally compromised were included. The reference standard was composed of the criteria given by the European Organization for Research and Treatment of Cancer (EORTC) and the Mycoses Study Group (MSG). DATA COLLECTION AND ANALYSIS Two review authors independently assessed quality and extracted data. MAIN RESULTS Thirty studies were included in the meta-analyses, with a median prevalence of IA (proven or probable) of 7.7%. Seven of these (901 patients) reported results for an Optical Density Index (ODI) of 0.5 as cut-off value. The overall sensitivity in these studies was 78% (61% to 89%) and overall specificity was 81% (72% to 88%). Twelve studies (1744 patients) reported the results for a cut-off value of 1.0 ODI; overall sensitivity was 75% (59% to 86%) and mean specificity 91% (84% to 95%). Seventeen studies (2600 patients) reported the results for a cut-off value of 1.5 ODI; sensitivity was 64% (50% to 77%) and mean specificity 95% (91% to 97%).
AUTHORS' CONCLUSIONS At a cut-off value of 0.5 ODI in a population of 100 patients with a disease prevalence of 8% (overall median prevalence), 2 patients who have IA will be missed (sensitivity 78%, 22% false negatives), and 17 patients will be treated or further referred unnecessarily (specificity of 81%, 19% false positives). If we use the test at a cut-off value of 1.5 ODI in the same population, 3 IA patients will be missed (sensitivity 64%, 36% false negatives) and 5 patients will be treated or referred unnecessarily (specificity of 95%, 5% false positives). These numbers should, however, be interpreted with caution, because the results were very heterogeneous.
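The patient counts in the conclusions follow directly from prevalence, sensitivity and specificity. A minimal sketch of that arithmetic is shown below; the function name is hypothetical, and the inputs are the summary estimates quoted in the abstract (100 patients, 8% prevalence).

```python
def expected_outcomes(n_patients, prevalence, sensitivity, specificity):
    """Return (missed cases, unnecessarily treated) as expected counts.

    missed cases        = diseased     * (1 - sensitivity)  -> false negatives
    unnecessarily treated = non-diseased * (1 - specificity)  -> false positives
    """
    diseased = n_patients * prevalence
    non_diseased = n_patients - diseased
    missed = diseased * (1 - sensitivity)
    unnecessary = non_diseased * (1 - specificity)
    return missed, unnecessary


# Cut-off 0.5 ODI: sensitivity 78%, specificity 81%
fn_05, fp_05 = expected_outcomes(100, 0.08, 0.78, 0.81)
# Cut-off 1.5 ODI: sensitivity 64%, specificity 95%
fn_15, fp_15 = expected_outcomes(100, 0.08, 0.64, 0.95)

print(round(fn_05), round(fp_05))  # 2 17
print(round(fn_15), round(fp_15))  # 3 5
```

Raising the cut-off trades one additional missed case (about 1.8 versus 2.9 before rounding) for roughly 13 fewer patients treated or referred unnecessarily.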
|
21
|
Meads CA, Cnossen JS, Meher S, Juarez-Garcia A, ter Riet G, Duley L, Roberts TE, Mol BW, van der Post JA, Leeflang MM, Barton PM, Hyde CJ, Gupta JK, Khan KS. Methods of prediction and prevention of pre-eclampsia: systematic reviews of accuracy and effectiveness literature with economic modelling. Health Technol Assess 2008; 12:iii-iv, 1-270. [PMID: 18331705 DOI: 10.3310/hta12060] [Citation(s) in RCA: 148] [Impact Index Per Article: 9.3] [Indexed: 11/22/2022]
Abstract
OBJECTIVES To investigate the accuracy of predictive tests for pre-eclampsia and the effectiveness of preventative interventions for pre-eclampsia. Also to assess the cost-effectiveness of strategies (test-intervention combinations) to predict and prevent pre-eclampsia. DATA SOURCES Major electronic databases were searched to January 2005 at least. REVIEW METHODS Systematic reviews were carried out for test accuracy and effectiveness. Quality assessment was carried out using standard tools. For test accuracy, meta-analyses used a bivariate approach. Effectiveness reviews were conducted under the auspices of the Cochrane Pregnancy and Childbirth Group and used standard Cochrane review methods. The economic evaluation was from an NHS perspective and used a decision tree model. RESULTS For the 27 tests reviewed, the quality of included studies was generally poor. Some tests appeared to have high specificity, but at the expense of compromised sensitivity. Tests that reached specificities above 90% were body mass index greater than 34, alpha-foetoprotein and uterine artery Doppler (bilateral notching). The only Doppler test with a sensitivity of over 60% was resistance index and combinations of indices. A few tests not commonly found in routine practice, such as kallikreinuria and SDS-PAGE proteinuria, seemed to offer the promise of high sensitivity, without compromising specificity, but these would require further investigation. For the 16 effectiveness reviews, the quality of included studies was variable. The largest review was of antiplatelet agents, primarily low-dose aspirin, and included 51 trials (36,500 women). This was the only review where the intervention was shown to prevent both pre-eclampsia and its consequences for the baby. Calcium supplementation also reduced the risk of pre-eclampsia, but with some uncertainty about the impact on outcomes for the baby. 
The only other intervention associated with a reduction in RR of pre-eclampsia was rest at home, with or without a nutritional supplement, for women with normal blood pressure. However, this review included just two small trials and its results should be interpreted with caution. The cost of most of the tests was modest, ranging from 5 pounds for blood tests such as serum uric acid to approximately 20 pounds for Doppler tests. Similarly, the cost of most interventions was also modest. In contrast, the best estimate of additional average cost associated with an average case of pre-eclampsia was high at approximately 9000 pounds. The results of the modelling revealed that prior testing with the test accuracy sensitivities and specificities identified appeared to offer little as a way of improving cost-effectiveness. Based on the evidence reviewed, none of the tests appeared sufficiently accurate to be clinically useful and the results of the model favoured no-test/treat-all strategies. Rest at home without any initial testing appeared to be the most cost-effective 'test-treatment' combination. Calcium supplementation to all women, without any initial testing, appeared to be the second most cost-effective. The economic model provided little support that any form of Doppler test has sufficiently high sensitivity and specificity to be cost-effective for the early identification of pre-eclampsia. It also suggested that the pattern of cost-effectiveness was no different in high-risk mothers than the low-risk mothers considered in the base case. CONCLUSIONS The tests evaluated are not sufficiently accurate, in our opinion, to suggest their routine use in clinical practice. Calcium and antiplatelet agents, primarily low-dose aspirin, were the interventions shown to prevent pre-eclampsia. 
The most cost-effective approach to reducing pre-eclampsia is likely to be the provision of an effective, affordable and safe intervention applied to all mothers without prior testing to assess levels of risk. It is probably premature to suggest the implementation of a treat-all intervention strategy at present, however the feasibility and acceptability of this to women could be explored. Rigorous evaluation is needed of tests with modest cost whose initial assessments suggest that they may have high levels of both sensitivity and specificity. Similarly, there is a need for high-quality, adequately powered randomised controlled trials to investigate whether interventions such as advice to rest are indeed effective in reducing pre-eclampsia. In future, an economic model should be developed that considers not just pre-eclampsia, but other related outcomes, particularly those relevant to the infant such as perinatal death, preterm birth and small for gestational age. Such a modelling project should make provision for primary data collection on the safety of interventions and their associated costs.
|