1. Shemilt I, Arno A, Thomas J, Lorenc T, Khouja C, Raine G, Sutcliffe K, Preethy D, Kwan I, Wright K, Sowden A. Cost-effectiveness of Microsoft Academic Graph with machine learning for automated study identification in a living map of coronavirus disease 2019 (COVID-19) research. Wellcome Open Res 2024; 6:210. [PMID: 38686019] [PMCID: PMC11056680] [DOI: 10.12688/wellcomeopenres.17141.2]
Abstract
Background Identifying new, eligible studies for integration into living systematic reviews and maps usually relies on conventional Boolean updating searches of multiple databases and manual processing of the updated results. Automated searches of a single, comprehensive, continuously updated source, with adjunctive machine learning, could enable more efficient searching, selection and prioritisation workflows for updating (living) reviews and maps, though research is needed to establish this. Microsoft Academic Graph (MAG) is a potentially comprehensive single source which also contains metadata that can be used in machine learning to help identify eligible studies efficiently. This study sought to establish whether: (a) MAG was a sufficiently sensitive single source to maintain our living map of COVID-19 research; and (b) eligible records could be identified with an acceptably high level of specificity. Methods We conducted an eight-arm cost-effectiveness analysis to assess the costs, recall and precision of semi-automated workflows, incorporating MAG with adjunctive machine learning, for continually updating our living map. Resource use data (time use) were collected from information specialists and other researchers involved in map production. Our systematic review software, EPPI-Reviewer, was adapted to incorporate MAG and the associated machine learning workflows, and was also used to collect data on recall, precision, and manual screening workload. Results The semi-automated MAG-enabled workflow dominated conventional workflows in both the base case and sensitivity analyses. At one month, our MAG-enabled workflow with machine learning, active learning and fixed screening targets identified 469 additional eligible articles for inclusion in our living map, and cost £3,179 per week less, compared with conventional methods relying on Boolean searches of Medline and Embase.
Conclusions We were able to increase the recall and coverage of a large living map whilst reducing its production costs. This finding is likely to be transferable to OpenAlex, MAG's successor platform.
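The headline result above is a "dominance" finding, which reduces to a simple pairwise comparison: one workflow dominates another when it does at least as well on every dimension and strictly better on at least one. A minimal sketch of that check follows; the function name and the absolute figures are our own illustration (only the differences, 469 extra eligible articles and £3,179 per week saved, come from the abstract).

```python
# Sketch of the dominance comparison used in cost-effectiveness analysis:
# workflow A dominates workflow B when it performs at least as well on
# every dimension and strictly better on at least one.
def dominates(a, b):
    at_least_as_good = (a["eligible_found"] >= b["eligible_found"]
                        and a["weekly_cost_gbp"] <= b["weekly_cost_gbp"])
    strictly_better = (a["eligible_found"] > b["eligible_found"]
                       or a["weekly_cost_gbp"] < b["weekly_cost_gbp"])
    return at_least_as_good and strictly_better

# Hypothetical absolute figures; only the differences (+469 eligible
# articles, -£3,179 per week) are taken from the abstract.
mag_ml = {"eligible_found": 1469, "weekly_cost_gbp": 1000}
conventional = {"eligible_found": 1000, "weekly_cost_gbp": 4179}

assert dominates(mag_ml, conventional)
assert not dominates(conventional, mag_ml)
```

When neither workflow dominates, a cost-effectiveness analysis would instead trade off the incremental cost against the incremental yield; dominance is the unambiguous case.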
Affiliation(s)
- Ian Shemilt, EPPI-Centre, UCL Social Research Institute, University College London, London WC1H 0NR, UK
- Anneliese Arno, EPPI-Centre, UCL Social Research Institute, University College London, London WC1H 0NR, UK
- James Thomas, EPPI-Centre, UCL Social Research Institute, University College London, London WC1H 0NR, UK
- Theo Lorenc, Centre for Reviews and Dissemination, University of York, York, UK
- Claire Khouja, Centre for Reviews and Dissemination, University of York, York, UK
- Gary Raine, Centre for Reviews and Dissemination, University of York, York, UK
- Katy Sutcliffe, EPPI-Centre, UCL Social Research Institute, University College London, London WC1H 0NR, UK
- D'Souza Preethy, EPPI-Centre, UCL Social Research Institute, University College London, London WC1H 0NR, UK
- Irene Kwan, EPPI-Centre, UCL Social Research Institute, University College London, London WC1H 0NR, UK
- Kath Wright, Centre for Reviews and Dissemination, University of York, York, UK
- Amanda Sowden, Centre for Reviews and Dissemination, University of York, York, UK
2. Boaz A, Baeza J, Fraser A, Persson E. 'It depends': what 86 systematic reviews tell us about what strategies to use to support the use of research in clinical practice. Implement Sci 2024; 19:15. [PMID: 38374051] [PMCID: PMC10875780] [DOI: 10.1186/s13012-024-01337-z]
Abstract
BACKGROUND The gap between research findings and clinical practice is well documented and a range of strategies have been developed to support the implementation of research into clinical practice. The objective of this study was to update and extend two previous reviews of systematic reviews of strategies designed to implement research evidence into clinical practice. METHODS We developed a comprehensive systematic literature search strategy based on the terms used in the previous reviews to identify studies that looked explicitly at interventions designed to turn research evidence into practice. The search was performed in June 2022 in four electronic databases: Medline, Embase, Cochrane and Epistemonikos. We searched from January 2010 up to June 2022 and applied no language restrictions. Two independent reviewers appraised the quality of included studies using a quality assessment checklist. To reduce the risk of bias, papers were excluded following discussion between all members of the team. Data were synthesised using descriptive and narrative techniques to identify themes and patterns linked to intervention strategies, targeted behaviours, study settings and study outcomes. RESULTS We identified 32 reviews conducted between 2010 and 2022. The reviews are mainly of multi-faceted interventions (n = 20) although there are reviews focusing on single strategies (ICT, educational, reminders, local opinion leaders, audit and feedback, social media and toolkits). The majority of reviews report strategies achieving small impacts (normally on processes of care). There is much less evidence that these strategies have shifted patient outcomes. Furthermore, a lot of nuance lies behind these headline findings, and this is increasingly commented upon in the reviews themselves. DISCUSSION Combined with the two previous reviews, 86 systematic reviews of strategies to increase the implementation of research into clinical practice have been identified. 
We need to shift the emphasis away from isolating individual and multi-faceted interventions to better understanding and building more situated, relational and organisational capability to support the use of research in clinical practice. This will involve drawing on a wider range of research perspectives (including social science) in primary studies and diversifying the types of synthesis undertaken to include approaches such as realist synthesis which facilitate exploration of the context in which strategies are employed.
Affiliation(s)
- Annette Boaz, Health and Social Care Workforce Research Unit, The Policy Institute, King's College London, Virginia Woolf Building, 22 Kingsway, London WC2B 6LE, UK
- Juan Baeza, King's Business School, King's College London, 30 Aldwych, London WC2B 4BG, UK
- Alec Fraser, King's Business School, King's College London, 30 Aldwych, London WC2B 4BG, UK
- Erik Persson, Federal University of Santa Catarina (UFSC), Campus Universitário Reitor João Davi Ferreira Lima, Florianópolis, SC, 88.040-900, Brazil
3. Adam GP, Paynter R. Development of literature search strategies for evidence syntheses: pros and cons of incorporating text mining tools and objective approaches. BMJ Evid Based Med 2023; 28:137-139. [PMID: 35346974] [DOI: 10.1136/bmjebm-2021-111892]
Affiliation(s)
- Gaelen P Adam, Center for Evidence Synthesis in Health, Brown University School of Public Health, Providence, Rhode Island, USA
- Robin Paynter, Scientific Resource Center, AHRQ Effective Health Care Program, Portland, Oregon, USA
4. Adam GP, Wallace BC, Trikalinos TA. Semi-automated Tools for Systematic Searches. Methods Mol Biol 2022; 2345:17-40. [PMID: 34550582] [DOI: 10.1007/978-1-0716-1566-9_2]
Abstract
Traditionally, literature identification for systematic reviews has relied on a two-step process: first, searching databases to identify potentially relevant citations, and then manually screening those citations. A number of tools have been developed to streamline and semi-automate this process, including tools to generate search terms; to visualize and evaluate search queries; to trace citation linkages; to deduplicate, limit, or translate searches across databases; and to prioritize relevant abstracts for screening. Research is ongoing into tools that can unify searching and screening into a single step, and several prototype tools have been developed. As this field grows, it is becoming increasingly important to develop and codify methods for evaluating the extent to which these tools fulfill their purpose.
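One of the tool categories listed above, deduplication across database exports, is easy to illustrate. A minimal sketch follows; the normalisation key (lower-cased, punctuation-stripped title plus year) is our own simplification, and production tools use fuzzier matching on DOIs, authors and page ranges.

```python
import re

def norm_key(record):
    """Crude dedup key: lower-cased title with punctuation stripped,
    plus publication year. (Our own simplification; real tools also
    match on DOI, authors, journal, and tolerate small edit distances.)"""
    title = re.sub(r"[^a-z0-9 ]", "", record["title"].lower())
    return (" ".join(title.split()), record.get("year"))

def deduplicate(records):
    """Keep the first record seen for each key, preserving input order."""
    seen, unique = set(), []
    for r in records:
        k = norm_key(r)
        if k not in seen:
            seen.add(k)
            unique.append(r)
    return unique

# The same article retrieved from two databases with cosmetic differences:
hits = [
    {"title": "Semi-automated tools for systematic searches.", "year": 2022, "source": "MEDLINE"},
    {"title": "Semi-Automated Tools for Systematic Searches", "year": 2022, "source": "Embase"},
]
unique = deduplicate(hits)
```

Here the two variants collapse to a single record; the surviving copy is the first one encountered, which is why export order can matter when sources differ in metadata quality.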
Affiliation(s)
- Gaelen P Adam, Center for Evidence Synthesis in Health, Brown University School of Public Health, Providence, RI, USA
- Byron C Wallace, Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Thomas A Trikalinos, Center for Evidence Synthesis in Health, Brown University School of Public Health, Providence, RI, USA
5. Shemilt I, Arno A, Thomas J, Lorenc T, Khouja C, Raine G, Sutcliffe K, Preethy D, Kwan I, Wright K, Sowden A. Cost-effectiveness of Microsoft Academic Graph with machine learning for automated study identification in a living map of coronavirus disease 2019 (COVID-19) research. Wellcome Open Res 2021. [DOI: 10.12688/wellcomeopenres.17141.1]
Abstract
Background: Conventionally, searching for eligible articles to include in systematic reviews and maps of research has relied primarily on information specialists conducting Boolean searches of multiple databases and manually processing the results, including deduplication between these multiple sources. Searching one comprehensive source, rather than multiple databases, could save time and resources. Microsoft Academic Graph (MAG) is potentially such a source, containing a network graph structure which provides metadata that can be exploited in machine learning processes. Research is needed to establish the relative advantage of using MAG as a single source, compared with conventional searches of multiple databases. This study sought to establish whether: (a) MAG is sufficiently comprehensive to maintain our living map of coronavirus disease 2019 (COVID-19) research; and (b) eligible records can be identified with an acceptably high level of specificity. Methods: We conducted a pragmatic, eight-arm cost-effectiveness analysis (simulation study) to assess the costs, recall and precision of our semi-automated MAG-enabled workflow versus conventional searches of MEDLINE and Embase (with and without machine learning classifiers, active learning and/or fixed screening targets) for maintaining a living map of COVID-19 research. Resource use data (time use) were collected from information specialists and other researchers involved in map production. Results: MAG-enabled workflows dominated MEDLINE-Embase workflows in both the base case and sensitivity analyses. At one month (base case analysis) our MAG-enabled workflow with machine learning, active learning and fixed screening targets identified 469 more new, eligible articles for inclusion in our living map, and cost £3,179 ($5,691 AUD) less, than conventional MEDLINE-Embase searches without any automation or fixed screening targets.
Conclusions: MAG-enabled continuous surveillance workflows have potential to revolutionise study identification methods for living maps, specialised registers, databases of research studies and/or collections of systematic reviews, by increasing their recall and coverage, whilst reducing production costs.
6. Arno A, Elliott J, Wallace B, Turner T, Thomas J. The views of health guideline developers on the use of automation in health evidence synthesis. Syst Rev 2021; 10:16. [PMID: 33419479] [PMCID: PMC7796617] [DOI: 10.1186/s13643-020-01569-2]
Abstract
BACKGROUND The increasingly rapid rate of evidence publication has made it difficult for evidence syntheses (systematic reviews and health guidelines) to be kept continually up to date. One proposed solution is the use of automation in health evidence synthesis. Guideline developers are key gatekeepers in the acceptance and use of evidence, and their opinions on the potential use of automation are therefore crucial. METHODS The objective of this study was to analyze the attitudes of guideline developers towards the use of automation in health evidence synthesis. The Diffusion of Innovations framework was chosen as the initial analytical framework because it encapsulates some of the core issues thought to affect the adoption of new innovations in practice. This well-established theory posits five dimensions which affect the adoption of novel technologies: Relative Advantage, Compatibility, Complexity, Trialability, and Observability. Eighteen interviews were conducted with individuals who were currently working, or had previously worked, in guideline development. After transcription, a multiphase mixed deductive and grounded approach was used to analyze the data. First, transcripts were coded deductively using Rogers' Diffusion of Innovations dimensions as the top-level themes. Second, sub-themes within the framework were identified using a grounded approach. RESULTS Participants were consistently most concerned with the extent to which an innovation is in line with current values and practices (i.e., Compatibility in the Diffusion of Innovations framework). Participants were also concerned with Relative Advantage and Observability, which were discussed in approximately equal amounts. For the latter, participants expressed a desire for transparency in the methodology of automation software. Participants were noticeably less interested in Complexity and Trialability, which were discussed infrequently. These results were reasonably consistent across all participants. CONCLUSIONS If machine learning and other automation technologies are to be used more widely, and to their full potential, in systematic reviews and guideline development, it is crucial to ensure that new technologies are in line with current values and practice. It will also be important to maximize the transparency of the methods of these technologies to address the concerns of guideline developers.
Affiliation(s)
- Anneliese Arno, EPPI-Centre, UCL Social Science Research Institute, University College London, London, UK
- Julian Elliott, School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
- Byron Wallace, Khoury College of Computer Sciences, Northeastern University, Boston, USA
- Tari Turner, School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
- James Thomas, EPPI-Centre, UCL Social Science Research Institute, University College London, London, UK
7. Gates A, Gates M, DaRosa D, Elliott SA, Pillay J, Rahman S, Vandermeer B, Hartling L. Decoding semi-automated title-abstract screening: findings from a convenience sample of reviews. Syst Rev 2020; 9:272. [PMID: 33243276] [PMCID: PMC7694314] [DOI: 10.1186/s13643-020-01528-x]
Abstract
BACKGROUND We evaluated the benefits and risks of using the Abstrackr machine learning (ML) tool to semi-automate title-abstract screening and explored whether Abstrackr's predictions varied by review or study-level characteristics. METHODS For a convenience sample of 16 reviews for which adequate data were available to address our objectives (11 systematic reviews and 5 rapid reviews), we screened a 200-record training set in Abstrackr and downloaded the relevance (relevant or irrelevant) of the remaining records, as predicted by the tool. We retrospectively simulated the liberal-accelerated screening approach. We estimated the time savings and proportion missed compared with dual independent screening. For reviews with pairwise meta-analyses, we evaluated changes to the pooled effects after removing the missed studies. We explored whether the tool's predictions varied by review and study-level characteristics. RESULTS Using the ML-assisted liberal-accelerated approach, we wrongly excluded 0 to 3 (0 to 14%) records that were included in the final reports, but saved a median (IQR) 26 (9, 42) h of screening time. One missed study was included in eight pairwise meta-analyses in one systematic review. The pooled effect for just one of those meta-analyses changed considerably (from MD (95% CI) -1.53 (-2.92, -0.15) to -1.17 (-2.70, 0.36)). Of 802 records in the final reports, 87% were correctly predicted as relevant. The correctness of the predictions did not differ by review (systematic or rapid, P = 0.37) or intervention type (simple or complex, P = 0.47). The predictions were more often correct in reviews with multiple (89%) vs. single (83%) research questions (P = 0.01), or that included only trials (95%) vs. multiple designs (86%) (P = 0.003). At the study level, trials (91%), mixed methods (100%), and qualitative (93%) studies were more often correctly predicted as relevant compared with observational studies (79%) or reviews (83%) (P = 0.0006).
Studies at high or unclear (88%) vs. low risk of bias (80%) (P = 0.039), and those published more recently (mean (SD) 2008 (7) vs. 2006 (10), P = 0.02) were more often correctly predicted as relevant. CONCLUSION Our screening approach saved time and may be suitable in conditions where the limited risk of missing relevant records is acceptable. Several of our findings are paradoxical and require further study to fully understand the tasks to which ML-assisted screening is best suited. The findings should be interpreted in light of the fact that the protocol was prepared for the funder, but not published a priori. Because we used a convenience sample, the findings may be prone to selection bias. The results may not be generalizable to other samples of reviews, ML tools, or screening approaches. The small number of missed studies across reviews with pairwise meta-analyses hindered strong conclusions about the effect of missed studies on the results and conclusions of systematic reviews.
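One simplified reading of the ML-assisted liberal-accelerated approach described above: a record advances to full-text review if either the single human screener includes it or the tool predicts it relevant, and is excluded only when both rate it irrelevant. The sketch below is our own toy version of that logic, not the study's exact protocol; all field names and records are hypothetical.

```python
def screen(records):
    """Toy ML-assisted liberal-accelerated pass: exclude a record only
    when BOTH the single human screener and the tool's prediction call
    it irrelevant; anything else goes forward to full-text review.
    (A simplified sketch, not the study's exact protocol.)"""
    forward = [r for r in records
               if r["human_include"] or r["ml_predicted_relevant"]]
    excluded = [r for r in records if r not in forward]
    return forward, excluded

records = [
    {"id": 1, "human_include": True,  "ml_predicted_relevant": False, "truly_relevant": True},
    {"id": 2, "human_include": False, "ml_predicted_relevant": True,  "truly_relevant": True},
    {"id": 3, "human_include": False, "ml_predicted_relevant": False, "truly_relevant": False},
    {"id": 4, "human_include": False, "ml_predicted_relevant": False, "truly_relevant": True},
]
forward, excluded = screen(records)
# A relevant record is missed only when the human AND the tool both
# reject it (record 4 here) - the failure mode the study quantifies.
missed = [r for r in excluded if r["truly_relevant"]]
```

The time saving comes from the same property: each record needs only one human read plus a machine prediction, instead of two independent human reads.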
Affiliation(s)
- Allison Gates, Alberta Research Centre for Health Evidence and the University of Alberta Evidence-based Practice Center, Department of Pediatrics, University of Alberta, Edmonton, Alberta, Canada
- Michelle Gates, Alberta Research Centre for Health Evidence and the University of Alberta Evidence-based Practice Center, Department of Pediatrics, University of Alberta, Edmonton, Alberta, Canada
- Daniel DaRosa, Alberta Research Centre for Health Evidence and the University of Alberta Evidence-based Practice Center, Department of Pediatrics, University of Alberta, Edmonton, Alberta, Canada
- Sarah A Elliott, Alberta Research Centre for Health Evidence and the University of Alberta Evidence-based Practice Center, Department of Pediatrics, University of Alberta, Edmonton, Alberta, Canada
- Jennifer Pillay, Alberta Research Centre for Health Evidence and the University of Alberta Evidence-based Practice Center, Department of Pediatrics, University of Alberta, Edmonton, Alberta, Canada
- Sholeh Rahman, Alberta Research Centre for Health Evidence and the University of Alberta Evidence-based Practice Center, Department of Pediatrics, University of Alberta, Edmonton, Alberta, Canada
- Ben Vandermeer, Alberta Research Centre for Health Evidence and the University of Alberta Evidence-based Practice Center, Department of Pediatrics, University of Alberta, Edmonton, Alberta, Canada
- Lisa Hartling, Alberta Research Centre for Health Evidence and the University of Alberta Evidence-based Practice Center, Department of Pediatrics, University of Alberta, Edmonton, Alberta, Canada
8. El Sherif R, Langlois A, Pandu X, Nie JY, Thomas J, Hong QN, Pluye P. Identifying empirical studies for mixed studies reviews: The mixed filter and the automated text classifier. Education for Information 2020. [DOI: 10.3233/efi-190347]
Affiliation(s)
- Reem El Sherif, Department of Family Medicine, McGill University, Montréal, QC, Canada
- Alexis Langlois, Recherche appliquée en linguistique informatique, Université de Montréal, Montréal, QC, Canada
- Xiao Pandu, Recherche appliquée en linguistique informatique, Université de Montréal, Montréal, QC, Canada
- Jian-Yun Nie, Recherche appliquée en linguistique informatique, Université de Montréal, Montréal, QC, Canada
- James Thomas, EPPI-Centre, Department of Social Science, UCL Institute of Education, University College London, London, UK
- Quan Nha Hong, EPPI-Centre, Department of Social Science, UCL Institute of Education, University College London, London, UK
- Pierre Pluye, Department of Family Medicine, McGill University, Montréal, QC, Canada
9. Gates A, Guitard S, Pillay J, Elliott SA, Dyson MP, Newton AS, Hartling L. Performance and usability of machine learning for screening in systematic reviews: a comparative evaluation of three tools. Syst Rev 2019; 8:278. [PMID: 31727150] [PMCID: PMC6857345] [DOI: 10.1186/s13643-019-1222-2]
Abstract
BACKGROUND We explored the performance of three machine learning tools designed to facilitate title and abstract screening in systematic reviews (SRs) when used to (a) eliminate irrelevant records (automated simulation) and (b) complement the work of a single reviewer (semi-automated simulation). We evaluated user experiences for each tool. METHODS We subjected three SRs to two retrospective screening simulations. In each tool (Abstrackr, DistillerSR, RobotAnalyst), we screened a 200-record training set and downloaded the predicted relevance of the remaining records. We calculated the proportion missed and workload and time savings compared to dual independent screening. To test user experiences, eight research staff tried each tool and completed a survey. RESULTS Using Abstrackr, DistillerSR, and RobotAnalyst, respectively, the median (range) proportion missed was 5 (0 to 28) percent, 97 (96 to 100) percent, and 70 (23 to 100) percent for the automated simulation and 1 (0 to 2) percent, 2 (0 to 7) percent, and 2 (0 to 4) percent for the semi-automated simulation. The median (range) workload savings was 90 (82 to 93) percent, 99 (98 to 99) percent, and 85 (85 to 88) percent for the automated simulation and 40 (32 to 43) percent, 49 (48 to 49) percent, and 35 (34 to 38) percent for the semi-automated simulation. The median (range) time savings was 154 (91 to 183), 185 (95 to 201), and 157 (86 to 172) hours for the automated simulation and 61 (42 to 82), 92 (46 to 100), and 64 (37 to 71) hours for the semi-automated simulation. Abstrackr identified 33-90% of records missed by a single reviewer. RobotAnalyst performed less well and DistillerSR provided no relative advantage. User experiences depended on user friendliness, qualities of the user interface, features and functions, trustworthiness, ease and speed of obtaining predictions, and practicality of the export file(s). 
CONCLUSIONS The workload savings afforded in the automated simulation came with increased risk of missing relevant records. Supplementing a single reviewer's decisions with relevance predictions (semi-automated simulation) sometimes reduced the proportion missed, but performance varied by tool and SR. Designing tools based on reviewers' self-identified preferences may improve their compatibility with present workflows. SYSTEMATIC REVIEW REGISTRATION Not applicable.
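The two simulation modes above can be contrasted with a deliberately crude decision-count model. This is our own construction, not the paper's accounting (which also factors in per-record reading time and conflict resolution, so the published percentages differ): dual screening reads every record twice; the "automated" mode discards predicted-irrelevant records and dual-screens the rest; the "semi-automated" mode has one human read everything with the tool as second reviewer.

```python
def screening_workload(n_records, n_predicted_relevant):
    """Rough decision-count model (a sketch, not the paper's accounting).
    dual: every record gets two human reads.
    automated: tool discards predicted-irrelevant; rest dual-screened.
    semi-automated: one human reads everything; tool is reviewer #2."""
    dual = 2 * n_records
    automated = 2 * n_predicted_relevant
    semi = n_records
    return {
        "automated_savings_pct": round(100 * (dual - automated) / dual),
        "semi_automated_savings_pct": round(100 * (dual - semi) / dual),
    }

# Hypothetical review: 10,000 candidate records, 1,000 predicted relevant.
out = screening_workload(10_000, 1_000)
```

Even this toy model reproduces the qualitative pattern in the abstract: automated simulation saves far more work than semi-automated, at the price of trusting the tool's exclusions outright.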
Affiliation(s)
- Allison Gates, Department of Pediatrics, Alberta Research Centre for Health Evidence and the University of Alberta Evidence-based Practice Center, University of Alberta, 11405 87 Ave NW, Edmonton, Alberta, T6G 1C9, Canada
- Samantha Guitard, Department of Pediatrics, Alberta Research Centre for Health Evidence and the University of Alberta Evidence-based Practice Center, University of Alberta, 11405 87 Ave NW, Edmonton, Alberta, T6G 1C9, Canada
- Jennifer Pillay, Department of Pediatrics, Alberta Research Centre for Health Evidence and the University of Alberta Evidence-based Practice Center, University of Alberta, 11405 87 Ave NW, Edmonton, Alberta, T6G 1C9, Canada
- Sarah A Elliott, Department of Pediatrics, Alberta Research Centre for Health Evidence and the University of Alberta Evidence-based Practice Center, University of Alberta, 11405 87 Ave NW, Edmonton, Alberta, T6G 1C9, Canada
- Michele P Dyson, Department of Pediatrics, Alberta Research Centre for Health Evidence and the University of Alberta Evidence-based Practice Center, University of Alberta, 11405 87 Ave NW, Edmonton, Alberta, T6G 1C9, Canada
- Amanda S Newton, Department of Pediatrics, University of Alberta Evidence-based Practice Center, University of Alberta, 11405 87 Ave NW, Edmonton, Alberta, T6G 1C9, Canada
- Lisa Hartling, Department of Pediatrics, Alberta Research Centre for Health Evidence and the University of Alberta Evidence-based Practice Center, University of Alberta, 11405 87 Ave NW, Edmonton, Alberta, T6G 1C9, Canada
10. Norman CR, Leeflang MMG, Porcher R, Névéol A. Measuring the impact of screening automation on meta-analyses of diagnostic test accuracy. Syst Rev 2019; 8:243. [PMID: 31661028] [PMCID: PMC6819363] [DOI: 10.1186/s13643-019-1162-x]
Abstract
BACKGROUND The large and increasing number of new studies published each year is making literature identification in systematic reviews ever more time-consuming and costly. Technological assistance has been suggested as an alternative to conventional, manual study identification to mitigate the cost, but previous literature has mainly evaluated methods in terms of recall (search sensitivity) and workload reduction. There is a need to also evaluate whether screening prioritization methods lead to the same results and conclusions as exhaustive manual screening. In this study, we examined the impact of one screening prioritization method based on active learning on sensitivity and specificity estimates in systematic reviews of diagnostic test accuracy. METHODS We simulated the screening process in 48 Cochrane reviews of diagnostic test accuracy and re-ran 400 meta-analyses based on at least 3 studies. We compared screening prioritization (with technological assistance) and screening in randomized order (standard practice without technological assistance). We examined whether the screening could have been stopped before identifying all relevant studies while still producing reliable summary estimates. For all meta-analyses, we also examined the relationship between the number of relevant studies and the reliability of the final estimates. RESULTS The main meta-analysis in each systematic review could have been performed after screening an average of 30% of the candidate articles (range 0.07 to 100%). No systematic review would have required screening more than 2,308 studies, whereas manual screening would have required screening up to 43,363 studies. Despite an average 70% recall, the estimation error would have been 1.3% on average, compared with the average 2% estimation error expected when replicating summary estimate calculations.
CONCLUSION Screening prioritization coupled with stopping criteria in diagnostic test accuracy reviews can reliably detect when the screening process has identified enough studies to perform the main meta-analysis with an accuracy within pre-specified tolerance limits. However, many of the systematic reviews did not identify enough studies for their meta-analyses to be accurate within a 2% limit even with exhaustive manual screening, i.e., under current practice.
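The core mechanic of screening prioritization is simple: rank the candidate records by a model score, screen them in descending order, and stop when a criterion fires. The sketch below uses a common heuristic criterion (stop after a run of consecutive irrelevant records); it is our own illustration, and the paper's actual criterion is tied to the stability of the meta-analysis estimates rather than to a streak count.

```python
def prioritized_screen(candidates, stop_after=50):
    """Screening prioritization sketch: read candidates in descending
    model-score order and stop once `stop_after` consecutive records
    are irrelevant. (A heuristic stopping rule for illustration; the
    paper ties stopping to meta-analysis estimate stability instead.)"""
    ranked = sorted(candidates, key=lambda c: c["score"], reverse=True)
    included, streak, screened = [], 0, 0
    for c in ranked:
        screened += 1
        if c["relevant"]:
            included.append(c)
            streak = 0
        else:
            streak += 1
            if streak >= stop_after:
                break
    return included, screened

# Hypothetical pool: relevant records cluster at high model scores.
candidates = [{"score": s, "relevant": s in (10, 9, 7)}
              for s in range(10, 0, -1)]
included, screened = prioritized_screen(candidates, stop_after=3)
```

With a well-calibrated model, the relevant records surface early and the stopping rule truncates the long irrelevant tail, which is exactly where the reported workload savings come from.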
Affiliation(s)
- Christopher R Norman, LIMSI, CNRS, Université Paris Saclay, Rue du Belvedère, Orsay, 91405, France; Amsterdam Public Health, Amsterdam UMC, University of Amsterdam, Meibergdreef 9, Amsterdam, 1105 AZ, the Netherlands
- Mariska MG Leeflang, Amsterdam Public Health, Amsterdam UMC, University of Amsterdam, Meibergdreef 9, Amsterdam, 1105 AZ, the Netherlands
- Raphaël Porcher, Center for Clinical Epidemiology, Assistance Publique–Hôpitaux de Paris, Hôtel Dieu Hospital; Team METHODS, CRESS, INSERM U1153; University Paris Descartes, 1 place du Parvis Notre-Dame, Paris, 75004, France
- Aurélie Névéol, LIMSI, CNRS, Université Paris Saclay, Rue du Belvedère, Orsay, 91405, France
11. O'Connor AM, Tsafnat G, Thomas J, Glasziou P, Gilbert SB, Hutton B. A question of trust: can we build an evidence base to gain trust in systematic review automation technologies? Syst Rev 2019; 8:143. [PMID: 31215463] [PMCID: PMC6582554] [DOI: 10.1186/s13643-019-1062-0]
Abstract
BACKGROUND Although many aspects of systematic reviews use computational tools, systematic reviewers have been reluctant to adopt machine learning tools. DISCUSSION We argue that the reasons for the slow adoption of machine learning tools into systematic reviews are multifactorial. We focus on the current absence of trust in automation and on set-up challenges as major barriers to adoption. It is important that reviews produced using automation tools are considered non-inferior or superior to current practice. However, this standard alone is unlikely to lead to widespread adoption. As with many technologies, it is important that reviewers see "others" in the review community using automation tools. Adoption will also be slow if the automation tools are not compatible with the workflows and tasks currently used to produce reviews. Many automation tools being developed for systematic reviews address classification problems; the evidence that these tools are non-inferior or superior can therefore be presented using methods similar to diagnostic test evaluations, i.e., precision and recall compared with a human reviewer. However, the assessment of automation tools presents unique challenges for investigators and systematic reviewers, including the need to clarify which metrics are of interest to the systematic review community and the documentation challenges peculiar to reproducible software experiments. CONCLUSION We discuss adoption barriers with the goal of giving tool developers guidance on how to design and report such evaluations, and end users a basis for assessing their validity. Further, we discuss approaches to formatting and announcing publicly available datasets suitable for the assessment of automation technologies and tools. Making these resources available will increase trust that tools are non-inferior or superior to current practice. Finally, we identify that, even with evidence that automation tools are non-inferior or superior to current practice, substantial set-up challenges remain for mainstream integration of automation into the systematic review process.
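The diagnostic-test framing in this abstract, precision and recall measured against a human reviewer as the reference standard, can be made concrete with a toy sketch. The `precision_recall` helper and the record IDs below are illustrative assumptions, not code or data from the paper:

```python
def precision_recall(machine_included, human_included):
    """Precision and recall of a screening tool, treating the human
    reviewer's inclusion decisions as the reference standard."""
    machine, human = set(machine_included), set(human_included)
    tp = len(machine & human)  # records both the tool and the human included
    precision = tp / len(machine) if machine else 1.0
    recall = tp / len(human) if human else 1.0
    return precision, recall

# The tool includes 4 records; the human included 5, of which 3 overlap.
p, r = precision_recall({1, 2, 3, 9}, {1, 2, 3, 4, 5})
print(p, r)  # → 0.75 0.6
```

A screening tool would normally be tuned to favour recall over precision, since a missed relevant study (false negative) costs a review more than an extra record to screen.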
Affiliation(s)
- Guy Tsafnat
- Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
- Brian Hutton
- Knowledge Synthesis Unit, Ottawa Hospital Research Institute, Ottawa, ON K1H 8L6, Canada
|
12
|
van Altena AJ, Spijker R, Olabarriaga SD. Usage of automation tools in systematic reviews. Res Synth Methods 2019; 10:72-82. [PMID: 30561081 DOI: 10.1002/jrsm.1335] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/24/2018] [Revised: 09/28/2018] [Accepted: 12/11/2018] [Indexed: 12/12/2022]
Abstract
Systematic reviews are a cornerstone of today's evidence-informed decision making. With the rapid expansion of questions to be addressed and scientific information produced, there is a growing workload on reviewers, making current practice unsustainable without the aid of automation tools. While many automation tools have been developed and are available, uptake seems to be lagging. We therefore set out to investigate the current level of uptake and the potential barriers to, and facilitators of, the adoption of automation tools in systematic reviews. We deployed surveys among systematic reviewers that gathered information on tool uptake, demographics, systematic review characteristics, and barriers and facilitators for uptake. Systematic reviewers from multiple domains were targeted during recruitment; however, respondents were predominantly from the biomedical sciences. We found that automation tools are not currently widely used among the participants. When tools are used, participants mostly learn about them from their environment, for example through colleagues, peers, or their organisation. Tools are often chosen on the basis of user experience, either their own or that of colleagues and peers. Lastly, licensing, a steep learning curve, lack of support, and mismatch to workflow are often reported by participants as relevant barriers. While conclusions can only be drawn for the biomedical field, our work provides evidence that confirms the conclusions and recommendations of previous work, which was based on expert opinions. Furthermore, our study highlights the importance that organisations and best practices in a field can have for the uptake of automation tools for systematic reviews.
Affiliation(s)
- A J van Altena
- Department of Epidemiology, Biostatistics, and Bioinformatics, Amsterdam Public Health, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- R Spijker
- Medical Library, Amsterdam Public Health, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands; Cochrane Netherlands, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- S D Olabarriaga
- Department of Epidemiology, Biostatistics, and Bioinformatics, Amsterdam Public Health, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
|
13
|
Abstract
Systematic review is a type of literature review designed to synthesize all available evidence on a given question. Systematic reviews require significant time and effort, which has led to the continuing development of computer support. This paper seeks to identify the gaps and opportunities for computer support. By interviewing experienced systematic reviewers from diverse fields, we identify the technical problems and challenges reviewers face in conducting a systematic review and their current uses of computer support. We propose potential research directions for how computer support could help to speed the systematic review process while retaining or improving review quality.
|
14
|
Michie S, Johnston M. Optimising the value of the evidence generated in implementation science: the use of ontologies to address the challenges. Implement Sci 2017; 12:131. [PMID: 29137660 PMCID: PMC5686802 DOI: 10.1186/s13012-017-0660-2] [Citation(s) in RCA: 65] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2017] [Accepted: 10/30/2017] [Indexed: 12/12/2022] Open
Abstract
Implementing research findings into healthcare practice and policy is a complex process occurring in diverse contexts; it invariably depends on changing human behaviour in many parts of an intricate implementation system. Questions asked with the aim of improving implementation are multifarious variants of 'What works, compared with what, how well, with what exposure, with what behaviours (for how long), for whom, in what setting and why?'. Relevant evidence is being published at a high rate, but its quantity, complexity and lack of shared terminologies present challenges. The achievement of efficient, effective and timely synthesis of evidence is facilitated by using 'ontologies' to systematically structure and organise the evidence about constructs and their relationships, using a controlled, well-defined vocabulary.
Affiliation(s)
- Susan Michie
- Centre for Behaviour Change, University College London, 1-19 Torrington Place, London, WC1E 7HB, UK
- Marie Johnston
- Aberdeen Health Psychology Group, University of Aberdeen, Aberdeen, UK
|
15
|
Stansfield C, O'Mara-Eves A, Thomas J. Text mining for search term development in systematic reviewing: A discussion of some methods and challenges. Res Synth Methods 2017; 8:355-365. [PMID: 28660680 DOI: 10.1002/jrsm.1250] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2016] [Revised: 03/08/2017] [Accepted: 05/14/2017] [Indexed: 11/10/2022]
Abstract
Using text mining to aid the development of database search strings for topics described by diverse terminology has potential benefits for systematic reviews; however, methods and tools for accomplishing this are poorly covered in the research methods literature. We briefly review the literature on applications of text mining for search term development for systematic reviewing. We found that the tools can be used in 5 overarching ways: improving the precision of searches; identifying search terms to improve search sensitivity; aiding the translation of search strategies across databases; searching and screening within an integrated system; and developing objectively derived search strategies. Using a case study and selected examples, we then reflect on the utility of certain technologies (term frequency-inverse document frequency and Termine, term frequency, and clustering) in improving the precision and sensitivity of searches. Challenges in using these tools are discussed. The utility of these tools is influenced by the different capabilities of the tools, the way the tools are used, and the text that is analysed. Increased awareness of how the tools perform facilitates the further development of methods for their use in systematic reviews.
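The term frequency-inverse document frequency weighting discussed in this abstract can be sketched in a few lines. This is a toy single-word ranker on made-up documents, not the Termine pipeline or any tool evaluated in the paper:

```python
import math
import re
from collections import Counter

def tfidf_terms(documents, top_n=5):
    """Rank single-word terms by TF-IDF: terms frequent in one document
    but rare across the corpus score highest, flagging them as
    candidate search terms."""
    tokenised = [re.findall(r"[a-z]+", d.lower()) for d in documents]
    n_docs = len(tokenised)
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for doc in tokenised for term in set(doc))
    scores = Counter()
    for doc in tokenised:
        tf = Counter(doc)
        for term, count in tf.items():
            # Keep each term's best tf-idf score across the corpus.
            scores[term] = max(scores[term], count * math.log(n_docs / df[term]))
    return [term for term, _ in scores.most_common(top_n)]

docs = [
    "automated screening reduces screening workload in reviews",
    "search strategies for reviews",
    "reviews of health policy",
]
print(tfidf_terms(docs, top_n=1))  # → ['screening']
```

Here "reviews" appears in every document and so scores zero, while "screening" is frequent in one document and absent elsewhere, making it the strongest candidate search term.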
Affiliation(s)
- Claire Stansfield
- Evidence for Policy and Practice Information Co-ordinating (EPPI-) Centre, Social Science Research Unit, UCL Institute of Education, University College London, London, UK
- Alison O'Mara-Eves
- Evidence for Policy and Practice Information Co-ordinating (EPPI-) Centre, Social Science Research Unit, UCL Institute of Education, University College London, London, UK
- James Thomas
- Evidence for Policy and Practice Information Co-ordinating (EPPI-) Centre, Social Science Research Unit, UCL Institute of Education, University College London, London, UK
|
16
|
Evidence & Gap Maps: A tool for promoting evidence informed policy and strategic research agendas. J Clin Epidemiol 2016; 79:120-129. [DOI: 10.1016/j.jclinepi.2016.05.015] [Citation(s) in RCA: 108] [Impact Index Per Article: 13.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2015] [Revised: 04/26/2016] [Accepted: 05/06/2016] [Indexed: 11/23/2022]
|
17
|
Literature Review of the National CLAS Standards: Policy and Practical Implications in Reducing Health Disparities. J Racial Ethn Health Disparities 2016; 4:632-647. [PMID: 27444488 DOI: 10.1007/s40615-016-0267-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/24/2016] [Revised: 07/01/2016] [Accepted: 07/05/2016] [Indexed: 10/21/2022]
Abstract
The National Standards for Culturally and Linguistically Appropriate Services (CLAS) in Health and Health Care are a practical tool for health and health care organizations to improve their provision of culturally and linguistically appropriate services (CLAS). Published by the Office of Minority Health at the U.S. Department of Health and Human Services, the National CLAS Standards provide health and health care organizations with a set of action steps for better meeting the needs of individuals from culturally and linguistically diverse backgrounds. Few studies have examined the concept of CLAS or the National CLAS Standards, and neither has been extensively reviewed. The authors conducted three literature searches between February 2014 and May 2015, examining the organizational challenges, applicability, and policy implications related to the National CLAS Standards or CLAS, and selected 55 articles for inclusion in the review. The literature highlights a number of challenges in implementing the National CLAS Standards and/or providing CLAS, including issues related to communication within health care organizations and the inconsistency of accountability measures. This literature review contributes to the growing knowledge base on the National CLAS Standards and CLAS in health and health care.
|
18
|
Marshall IJ, Kuiper J, Wallace BC. RobotReviewer: evaluation of a system for automatically assessing bias in clinical trials. J Am Med Inform Assoc 2016; 23:193-201. [PMID: 26104742 PMCID: PMC4713900 DOI: 10.1093/jamia/ocv044] [Citation(s) in RCA: 123] [Impact Index Per Article: 15.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/03/2014] [Revised: 04/16/2015] [Accepted: 04/18/2015] [Indexed: 11/13/2022] Open
Abstract
OBJECTIVE To develop and evaluate RobotReviewer, a machine learning (ML) system that automatically assesses bias in clinical trials. From a (PDF-formatted) trial report, the system should determine risks of bias for the domains defined by the Cochrane Risk of Bias (RoB) tool, and extract supporting text for these judgments. METHODS We algorithmically annotated 12,808 trial PDFs using data from the Cochrane Database of Systematic Reviews (CDSR). Trials were labeled as being at low or high/unclear risk of bias for each domain, and sentences were labeled as being informative or not. This dataset was used to train a multi-task ML model. We estimated the accuracy of ML judgments versus humans by comparing trials with two or more independent RoB assessments in the CDSR. Twenty blinded, experienced reviewers rated the relevance of supporting text, comparing ML output with equivalent (human-extracted) text from the CDSR. RESULTS When retrieving the top 3 candidate sentences per document (top-3 recall), the best ML text was rated more relevant than text from the CDSR, but not significantly so (60.4% of ML text rated 'highly relevant' vs 56.5% of text from reviews; difference +3.9%, [-3.2% to +10.9%]). Model RoB judgments were less accurate than those from published reviews, though the difference was <10% (overall accuracy 71.0% with ML vs 78.3% with CDSR). CONCLUSION Risk of bias assessment may be automated with reasonable accuracy. Automatically identified supporting text is of comparable quality to the manually identified text in the CDSR. This technology could substantially reduce reviewer workload and expedite evidence syntheses.
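The top-3 retrieval set-up described in this abstract, judging the k best candidate sentences rather than a single best one, can be sketched as follows. The cue-word scorer is a deliberately crude stand-in invented for illustration, not RobotReviewer's trained model:

```python
def top_k_sentences(sentences, score, k=3):
    """Return the k highest-scoring sentences as candidate supporting
    text for a risk-of-bias judgment."""
    # sorted() is stable, so equally scored sentences keep document order.
    return sorted(sentences, key=score, reverse=True)[:k]

# Toy score: sentences mentioning risk-of-bias cue words rank higher.
cues = ("random", "blind", "allocat", "conceal")
score = lambda s: sum(c in s.lower() for c in cues)

sents = [
    "Patients were randomly allocated using sealed envelopes.",
    "The trial enrolled 120 patients.",
    "Outcome assessors were blinded to allocation.",
    "Follow-up lasted 12 months.",
]
print(top_k_sentences(sents, score, k=2))
# → the randomisation and blinding sentences, in document order
```

Retrieving several candidates rather than one is what makes the "top-3 recall" comparison in the RESULTS section possible: the reviewer judges whether any of the retrieved candidates is relevant.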
Affiliation(s)
- Iain J Marshall
- Department of Primary Care and Public Health Sciences, King's College London, UK
- Joël Kuiper
- University Medical Center, University of Groningen, Groningen, The Netherlands
- Byron C Wallace
- School of Information, University of Texas at Austin, Austin, Texas, USA
|
19
|
O’Mara-Eves A, Thomas J, McNaught J, Miwa M, Ananiadou S. Using text mining for study identification in systematic reviews: a systematic review of current approaches. Syst Rev 2015; 4:5. [PMID: 25588314 PMCID: PMC4320539 DOI: 10.1186/2046-4053-4-5] [Citation(s) in RCA: 262] [Impact Index Per Article: 29.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/07/2014] [Accepted: 12/10/2014] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND The large and growing number of published studies, and their increasing rate of publication, makes the task of identifying relevant studies in an unbiased way for inclusion in systematic reviews both complex and time consuming. Text mining has been offered as a potential solution: by automating some of the screening process, reviewer time can be saved. The evidence base around the use of text mining for screening has not yet been pulled together systematically; this systematic review fills that research gap. Focusing mainly on non-technical issues, the review aims to increase awareness of the potential of these technologies and promote further collaborative research between the computer science and systematic review communities. METHODS Five research questions led our review: what is the state of the evidence base; how has workload reduction been evaluated; what are the purposes of semi-automation and how effective are they; how have key contextual problems of applying text mining to the systematic review field been addressed; and what challenges to implementation have emerged? We answered these questions using standard systematic review methods: systematic and exhaustive searching, quality-assured data extraction and a narrative synthesis to synthesise findings. RESULTS The evidence base is active and diverse; there is almost no replication between studies or collaboration between research teams and, whilst it is difficult to establish any overall conclusions about best approaches, it is clear that efficiencies and reductions in workload are potentially achievable. On the whole, most studies suggested that a workload saving of between 30% and 70% might be possible, though the saving is sometimes accompanied by the loss of 5% of relevant studies (i.e. 95% recall). CONCLUSIONS Using text mining to prioritise the order in which items are screened should be considered safe and ready for use in 'live' reviews. The use of text mining as a 'second screener' may also be used cautiously. The use of text mining to eliminate studies automatically should be considered promising, but not yet fully proven. In highly technical/clinical areas, it may be used with a high degree of confidence; but more developmental and evaluative work is needed in other disciplines.
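The recall/workload trade-off reported in this abstract (e.g. 95% recall at a 30-70% workload saving) can be made concrete with a small sketch. The ranking and the `recall_at_workload` helper below are illustrative assumptions, not data from the review:

```python
def recall_at_workload(ranked_relevance, screened_fraction):
    """Recall achieved if manual screening stops after a fraction of a
    machine-prioritised ranking (True = relevant record)."""
    n = len(ranked_relevance)
    cutoff = round(n * screened_fraction)  # records actually screened
    found = sum(ranked_relevance[:cutoff])
    total = sum(ranked_relevance)
    return found / total if total else 1.0

# A model that ranks relevant records early: 10 relevant out of 100,
# 9 of them placed in the first 30% of the ranking.
ranking = [True] * 9 + [False] * 21 + [True] + [False] * 69
print(recall_at_workload(ranking, 0.30))  # → 0.9
```

Screening only the top 30% of this ranking finds 9 of the 10 relevant records: a 70% workload saving at 90% recall, the kind of trade-off the review quantifies across studies.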
Affiliation(s)
- Alison O’Mara-Eves
- Evidence for Policy and Practice Information and Coordinating (EPPI)-Centre, Social Science Research Unit, UCL Institute of Education, University of London, London, UK
- James Thomas
- Evidence for Policy and Practice Information and Coordinating (EPPI)-Centre, Social Science Research Unit, UCL Institute of Education, University of London, London, UK
- John McNaught
- The National Centre for Text Mining and School of Computer Science, Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester, M1 7DN, UK
- Makoto Miwa
- Toyota Technological Institute, 2-12-1 Hisakata, Tempaku-ku, Nagoya, 468-8511, Japan
- Sophia Ananiadou
- The National Centre for Text Mining and School of Computer Science, Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester, M1 7DN, UK
|