1
A systematic review of the incidence and outcomes of ICD-11 defined stroke. J Stroke Cerebrovasc Dis 2024:107784. [PMID: 38795795 DOI: 10.1016/j.jstrokecerebrovasdis.2024.107784] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Revised: 05/18/2024] [Accepted: 05/20/2024] [Indexed: 05/28/2024]
Abstract
BACKGROUND The World Health Organisation has expanded the definition of stroke to include people with symptoms lasting less than 24 hours if they have evidence of stroke on neuroimaging. As a result, people previously diagnosed as having a transient ischaemic attack (TIA) would now be considered to have had a stroke. This change will affect the incidence and outcomes of stroke and increase eligibility for secondary prevention. We aimed to apply the new ICD-11 criteria retrospectively to previous TIA studies to understand the change in incidence and outcomes of this type of stroke. METHODS We conducted a systematic review of observational studies of the incidence and outcomes of clinically defined TIA. We searched PubMed, EMBASE, and Google Scholar from inception to 23rd May 2023. Study quality was assessed using a risk of bias tool for prevalence studies. FINDINGS Our review included 25 studies. The rate of scan positivity for stroke among those with clinically defined TIA was 24% (95% CI, 16% - 33%), but with high heterogeneity (I²=100%, p = 0). Sensitivity analyses provided evidence that heterogeneity could be explained by methodology and recruitment method. The scan positive rate when examining only studies at low risk of bias was substantially lower, at 13% (95% CI, 11 - 15%, I²=0, p = 0.77). We estimate from population-based incidence studies that ICD-11 would result in an increase in stroke incidence of between 4.8 and 10.5 per 100,000 persons/year. Of those with DWI-MRI evidence of stroke, 6% (95% CI, 3 - 11%) developed a recurrent stroke in the subsequent 90 days, but with substantial heterogeneity (I²=67%, p = 0.02). CONCLUSION The impact of the ICD-11 change in stroke definition on incidence and outcomes may have been overestimated by individual studies. Community-based stroke services with access to DWI MRI are likely to accurately diagnose greater numbers of people with mild ICD-11 stroke, increasing access to effective prevention.
2
Natural history of depression up to 18 years after stroke: a population-based South London Stroke Register study. The Lancet Regional Health. Europe 2024; 40:100882. [PMID: 38745986 PMCID: PMC11092885 DOI: 10.1016/j.lanepe.2024.100882] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Revised: 02/22/2024] [Accepted: 02/23/2024] [Indexed: 05/16/2024]
Abstract
Background Current evidence on the long-term natural history of post-stroke depression (PSD) is limited. We aim to determine the prevalence, incidence, duration and recurrence rates of depression to 18-years after stroke and assess differences by onset-time and depression severity. Methods Data were from the South London Stroke Register (1995-2019, N = 6641 at registration). Depression was defined using the Hospital Anxiety and Depression scale (scores > 7 = depression) at 3-months, then annually to 18-years after stroke. We compared early- (3-months post-stroke) vs late-onset depression (1-year) and initial mild (HADS scores > 7) vs severe depression (scores > 10). Findings 3864 patients were assessed for depression at any time-points during the follow-up (male:55.4% (2141), median age: 68.0 (20.4)), with the number ranging from 2293 at 1-year to 145 at 18-years after stroke. Prevalence of PSD ranged from 31.3% (28.9-33.8) to 41.5% (33.6-49.3). The cumulative incidence of depression was 59.4% (95% CI 57.8-60.9), of which 87.9% (86.5-89.2) occurred within 5-years after stroke. Of patients with incident PSD at 3-months after stroke, 46.6% (42.1-51.2) recovered after 1 year. Among those recovered, 66.7% (58.0-74.5) experienced recurrent depression and 94.4% (87.5-98.2) of recurrences occurred within 5-years since recovery. Similar estimates were observed in patients with PSD at 1-year. 34.3% (27.9-41.1) of patients with severe depression had recovered at the next time-point, compared to 56.7% (50.5-62.8) with mild depression. Recurrence rate at 1-year after recovery was higher in patients with severe depression (52.9% (35.1-70.2)) compared to mild depression (23.5% (14.1-35.4)) (difference: 29.4% (7.6-51.2), p = 0.003). Interpretation Long-term depressive status may be established by 5-years post-onset. Early- and late-onset depression presented similar natural history, while severe depression had a longer duration and quicker recurrence than mild depression. 
These estimates were limited to surviving patients who completed the depression assessment, who tended to have less severe stroke than excluded patients, so they may be underestimates and may not generalise to all stroke survivors. Funding National Institute for Health and Care Research (NIHR202339).
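The severity grouping described in this abstract can be illustrated with a short sketch. It assumes only the cut-offs stated above (HADS score > 7 = depression, > 10 = severe) and the standard 0 to 21 range of the HADS depression subscale; the function name `classify_hads` is hypothetical and is not taken from the register's own code.

```python
def classify_hads(score: int) -> str:
    """Map a HADS depression subscale score to the abstract's categories.

    Cut-offs follow the abstract: > 7 counts as depression, > 10 as severe.
    The function name and return labels are illustrative assumptions.
    """
    if not 0 <= score <= 21:
        raise ValueError("HADS subscale scores range from 0 to 21")
    if score > 10:
        return "severe"
    if score > 7:
        return "mild"
    return "none"
```

Applied to each annual assessment, a classifier of this kind would give the per-time-point status from which recovery and recurrence can be counted.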
3
Question answering systems for health professionals at the point of care-a systematic review. J Am Med Inform Assoc 2024; 31:1009-1024. [PMID: 38366879 PMCID: PMC10990539 DOI: 10.1093/jamia/ocae015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 01/11/2024] [Accepted: 01/15/2024] [Indexed: 02/18/2024]
Abstract
OBJECTIVES Question answering (QA) systems have the potential to improve the quality of clinical care by providing health professionals with the latest and most relevant evidence. However, QA systems have not been widely adopted. This systematic review aims to characterize current medical QA systems, assess their suitability for healthcare, and identify areas of improvement. MATERIALS AND METHODS We searched PubMed, IEEE Xplore, ACM Digital Library, ACL Anthology, and forward and backward citations on February 7, 2023. We included peer-reviewed journal and conference papers describing the design and evaluation of biomedical QA systems. Two reviewers screened titles, abstracts, and full-text articles. We conducted a narrative synthesis and risk of bias assessment for each study. We assessed the utility of biomedical QA systems. RESULTS We included 79 studies and identified themes, including question realism, answer reliability, answer utility, clinical specialism, systems, usability, and evaluation methods. Clinicians' questions used to train and evaluate QA systems were restricted to certain sources, types and complexity levels. No system communicated confidence levels in the answers or sources. Many studies suffered from high risks of bias and applicability concerns. Only 8 studies completely satisfied any criterion for clinical utility, and only 7 reported user evaluations. Most systems were built with limited input from clinicians. DISCUSSION While machine learning methods have led to increased accuracy, most studies imperfectly reflected real-world healthcare information needs. Key research priorities include developing more realistic healthcare QA datasets and considering the reliability of answer sources, rather than merely focusing on accuracy.
4
The feasibility, repeatability, validity and responsiveness of the EQ-5D-3L in Krio for patients with stroke in Sierra Leone. Health Qual Life Outcomes 2024; 22:29. [PMID: 38549069 PMCID: PMC10976786 DOI: 10.1186/s12955-024-02246-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2024] [Accepted: 03/22/2024] [Indexed: 04/02/2024]
Abstract
OBJECTIVES To assess the feasibility, repeatability, validity and responsiveness of the EQ-5D-3L in Krio for patients with stroke in Sierra Leone, the first psychometric assessment of the EQ-5D-3L to be conducted in patients with stroke in sub-Saharan Africa. METHODS A prospective stroke register at two tertiary government hospitals recruited all patients meeting the WHO definition of stroke and followed patients up at seven days, 90 days and one year post stroke. The newly translated EQ-5D-3L, Barthel Index (BI), modified Rankin Scale (mRS) and National Institutes of Health Stroke Scale (NIHSS), a measure of stroke severity, were collected by trained researchers, face to face during admission and via phone at follow up. Feasibility was assessed by completion rate and proportion of floor/ceiling effects. Internal consistency was assessed by inter item correlations (IIC) and Cronbach's alpha. Repeatability of the EQ-5D-3L was examined using test-retest: EQ-5D-3L utility scores at 90 days were compared with EQ-5D-3L utility scores at one year in the same individuals whose Barthel Index had remained within the minimal clinically important difference. Known group validity was assessed by stroke severity. Convergent validity was assessed against the BI, using Spearman's rho. Responsiveness was assessed in patients whose BI improved or deteriorated from seven to 90 days. Sensitivity analyses were conducted using the UK and Zimbabwe value sets, to evaluate the effect of value set; in a subgroup of patients with no formal education, to evaluate the influence of patient educational attainment; and using the mRS instead of the BI, to evaluate the influence of utilising an alternative functional scale. RESULTS The EQ-5D-3L was completed in 373/460 (81.1%), 360/367 (98.1%) and 299/308 (97.1%) eligible patients at seven days, 90 days and one year post stroke. Missing item data was low overall, but was highest in the anxiety/depression dimension, at 1.3% (5/373). 
Alpha was 0.81, 0.88 and 0.86 at seven days, 90 days and one year post stroke and IIC were within pre-specified ranges. Repeatability of the EQ-5D-3L was moderate to poor, weighted Kappa 0.23-0.49. EQ-5D-3L utility was significantly associated with stroke severity at all timepoints. Convergent validity with BI was strong overall and for shared subscales. EQ-5D-3L was moderately responsive to both improvement Cohen's D 0.55 (95% CI:0.15-0.94) and deterioration 0.92 (95% CI:0.29-1.55). Completion rates were similar in patients with no formal education 148/185 (80.0%) vs those with any formal education 225/275 (81.8%), and known group validity for stroke severity in patients with no formal education was strong. Using the Zimbabwe value set instead of the UK value set, and using the mRS instead of the BI did not change the direction or significance of results. CONCLUSIONS The EQ-5D-3L for stroke in Sierra Leone was feasible, and responsive including in patients with no formal education. However, repeatability was moderate to poor, which may be due to the study design, but should add a degree of caution in the analysis of repeated measures of EQ-5D-3L over time in this population. Known group validity and convergent validity with BI and mRS were strong. Further research should assess the EQ-5D in the general population, examine test-retest reliability over a shorter time period and assess the acceptability and validity of the anxiety/depression dimension against other validated mental health instruments. Development of an EQ-5D value set for West Africa should be a research priority.
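The internal-consistency statistic named in this abstract, Cronbach's alpha, can be sketched as below. This is a minimal illustration of the standard formula, alpha = k/(k-1) × (1 − Σ item variances / variance of total scores), and not the study's analysis code; the function name and the small response matrix are made-up assumptions.

```python
from statistics import variance


def cronbach_alpha(items: list[list[float]]) -> float:
    """Cronbach's alpha, where items[i][j] is respondent j's score on item i."""
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Total score per respondent, summed across all items.
    totals = [sum(scores) for scores in zip(*items)]
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical data: five EQ-5D-3L dimensions (levels 1-3), four respondents.
example_items = [
    [1, 2, 3, 2],
    [1, 2, 3, 3],
    [1, 1, 3, 2],
    [2, 2, 3, 2],
    [1, 2, 2, 2],
]
example_alpha = cronbach_alpha(example_items)
```

Values closer to 1 indicate that the items vary together across respondents, which is the sense in which the reported alphas of 0.81 to 0.88 describe internal consistency.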
5
Cohort Profile: The Stroke in Sierra Leone (SISLE) Register. Int J Epidemiol 2023; 52:e308-e314. [PMID: 37555838 PMCID: PMC10749756 DOI: 10.1093/ije/dyad112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Accepted: 07/30/2023] [Indexed: 08/10/2023]
6
Trajectories of depressive symptoms 10 years after stroke and associated risk factors: a prospective cohort study. Lancet 2023; 402 Suppl 1:S64. [PMID: 37997108 DOI: 10.1016/s0140-6736(23)02111-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Revised: 08/18/2023] [Accepted: 09/22/2023] [Indexed: 11/25/2023]
Abstract
BACKGROUND Previous studies have investigated the risk factors for post-stroke depression at only one timepoint, neglecting its dynamic nature. We aimed to identify trajectories of post-stroke depression from multiple assessments and explore their risk factors. METHODS We did a population-based cohort study with the South London Stroke Register (1995-2019). All stroke patients with three or more measurements of the Hospital Anxiety and Depression Scale were included. We identified trajectories of post-stroke depression over a 10-year follow-up using group-based trajectory modelling. We determined the optimal number and shape of trajectories based on the lowest Bayesian information criterion, average posterior probability of assignment of each group over 0·70, and inclusion of at least 5% of participants within each group. We used multinomial logistic regression adjusted for age, sex, ethnicity, comorbidity, physical disability, stroke severity, history of depression and cognitive impairment to explore associations with different trajectories. FINDINGS The analysis comprised 1968 participants (mean age 64·9 years [SD 13·8], 56·6% male and 43·4% female, 65·1% white ethnicity, 30·7% severe disability and 32·7% severe stroke). We identified four patterns of symptoms: no depressive symptoms (14·1%, n=277), low symptoms (41·7%, n=820), moderate symptoms and symptoms worsening early and then improving (34·6%, n=681), and high and increasing symptoms (9·7%, n=190). Compared with no depressive symptom trajectory, patients with severe disability, severe stroke, pre-stroke depression, and cognitive impairment were more likely to be in the moderate and high symptom groups (adjusted odds ratios [ORs] 2·26 [95% CI 1·56-3·28], 1·75 [1·19-2·57], 2·20 [1·02-4·74], and 2·04 [1·25-3·32], respectively). Female sex was associated with high depression (OR 1·65 [1·13-2·41]), while older age (≥65 years) was associated with moderate depression (OR 1·82 [1·36-2·45]). 
In men, the ORs for patients with severe disability, severe stroke, pre-stroke depression, and cognitive impairment being in the high depression group were 1·91 (1·01-3·60), 2·41 (1·26-4·60), 2·57 (0·84-7·88), and 2·68 (1·28-5·60), respectively. In women, the ORs were 1·08 (0·52-2·23), 1·30 (0·60-2·79), 19·2 (2·35-156·05), and 3·80 (1·44-10·01), respectively. INTERPRETATION Female sex and older age were associated with distinct courses of depressive symptoms. In men, high depressive symptom trajectory was associated with severe stroke and severe disability, which was not the case in women. These findings were limited to patients with three or more assessments, who tended to have less severe disabilities than excluded patients and might not generalise to all stroke survivors. FUNDING National Institute for Health and Care Research (NIHR).
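The model-selection rule described in the methods (lowest Bayesian information criterion, average posterior probability above 0.70 in every group, and at least 5% of participants in each group) can be sketched as follows. Only the three criteria and the four-group counts come from the abstract; the class, the function names, and all BIC and posterior-probability values are hypothetical. BIC sign conventions also differ between software packages, so "lowest" is taken here exactly as the abstract words it.

```python
from dataclasses import dataclass


@dataclass
class CandidateModel:
    n_groups: int
    bic: float                  # hypothetical values, for illustration only
    avg_posterior: list[float]  # average posterior probability per group
    group_share: list[float]    # proportion of participants per group


def is_admissible(model: CandidateModel,
                  min_posterior: float = 0.70,
                  min_share: float = 0.05) -> bool:
    """Apply the abstract's two adequacy thresholds to every group."""
    return (all(p > min_posterior for p in model.avg_posterior)
            and all(s >= min_share for s in model.group_share))


def select_model(candidates: list[CandidateModel]) -> CandidateModel:
    """Return the admissible candidate with the lowest BIC."""
    admissible = [m for m in candidates if is_admissible(m)]
    if not admissible:
        raise ValueError("no candidate meets the adequacy thresholds")
    return min(admissible, key=lambda m: m.bic)


n = 1968  # participants analysed, from the abstract
candidates = [
    CandidateModel(3, 21050.0, [0.85, 0.80, 0.78], [0.30, 0.45, 0.25]),
    # Group shares below use the counts reported in the abstract.
    CandidateModel(4, 20980.0, [0.82, 0.79, 0.76, 0.81],
                   [277 / n, 820 / n, 681 / n, 190 / n]),
    # Lowest BIC, but one group holds under 5% of participants.
    CandidateModel(5, 20950.0, [0.80, 0.77, 0.74, 0.79, 0.72],
                   [0.14, 0.40, 0.33, 0.10, 0.03]),
]
best = select_model(candidates)
```

In this made-up example the five-group model is rejected despite having the lowest BIC, because its smallest group falls under the 5% threshold, which shows why the three criteria are applied together.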
7
Cohort profile: The South London Stroke Register - a population-based register measuring the incidence and outcomes of stroke. J Stroke Cerebrovasc Dis 2023; 32:107210. [PMID: 37384980 DOI: 10.1016/j.jstrokecerebrovasdis.2023.107210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Accepted: 06/04/2023] [Indexed: 07/01/2023]
Abstract
PURPOSE The South London Stroke Register (SLSR) is a population-based cohort study, which was established in 1995 to study the causes, incidence, and outcomes of stroke. The SLSR aims to estimate incidence, and acute and long-term needs in a multi-ethnic inner-city population, with follow-up durations for some participants exceeding 20 years. PARTICIPANTS The SLSR aims to recruit residents of a defined area within Lambeth and Southwark who experience a first stroke. More than 7700 people have been registered since inception, and >2750 people continue to be followed up. At the 2011 census, the source population was 357,308. FINDINGS TO DATE The SLSR was instrumental in highlighting the inequalities in risk and outcomes in the UK, and demonstrating the dramatic improvements in care quality and outcomes in recent decades. Data from the SLSR informed the UK National Audit Office in its 2005 report criticising the poor state of stroke care in England. For people living in the SLSR area the likelihood of being treated in a stroke unit increased from 19% in 1995-7 to 75% in 2007-9. The SLSR has investigated health inequalities in stroke incidence and outcome. SLSR analyses have demonstrated that lower socioeconomic status was associated with poorer outcome, and that Black people and younger people have not experienced the same improvements in stroke incidence as other groups. FUTURE PLANS As part of an NIHR Programme Grant for Applied Research, from April 2022 the SLSR has expanded to recruit ICD-11 defined stroke (including those with <24 h symptoms where there are neuroimaging findings), and has expanded the follow-up interviews to collect more detailed information on quality of life, cognition, and care needs. Additional data items will be added over the Programme based on feedback from patients and other stakeholders.
8
Automatically Summarizing Evidence from Clinical Trials: A Prototype Highlighting Current Challenges. Proceedings of the Conference. Association for Computational Linguistics. Meeting 2023; 2023:236-247. [PMID: 37483390 PMCID: PMC10361334] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/25/2023]
Abstract
We present TrialsSummarizer, a system that aims to automatically summarize evidence presented in the set of randomized controlled trials most relevant to a given query. Building on prior work (Marshall et al., 2020), the system retrieves trial publications matching a query specifying a combination of condition, intervention(s), and outcome(s), and ranks these according to sample size and estimated study quality. The top-k such studies are passed through a neural multi-document summarization system, yielding a synopsis of these trials. We consider two architectures: a standard sequence-to-sequence model based on BART (Lewis et al., 2019), and a multi-headed architecture intended to provide greater transparency to end-users. Both models produce fluent and relevant summaries of evidence retrieved for queries, but their tendency to introduce unsupported statements renders them inappropriate for use in this domain at present. The proposed architecture may help users verify outputs by allowing them to trace generated tokens back to inputs. The demonstration video is available at: https://vimeo.com/735605060 The prototype, source code, and model weights are available at: https://sanjanaramprasad.github.io/trials-summarizer/.
9
Stroke in Sierra Leone: Case fatality rate and functional outcome after stroke in Freetown. Int J Stroke 2023:17474930231164892. [PMID: 36905336 DOI: 10.1177/17474930231164892] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2023]
Abstract
BACKGROUND There is limited information on long-term outcomes after stroke in sub-Saharan Africa (SSA). Current estimates of case fatality rate (CFR) in SSA are based on small sample sizes with varying study design and report heterogeneous results. AIMS We report CFR and functional outcomes from a large, prospective, longitudinal cohort of stroke patients in Sierra Leone and describe factors associated with mortality and functional outcome. METHODS A prospective longitudinal stroke register was established at both adult tertiary government hospitals in Freetown, Sierra Leone. It recruited all patients ⩾ 18 years with stroke, using the World Health Organization definition, from May 2019 until October 2021. To reduce selection bias onto the register, all investigations were paid for by the funder and outreach was conducted to raise awareness of the study. Sociodemographic data, National Institutes of Health Stroke Scale (NIHSS), and Barthel Index (BI) were collected on all patients on admission, at 7 days, 90 days, 1 year, and 2 years post stroke. Cox proportional hazards models were constructed to identify factors associated with all-cause mortality. A binomial logistic regression model reports odds ratio (OR) for functional independence at 1 year. RESULTS A total of 986 patients with stroke were included, of which 857 (87%) received neuroimaging. Follow-up rate was 82% at 1 year, and missing item data were <1% for most variables. Stroke cases were equally split by sex and mean age was 58.9 (SD: 14.0) years. Of these, 625 (63%) were ischemic, 206 (21%) primary intracerebral hemorrhage, 25 (3%) subarachnoid hemorrhage, and 130 (13%) were of undetermined stroke type. Median NIHSS was 16 (9-24). CFR at 30 days, 90 days, 1 year, and 2 years was 37%, 44%, 49%, and 53%, respectively. 
Factors associated with increased fatality at any timepoint were male sex (hazard ratio (HR): 1.28 (1.05-1.56)), previous stroke (HR: 1.34 (1.04-1.71)), atrial fibrillation (HR: 1.58(1.06-2.34)), subarachnoid hemorrhage (HR: 2.31 (1.40-3.81)), undetermined stroke type (HR: 3.18 (2.44-4.14)), and in-hospital complications (HR: 1.65 (1.36-1.98)). About 93% of patients were completely independent prior to their stroke, declining to 19% at 1 year after stroke. Functional improvement was most likely to occur between 7 and 90 days post stroke with 35% patients improving, and 13% improving between 90 days to 1 year. Increasing age (OR: 0.97 (0.95-0.99)), previous stroke (OR: 0.50 (0.26-0.98)), NIHSS (OR: 0.89 (0.86-0.91)), undetermined stroke type (OR: 0.18 (0.05-0.62)), and ⩾1 in-hospital complication (OR: 0.52 (0.34-0.80)) were associated with lower OR of functional independence at 1 year. Hypertension (OR: 1.98 (1.14-3.44)) and being the primary breadwinner of the household (OR: 1.59 (1.01-2.49)) were associated with functional independence at 1 year. CONCLUSION Stroke affected younger people and resulted in high rates of fatality and functional impairment relative to global averages. Key clinical priorities for reducing fatality include preventing stroke-related complications through evidence-based stroke care, improved detection and management of atrial fibrillation, and increasing coverage of secondary prevention. Further research into care pathways and interventions to encourage care seeking for less severe strokes should be prioritized, including reducing the cost barrier for stroke investigations and care.
10
Prevalence and natural history of depression after stroke: A systematic review and meta-analysis of observational studies. PLoS Med 2023; 20:e1004200. [PMID: 36976794 PMCID: PMC10047522 DOI: 10.1371/journal.pmed.1004200] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 09/06/2022] [Accepted: 02/13/2023] [Indexed: 03/29/2023]
Abstract
BACKGROUND Depression is the most frequent psychiatric condition after stroke and is associated with negative health outcomes. We aim to undertake a systematic review and meta-analysis of the prevalence and natural history of depression after stroke. METHODS AND FINDINGS Studies published up to 4 November 2022 on Medline, Embase, PsycINFO, and Web of Science Core Collection were searched. We included studies of adults with stroke, where depression was assessed at a prespecified time point. Studies excluding people with aphasia and history of depression were excluded. The Critical Appraisal Skills Programme (CASP) cohort study tool was used to assess risk of bias. A total of 77 studies were included in the pooled estimates of the prevalence of poststroke depression (PSD). The overall prevalence of depression was 27% (95% CI 25 to 30). Prevalence of depression was 24% (95% CI 21 to 28) by clinical interview and 29% (95% CI 25 to 32) by rating scales. Twenty-four studies with more than one assessment time point reported the natural history of PSD. Among people who were depressed within 3 months of stroke, 53% (95% CI 47 to 59) experienced persistent depression, while 44% (95% CI 38 to 50) recovered. The incidence of later depression (3 to 12 months after stroke) was 9% (95% CI 7 to 12). The cumulative incidence during 1 year after stroke was 38% (95% CI 33 to 43), and the majority (71% (95% CI 65 to 76)) of depression had onset within 3 months after stroke. The main limitation of the present study is that excluding people in source studies with severe impairments may produce imprecise estimates of the prevalence of PSD. CONCLUSIONS In this study, we observed that stroke survivors with early-onset depression (within 3 months after stroke) are at high risk of remaining depressed and make up two-thirds of the incident cases during 1 year after stroke. This highlights the need for ongoing clinical monitoring of patients depressed shortly after stroke. 
TRIAL REGISTRATION PROSPERO CRD42022314146.
11
In a pilot study, automated real-time systematic review updates were feasible, accurate, and work-saving. J Clin Epidemiol 2023; 153:26-33. [PMID: 36150548 DOI: 10.1016/j.jclinepi.2022.08.013] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/22/2022] [Revised: 08/22/2022] [Accepted: 08/29/2022] [Indexed: 01/31/2023]
Abstract
OBJECTIVES The aim of this study is to describe and pilot a novel method for continuously identifying newly published trials relevant to a systematic review, enabled by combining artificial intelligence (AI) with human expertise. STUDY DESIGN AND SETTING We used RobotReviewer LIVE to keep a review of COVID-19 vaccination trials updated from February to August 2021. We compared the papers identified by the system with those found by the conventional manual process by the review team. RESULTS The manual update searches (last search date July 2021) retrieved 135 abstracts, of which 31 were included after screening (23% precision, 100% recall). By the same date, the automated system retrieved 56 abstracts, of which 31 were included after manual screening (55% precision, 100% recall). Key limitations of the system include that it is limited to searches of PubMed/MEDLINE, and considers only randomized controlled trial reports. We aim to address these limitations in future. The system is available as open-source software for further piloting and evaluation. CONCLUSION Our system identified all relevant studies, reduced manual screening work, and enabled rolling updates on publication of new primary research.
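The precision and recall figures reported in this abstract follow directly from the counts it gives, as the short sketch below shows. The function names are illustrative, and treating the 31 included abstracts as the full set of relevant studies is an assumption implied by the reported 100% recall.

```python
def precision(included: int, retrieved: int) -> float:
    """Share of retrieved abstracts that were included after screening."""
    return included / retrieved


def recall(included: int, relevant: int) -> float:
    """Share of all relevant studies that the search found."""
    return included / relevant


# Counts from the abstract: 31 included studies in both arms.
manual_precision = precision(31, 135)     # manual update searches
automated_precision = precision(31, 56)   # automated system
abstracts_not_screened = 135 - 56         # screening load avoided

print(f"manual: {manual_precision:.0%}, automated: {automated_precision:.0%}")
```

Both arms found the same 31 studies, so the work saving comes entirely from the smaller number of abstracts that had to be screened.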
12
The provision of person-centred care for care home residents with stroke: An ethnographic study. Health & Social Care in the Community 2022; 30:e5186-e5195. [PMID: 35869786 PMCID: PMC10084099 DOI: 10.1111/hsc.13936] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/26/2021] [Revised: 05/31/2022] [Accepted: 07/08/2022] [Indexed: 06/15/2023]
Abstract
Care home residents with stroke have higher levels of disability and poorer access to health services than those living in their own homes. We undertook observations and semi-structured interviews (n = 28 participants) with managers, staff, residents who had experienced a stroke and their relatives in four homes in London, England, in 2018/2019. Thematic analysis revealed that residents' needs regarding valued activity and stroke-specific care and rehabilitation were not always being met. This resulted from an interplay of factors: staff's lack of recognition of stroke and its effects; gaps in skills; time pressures; and the prioritisation of residents' safety. To improve residential care provision and residents' quality of life, care commissioners, regulators and providers may need to re-examine how care homes balance safety and limits on staff time against residents' valued activity, alongside improving access to specialist healthcare treatment and support.
13
Accuracy and Efficiency of Machine Learning-Assisted Risk-of-Bias Assessments in "Real-World" Systematic Reviews: A Noninferiority Randomized Controlled Trial. Ann Intern Med 2022; 175:1001-1009. [PMID: 35635850 DOI: 10.7326/m22-0092] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/22/2022]
Abstract
BACKGROUND Automation is a proposed solution for the increasing difficulty of maintaining up-to-date, high-quality health evidence. Evidence assessing the effectiveness of semiautomated data synthesis, such as risk-of-bias (RoB) assessments, is lacking. OBJECTIVE To determine whether RobotReviewer-assisted RoB assessments are noninferior in accuracy and efficiency to assessments conducted with human effort only. DESIGN Two-group, parallel, noninferiority, randomized trial. (Monash Research Office Project 11256). SETTING Health-focused systematic reviews using Covidence. PARTICIPANTS Systematic reviewers, who had not previously used RobotReviewer, completing Cochrane RoB assessments between February 2018 and May 2020. INTERVENTION In the intervention group, reviewers received an RoB form prepopulated by RobotReviewer; in the comparison group, reviewers received a blank form. Studies were assigned in a 1:1 ratio via simple randomization to receive RobotReviewer assistance for either Reviewer 1 or Reviewer 2. Participants were blinded to study allocation before starting work on each RoB form. MEASUREMENTS Co-primary outcomes were the accuracy of individual reviewer RoB assessments and the person-time required to complete individual assessments. Domain-level RoB accuracy was a secondary outcome. RESULTS Of the 15 recruited review teams, 7 completed the trial (145 included studies). Integration of RobotReviewer resulted in noninferior overall RoB assessment accuracy (risk difference, -0.014 [95% CI, -0.093 to 0.065]; intervention group: 88.8% accurate assessments; control group: 90.2% accurate assessments). Data were inconclusive for the person-time outcome (RobotReviewer saved 1.40 minutes [CI, -5.20 to 2.41 minutes]). LIMITATION Variability in user behavior and a limited number of assessable reviews led to an imprecise estimate of the time outcome. 
CONCLUSION In health-related systematic reviews, RoB assessments conducted with RobotReviewer assistance are noninferior in accuracy to those conducted without RobotReviewer assistance. PRIMARY FUNDING SOURCE University College London and Monash University.
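The noninferiority reasoning in this abstract can be sketched as a comparison between the lower confidence bound of the risk difference and a prespecified margin. The abstract does not state the trial's margin, so `ASSUMED_MARGIN` below is a hypothetical value chosen purely for illustration, and the function names are likewise assumptions.

```python
def risk_difference(p_intervention: float, p_control: float) -> float:
    """Difference in the proportion of accurate assessments."""
    return p_intervention - p_control


def is_noninferior(ci_lower: float, margin: float) -> bool:
    """Noninferior when the lower CI bound stays above minus the margin."""
    if margin <= 0:
        raise ValueError("margin must be positive")
    return ci_lower > -margin


# Accuracy figures and CI bound from the abstract.
rd = risk_difference(0.888, 0.902)
ci_lower = -0.093

# Hypothetical margin; the abstract does not report the one actually used.
ASSUMED_MARGIN = 0.10
conclusion = is_noninferior(ci_lower, ASSUMED_MARGIN)
```

The point estimate alone does not settle the question: the conclusion depends on where the lower confidence bound sits relative to the margin, so a tighter margin could reverse it.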
14
What Would it Take to get Biomedical QA Systems into Practice? Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing 2021; 2021:28-41. [PMID: 35663506 PMCID: PMC9162079 DOI: 10.18653/v1/2021.mrqa-1.3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Medical question answering (QA) systems have the potential to answer clinicians' uncertainties about treatment and diagnosis on-demand, informed by the latest evidence. However, despite the significant progress in general QA made by the NLP community, medical QA systems are still not widely used in clinical environments. One likely reason for this is that clinicians may not readily trust QA system outputs, in part because transparency, trustworthiness, and provenance have not been key considerations in the design of such models. In this paper we discuss a set of criteria that, if met, we argue would likely increase the utility of biomedical QA systems, which may in turn lead to adoption of such systems in practice. We assess existing models, tasks, and datasets with respect to these criteria, highlighting shortcomings of previously proposed approaches and pointing toward what might be more usable QA systems.
|
15
|
A Prospective Stroke Register in Sierra Leone: Demographics, Stroke Type, Stroke Care and Hospital Outcomes. Front Neurol 2021; 12:712060. [PMID: 34557147 PMCID: PMC8453059 DOI: 10.3389/fneur.2021.712060] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/19/2021] [Accepted: 07/09/2021] [Indexed: 11/17/2022] Open
Abstract
Introduction: Stroke is the second most common cause of adult death in Africa. This study reports the demographics, stroke types, stroke care and hospital outcomes for stroke in Freetown, Sierra Leone. Methods: A prospective observational register recorded all patients 18 years and over with stroke between May 2019 and April 2020. Stroke was defined according to the WHO criteria. Pearson's chi-squared test was used to examine associations between categorical variables and unpaired t-tests for continuous variables. Multivariable logistic regression, to explain in-hospital death, was reported as odds ratios (ORs) and 95% confidence intervals. Results: Three hundred eighty-five strokes were registered, and 315 (81.8%) were first-in-a-lifetime events. Mean age was 59.2 (SD 13.8), and 187 (48.6%) were male. Of the strokes, 327 (84.9%) were confirmed by CT scan. Two hundred thirty-one (60.0%) were ischaemic, 85 (22.1%) intracerebral haemorrhage, 11 (2.9%) subarachnoid haemorrhage and 58 (15.1%) undetermined stroke type. The median National Institutes of Health Stroke Scale on presentation was 17 [interquartile range (IQR) 9-25]. Haemorrhagic strokes compared with ischaemic strokes were more severe, 20 (IQR 12-26) vs. 13 (IQR 7-22) (p < 0.001), and occurred in a younger population, mean age 52.3 (SD 12.0) vs. 61.6 (SD 13.8) (p < 0.001), with a lower level of educational attainment of 28.2 vs. 40.7% (p = 0.04). The median time from stroke onset to arrival at the principal referral hospital was 25 hours (IQR 6-73). Half of the patients (50.4%) sought care at another health provider prior to arrival. One hundred fifty-one patients died in the hospital (39.5%). Forty-three deaths occurred within 48 hours of arriving at the hospital, with median time to death of 4 days (IQR 0-7 days). Of the patients, 49.6% had ≥1 complication, 98 (25.5%) pneumonia and 33 (8.6%) urinary tract infection. 
Male gender (OR 3.33, 1.65-6.75), pneumonia (OR 3.75, 1.82-7.76), subarachnoid haemorrhage (OR 43.1, 6.70-277.4) and undetermined stroke types (OR 6.35, 2.17-18.60) were associated with higher risk of in-hospital death. Discussion: We observed severe strokes occurring in a young population with high in-hospital mortality. Further work to deliver evidence-based stroke care is essential to reduce stroke mortality in Sierra Leone.
|
16
|
Paragraph-level Simplification of Medical Texts. PROCEEDINGS OF THE CONFERENCE. ASSOCIATION FOR COMPUTATIONAL LINGUISTICS. NORTH AMERICAN CHAPTER. MEETING 2021; 2021:4972-4984. [PMID: 35663507 PMCID: PMC9161242 DOI: 10.18653/v1/2021.naacl-main.395] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/15/2023]
Abstract
We consider the problem of learning to simplify medical texts. This is important because most reliable, up-to-date information in biomedicine is dense with jargon and thus practically inaccessible to the lay audience. Furthermore, manual simplification does not scale to the rapidly growing body of biomedical literature, motivating the need for automated approaches. Unfortunately, there are no large-scale resources available for this task. In this work we introduce a new corpus of parallel texts in English comprising technical and lay summaries of all published evidence pertaining to different clinical topics. We then propose a new metric based on likelihood scores from a masked language model pretrained on scientific texts. We show that this automated measure better differentiates between technical and lay summaries than existing heuristics. We introduce and evaluate baseline encoder-decoder Transformer models for simplification and propose a novel augmentation to these in which we explicitly penalize the decoder for producing 'jargon' terms; we find that this yields improvements over baselines in terms of readability.
|
17
|
Understanding Clinical Trial Reports: Extracting Medical Entities and Their Relations. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE PROCEEDINGS. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE 2021; 2021:485-494. [PMID: 34457164 PMCID: PMC8378650] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
The best evidence concerning comparative treatment effectiveness comes from clinical trials, the results of which are reported in unstructured articles. Medical experts must manually extract information from articles to inform decision-making, which is time-consuming and expensive. Here we consider the end-to-end task of both (a) extracting treatments and outcomes from full-text articles describing clinical trials (entity identification) and, (b) inferring the reported results for the former with respect to the latter (relation extraction). We introduce new data for this task, and evaluate models that have recently achieved state-of-the-art results on similar tasks in Natural Language Processing. We then propose a new method motivated by how trial results are typically presented that outperforms these purely data-driven baselines. Finally, we run a fielded evaluation of the model with a non-profit seeking to identify existing drugs that might be re-purposed for cancer, showing the potential utility of end-to-end evidence extraction systems.
|
18
|
Generating (Factual?) Narrative Summaries of RCTs: Experiments with Neural Multi-Document Summarization. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE PROCEEDINGS. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE 2021; 2021:605-614. [PMID: 34457176 PMCID: PMC8378607] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
We consider the problem of automatically generating a narrative biomedical evidence summary from multiple trial reports. We evaluate modern neural models for abstractive summarization of relevant article abstracts from systematic reviews previously conducted by members of the Cochrane Collaboration, using the authors' conclusions section of the review abstract as our target. We enlist medical professionals to evaluate generated summaries, and we find that summarization systems yield consistently fluent and relevant synopses, but these often contain factual inaccuracies. We propose new approaches that capitalize on domain-specific models to inform summarization, e.g., by explicitly demarcating snippets of inputs that convey key findings, and emphasizing the reports of large and high-quality trials. We find that these strategies modestly improve the factual accuracy of generated summaries. Finally, we propose a new method for automatically evaluating the factuality of generated narrative evidence syntheses using models that infer the directionality of reported findings.
|
19
|
Machine learning reduced workload with minimal risk of missing studies: development and evaluation of a randomized controlled trial classifier for Cochrane Reviews. J Clin Epidemiol 2021; 133:140-151. [PMID: 33171275 PMCID: PMC8168828 DOI: 10.1016/j.jclinepi.2020.11.003] [Citation(s) in RCA: 64] [Impact Index Per Article: 21.3] [Received: 08/13/2020] [Revised: 10/13/2020] [Accepted: 11/03/2020] [Indexed: 12/29/2022]
Abstract
OBJECTIVES This study developed, calibrated, and evaluated a machine learning classifier designed to reduce study identification workload in Cochrane for producing systematic reviews. METHODS A machine learning classifier for retrieving randomized controlled trials (RCTs) was developed (the "Cochrane RCT Classifier"), with the algorithm trained using a data set of title-abstract records from Embase, manually labeled by the Cochrane Crowd. The classifier was then calibrated using a further data set of similar records manually labeled by the Clinical Hedges team, aiming for 99% recall. Finally, the recall of the calibrated classifier was evaluated using records of RCTs included in Cochrane Reviews that had abstracts of sufficient length to allow machine classification. RESULTS The Cochrane RCT Classifier was trained using 280,620 records (20,454 of which reported RCTs). A classification threshold was set using 49,025 calibration records (1,587 of which reported RCTs), and our bootstrap validation found the classifier had recall of 0.99 (95% confidence interval 0.98-0.99) and precision of 0.08 (95% confidence interval 0.06-0.12) in this data set. The final, calibrated RCT classifier correctly retrieved 43,783 (99.5%) of 44,007 RCTs included in Cochrane Reviews but missed 224 (0.5%). Older records were more likely to be missed than those more recently published. CONCLUSIONS The Cochrane RCT Classifier can reduce manual study identification workload for Cochrane Reviews, with a very low and acceptable risk of missing eligible RCTs. This classifier now forms part of the Evidence Pipeline, an integrated workflow deployed within Cochrane to help improve the efficiency of the study identification processes that support systematic review production.
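The recall figure for the final evaluation can be recomputed from the counts given in this abstract. The following is a minimal sketch; the function and variable names are illustrative and not taken from the study.

```python
def recall(retrieved_relevant: int, total_relevant: int) -> float:
    """Share of relevant records (here, RCTs) that the classifier retrieved."""
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    return retrieved_relevant / total_relevant


# Counts reported in the abstract for RCTs included in Cochrane Reviews.
total_rcts = 44_007
retrieved_rcts = 43_783

missed_rcts = total_rcts - retrieved_rcts
recall_pct = round(100 * recall(retrieved_rcts, total_rcts), 1)
missed_pct = round(100 * missed_rcts / total_rcts, 1)

print(missed_rcts)   # 224
print(recall_pct)    # 99.5
print(missed_pct)    # 0.5
```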
|
20
|
Trialstreamer: A living, automatically updated database of clinical trial reports. J Am Med Inform Assoc 2020; 27:1903-1912. [PMID: 32940710 PMCID: PMC7727361 DOI: 10.1093/jamia/ocaa163] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 05/12/2020] [Accepted: 07/07/2020] [Indexed: 11/14/2022] Open
Abstract
OBJECTIVE Randomized controlled trials (RCTs) are the gold standard method for evaluating whether a treatment works in health care but can be difficult to find and make use of. We describe the development and evaluation of a system to automatically find and categorize all new RCT reports. MATERIALS AND METHODS Trialstreamer continuously monitors PubMed and the World Health Organization International Clinical Trials Registry Platform, looking for new RCTs in humans using a validated classifier. We combine machine learning and rule-based methods to extract information from the RCT abstracts, including free-text descriptions of trial PICO (populations, interventions/comparators, and outcomes) elements and map these snippets to normalized MeSH (Medical Subject Headings) vocabulary terms. We additionally identify sample sizes, predict the risk of bias, and extract text conveying key findings. We store all extracted data in a database, which we make freely available for download, and via a search portal, which allows users to enter structured clinical queries. Results are ranked automatically to prioritize larger and higher-quality studies. RESULTS As of early June 2020, we have indexed 673 191 publications of RCTs, of which 22 363 were published in the first 5 months of 2020 (142 per day). We additionally include 304 111 trial registrations from the International Clinical Trials Registry Platform. The median trial sample size was 66. CONCLUSIONS We present an automated system for finding and categorizing RCTs. This yields a novel resource: a database of structured information automatically extracted for all published RCTs in humans. We make daily updates of this database available on our website (https://trialstreamer.robotreviewer.net).
|
21
|
Trialstreamer: Mapping and Browsing Medical Evidence in Real-Time. PROCEEDINGS OF THE CONFERENCE. ASSOCIATION FOR COMPUTATIONAL LINGUISTICS. NORTH AMERICAN CHAPTER. MEETING 2020; 2020:63-69. [PMID: 34136886 PMCID: PMC8204713 DOI: 10.18653/v1/2020.acl-demos.9] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 06/12/2023]
Abstract
We introduce Trialstreamer, a living database of clinical trial reports. Here we mainly describe the evidence extraction component; this extracts from biomedical abstracts key pieces of information that clinicians need when appraising the literature, and also the relations between these. Specifically, the system extracts descriptions of trial participants, the treatments compared in each arm (the interventions), and which outcomes were measured. The system then attempts to infer which interventions were reported to work best by determining their relationship with identified trial outcome measures. In addition to summarizing individual trials, these extracted data elements allow automatic synthesis of results across many trials on the same topic. We apply the system at scale to all reports of randomized controlled trials indexed in MEDLINE, powering the automatic generation of evidence maps, which provide a global view of the efficacy of different interventions combining data from all relevant clinical trials on a topic. We make all code and models freely available alongside a demonstration of the web interface.
|
22
|
A systematic review of machine learning models for predicting outcomes of stroke with structured data. PLoS One 2020; 15:e0234722. [PMID: 32530947 PMCID: PMC7292406 DOI: 10.1371/journal.pone.0234722] [Citation(s) in RCA: 65] [Impact Index Per Article: 16.3] [Received: 02/19/2020] [Accepted: 06/01/2020] [Indexed: 12/11/2022] Open
Abstract
Background and purpose Machine learning (ML) has attracted much attention with the hope that it could make use of large, routinely collected datasets and deliver accurate personalised prognosis. The aim of this systematic review is to identify and critically appraise the reporting and developing of ML models for predicting outcomes after stroke. Methods We searched PubMed and Web of Science from 1990 to March 2019, using previously published search filters for stroke, ML, and prediction models. We focused on structured clinical data, excluding image and text analysis. This review was registered with PROSPERO (CRD42019127154). Results Eighteen studies were eligible for inclusion. Most studies reported less than half of the terms in the reporting quality checklist. The most frequently predicted stroke outcomes were mortality (7 studies) and functional outcome (5 studies). The most commonly used ML methods were random forests (9 studies), support vector machines (8 studies), decision trees (6 studies), and neural networks (6 studies). The median sample size was 475 (range 70–3184), with a median of 22 predictors (range 4–152) considered. All studies evaluated discrimination with thirteen using area under the ROC curve whilst calibration was assessed in three. Two studies performed external validation. None described the final model sufficiently well to reproduce it. Conclusions The use of ML for predicting stroke outcomes is increasing. However, few met basic reporting standards for clinical prediction tools and none made their models available in a way which could be used or evaluated. Major improvements in ML study conduct and reporting are needed before it can meaningfully be considered for practice.
|
23
|
Abstract
The evidence base in health psychology is vast and growing rapidly. These factors make it difficult (and sometimes practically impossible) to consider all available evidence when making decisions about the state of knowledge on a given phenomenon (e.g., associations of variables, effects of interventions on particular outcomes). Systematic reviews, meta-analyses, and other rigorous syntheses of the research mitigate this problem by providing concise, actionable summaries of knowledge in a given area of study. Yet, conducting these syntheses has grown increasingly laborious owing to the fast accumulation of new evidence; existing, manual methods for synthesis do not scale well. In this article, we discuss how semi-automation via machine learning and natural language processing methods may help researchers and practitioners to review evidence more efficiently. We outline concrete examples in health psychology, highlighting practical, open-source technologies available now. We indicate the potential of more advanced methods and discuss how to avoid the pitfalls of automated reviews.
|
24
|
Blood eosinophil count, a marker of inhaled corticosteroid effectiveness in preventing COPD exacerbations in post-hoc RCT and observational studies: systematic review and meta-analysis. Respir Res 2020; 21:3. [PMID: 31900184 PMCID: PMC6942335 DOI: 10.1186/s12931-019-1268-7] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 09/28/2019] [Accepted: 12/23/2019] [Indexed: 12/17/2022] Open
Abstract
Background Blood eosinophil count has been proposed as a predictor of response to inhaled corticosteroid (ICS) in the prevention of acute exacerbations of COPD. An optimal threshold of blood eosinophil count for prescribing ICS has not been agreed. Doubt has been cast on the role by observational studies. The role of inhaled corticosteroids in this relationship, independent of long-acting bronchodilators, has not been examined. Methods We conducted a systematic review of post-hoc analyses of randomised controlled trials (RCTs) and observational studies examining three blood eosinophil thresholds and the independent role of ICS. Included studies were categorised by the form (relative or absolute count) and cut point of eosinophil threshold used. Thresholds assessed were relative eosinophil count of 2%, and absolute counts of 150 cells/μL and 300 cells/μL. Three meta-analyses of the effect of ICS use in post-hoc analyses of RCTs based on these counts were carried out. Initial analysis included all studies of ICS vs. any non-ICS regimen. Further analysis examined the effect of ICS, independent of the effect of long-acting bronchodilators. Results Sixteen studies examined the association between blood eosinophil count and response of exacerbation risk to ICS, in COPD patients. Eleven studies (25,881 patients) were post-hoc analyses of RCTs. Five studies (109,704 patients) were retrospective observational studies. The independent effect of ICS on the reduction of exacerbation risk was 20% at ≥2% blood eosinophil threshold (RR, 0.80; 95% CI, 0.74–0.85), 35% at ≥150 cells/μL blood eosinophil threshold (RR, 0.65; 0.52–0.79), and 39% at ≥300 cells/μL blood eosinophil threshold (RR, 0.61; 0.44–0.78). No association was found in four out of five observational studies. Conclusion This is the first systematic review to assess, in post-hoc analyses of RCTs, the independent effect of ICS in reducing the risk of COPD exacerbation across a range of blood eosinophil thresholds. 
Association between ICS prescription and reduced exacerbation risk at these thresholds was confirmed. The lack of association found in the observational studies questions the relevance of these observations to a “real world” COPD population. To clarify the clinical utility of this biomarker, the association should be tested in prospective effectiveness studies.
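The percentage reductions quoted in this abstract follow from the reported relative risks as 1 minus RR. The short sketch below reproduces that arithmetic; the dictionary labels and function name are illustrative only.

```python
def relative_risk_reduction(rr: float) -> float:
    """Relative risk reduction implied by a relative risk (RR), as a fraction."""
    return 1.0 - rr


# Relative risks reported in the abstract, keyed by blood eosinophil threshold.
reported_rr = {
    ">=2%": 0.80,
    ">=150 cells/uL": 0.65,
    ">=300 cells/uL": 0.61,
}

reductions_pct = {
    threshold: round(100 * relative_risk_reduction(rr))
    for threshold, rr in reported_rr.items()
}

for threshold, pct in reductions_pct.items():
    print(f"{threshold}: {pct}% reduction in exacerbation risk")
```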
|
25
|
Collaborative design of a decision aid for stroke survivors with multimorbidity: a qualitative study in the UK engaging key stakeholders. BMJ Open 2019; 9:e030385. [PMID: 31420396 PMCID: PMC6701575 DOI: 10.1136/bmjopen-2019-030385] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 03/11/2019] [Revised: 06/27/2019] [Accepted: 07/01/2019] [Indexed: 01/01/2023] Open
Abstract
OBJECTIVES Effective secondary stroke prevention strategies are suboptimally used. Development of novel interventions to enable healthcare professionals and stroke survivors to manage risk factors for stroke recurrence is required. We sought to engage key stakeholders in the design and evaluation of an intervention informed by a learning health system approach, to improve risk factor management and secondary prevention for stroke survivors with multimorbidity. DESIGN Qualitative, including focus groups, semistructured interviews and usability evaluations. Data were audio recorded, transcribed and coded thematically. PARTICIPANTS Stroke survivors, carers, health and social care professionals, commissioners, policymakers and researchers. SETTING Stroke survivors were recruited from the South London Stroke Register; health and social care professionals through South London general practices and King's College London (KCL) networks; carers, commissioners, policymakers and researchers through KCL networks. RESULTS Fifty-three stakeholders in total participated in focus groups, interviews and usability evaluations. Thirty-seven participated in focus groups and interviews, including stroke survivors and carers (n=11), health and social care professionals (n=16), commissioners and policymakers (n=6) and researchers (n=4). Sixteen participated in usability evaluations, including stroke survivors (n=8) and general practitioners (GPs; n=8). Eight themes informed the collaborative design of DOTT (Deciding On Treatments Together), a decision aid integrated with the electronic health record system, to be used in primary care during clinical consultations between the healthcare professional and stroke survivor. DOTT aims to facilitate shared decision-making on personalised treatments leading to improved treatment adherence and risk control. DOTT was found acceptable and usable among stroke survivors and GPs during a series of evaluations. 
CONCLUSIONS Adopting a user-centred data-driven design approach informed an intervention that is acceptable to users and has the potential to improve patient outcomes. A future feasibility study and subsequent clinical trial will provide evidence of the effectiveness of DOTT in reducing risk of stroke recurrence.
|
26
|
Toward systematic review automation: a practical guide to using machine learning tools in research synthesis. Syst Rev 2019; 8:163. [PMID: 31296265 PMCID: PMC6621996 DOI: 10.1186/s13643-019-1074-9] [Citation(s) in RCA: 188] [Impact Index Per Article: 37.6] [Received: 01/18/2019] [Accepted: 06/24/2019] [Indexed: 11/10/2022] Open
Abstract
Technologies and methods to speed up the production of systematic reviews by reducing the manual labour involved have recently emerged. Automation has been proposed or used to expedite most steps of the systematic review process, including search, screening, and data extraction. However, how these technologies work in practice and when (and when not) to use them is often not clear to practitioners. In this practical guide, we provide an overview of current machine learning methods that have been proposed to expedite evidence synthesis. We also offer guidance on which of these are ready for use, their strengths and weaknesses, and how a systematic review team might go about using them in practice.
|
27
|
Machine learning to help researchers evaluate biases in clinical trials: a prospective, randomized user study. BMC Med Inform Decis Mak 2019; 19:96. [PMID: 31068178 PMCID: PMC6505190 DOI: 10.1186/s12911-019-0814-z] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 11/06/2018] [Accepted: 04/15/2019] [Indexed: 01/08/2023] Open
Abstract
Objective Assessing risks of bias in randomized controlled trials (RCTs) is an important but laborious task when conducting systematic reviews. RobotReviewer (RR), an open-source machine learning (ML) system, semi-automates bias assessments. We conducted a user study of RobotReviewer, evaluating time saved and usability of the tool. Materials and methods Systematic reviewers applied the Cochrane Risk of Bias tool to four randomly selected RCT articles. Reviewers judged: whether an RCT was at low, or high/unclear risk of bias for each bias domain in the Cochrane tool (Version 1); and highlighted article text justifying their decision. For a random two of the four articles, the process was semi-automated: users were provided with ML-suggested bias judgments and text highlights. Participants could amend the suggestions if necessary. We measured time taken for the task, ML suggestions, usability via the System Usability Scale (SUS) and collected qualitative feedback. Results For 41 volunteers, semi-automation was quicker than manual assessment (mean 755 vs. 824 s; relative time 0.75, 95% CI 0.62–0.92). Reviewers accepted 301/328 (91%) of the ML Risk of Bias (RoB) judgments, and 202/328 (62%) of text highlights without change. Overall, ML suggested text highlights had a recall of 0.90 (SD 0.14) and precision of 0.87 (SD 0.21) with respect to the users’ final versions. Reviewers assigned the system a mean 77.7 SUS score, corresponding to a rating between “good” and “excellent”. Conclusions Semi-automation (where humans validate machine learning suggestions) can improve the efficiency of evidence synthesis. Our system was rated highly usable, and expedited bias assessment of RCTs.
|
28
|
Rapid reviews may produce different results to systematic reviews: a meta-epidemiological study. J Clin Epidemiol 2019; 109:30-41. [PMID: 30590190 PMCID: PMC6524137 DOI: 10.5281/zenodo.1447087] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/05/2018] [Revised: 12/07/2018] [Accepted: 12/19/2018] [Indexed: 08/31/2023]
Abstract
OBJECTIVE To simulate possible changes in systematic review results if rapid review methods were used. STUDY DESIGN AND SETTING We recalculated meta-analyses for binary primary outcomes in Cochrane systematic reviews, simulating rapid review methods. We simulated searching only PubMed, excluding older articles (5, 7, 10, 15, and 20 years before the search date), excluding smaller trials (<50, <100, and <200 participants), and using the largest trial only. We examined percentage changes in pooled odds ratios (ORs) (classed as no important change [<5%], small [<20%], moderate [<30%], or large [≥30%]), statistical significance, and biases observed using rapid methods. RESULTS Two thousand five hundred and twelve systematic reviews (16,088 studies) were included. Rapid methods resulted in the loss of all data in 3.7-44.7% of meta-analyses. Searching only PubMed had the smallest risk of changed ORs (19% [477/2,512] were small changes or greater; 10% [260/2,512] were moderate or greater). Changes in ORs varied substantially with each rapid review method; 8.4-21.3% were small, 1.9-8.8% were moderate, and 4.7-34.1% were large. Changes in statistical significance occurred in 6.5-38.6% of meta-analyses. Changes from significant to nonsignificant were most common (2.1-13.7% meta-analyses). We found no evidence of bias with any rapid review method. CONCLUSION Searching PubMed only might be considered where a ∼10% risk of the primary outcome OR changing by >20% could be tolerated. This could be the case in scoping reviews, resource limitation, or where syntheses are needed urgently. Other situations, such as clinical guidelines and regulatory decisions, favor more comprehensive systematic review methods.
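The size categories for changes in pooled odds ratios described in this abstract can be expressed as a simple threshold rule. The sketch below uses the cut points from the abstract; how the percentage change itself was computed is not stated there, so the `percent_change` helper (absolute change relative to the full-review OR) is an assumption, and all names are hypothetical.

```python
def percent_change(or_full: float, or_rapid: float) -> float:
    """ASSUMPTION: absolute change in the pooled OR relative to the full review, in percent."""
    if or_full <= 0:
        raise ValueError("odds ratios must be positive")
    return abs(or_rapid - or_full) / or_full * 100


def classify_change(pct: float) -> str:
    """Map a percentage change in the pooled OR to the categories used in the abstract."""
    if pct < 5:
        return "no important change"
    if pct < 20:
        return "small"
    if pct < 30:
        return "moderate"
    return "large"


# Illustrative values only: a pooled OR of 0.80 becoming 0.70 under a rapid method.
example_pct = percent_change(0.80, 0.70)
print(classify_change(example_pct))  # small (about a 12.5% change)
```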
|
29
|
A comparison of trends in stroke care and outcomes between in-hospital and community-onset stroke - The South London Stroke Register. PLoS One 2019; 14:e0212396. [PMID: 30789929 PMCID: PMC6383917 DOI: 10.1371/journal.pone.0212396] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 06/28/2018] [Accepted: 02/03/2019] [Indexed: 01/19/2023] Open
Abstract
Background Stroke care and outcomes have improved significantly over the past decades. It is unclear if patients who had a stroke in hospital (in-hospital stroke, IHS) experienced similar improvements to those who were admitted with stroke (community-onset stroke, COS). Methods Data from the South London Stroke Register were analysed to estimate trends in processes of care and outcomes across three cohorts (1995–2001, 2002–2008, 2009–2015). Kaplan-Meier survival curves were calculated for each cohort. Associations between patient location at stroke onset, processes of care, and outcomes were investigated using multiple logistic regression and Cox proportional hazards models. Results Of 5,119 patients admitted to hospital and registered between 1995 and 2015, 552 (10.8%) had IHS. Brain imaging rates increased from 92.4% (COS) and 78.3% (IHS) in 1995–2001 to 100% for COS and IHS in 2009–2015. Rates of stroke unit admission rose but remained lower for IHS (1995–2001: 32.2% (COS) vs. 12.4% (IHS), 2002–2008: 77.1% (COS) vs. 50.0% (IHS), 2009–2015: 86.3% (COS) vs. 65.4% (IHS)). After adjusting for patient characteristics and case-mix, IHS was independently associated with lower rates of stroke unit admission in each cohort (1995–2001: OR 0.49, 95% CI 0.29–0.82, 2002–2008: 0.29, 0.18–0.45, 2009–2015: 0.22, 0.11–0.43). In 2009–2015, thrombolysis rates were lower for ischaemic IHS (17.8% (COS) vs. 13.8% (IHS)). Despite a decline, in-hospital mortality remained significantly higher after IHS in 2009–2015 (13.7% (COS) vs. 26.7% (IHS)). Five-year mortality rates declined for COS from 58.9% (1995–2001) to 35.2% (2009–2015) and for IHS from 80.8% (1995–2001) to 51.1% (2009–2015). In multivariable analysis, IHS was associated with higher mortality over five years post-stroke in each cohort (1995–2001: HR 1.27, 95% CI 1.03–1.57, 2002–2008: 1.24, 0.99–1.55, 2009–2015: 1.39, 0.95–2.04). 
Conclusions Despite significant improvements for IHS patients similar to those for COS patients, rates of stroke unit admission and thrombolysis remain lower, and short- and long-term outcomes poorer after IHS. Factors preventing IHS patients from entering evidence-based stroke-specific hospital pathways in a timely fashion need further investigation.
|
30
|
Rapid reviews may produce different results to systematic reviews: a meta-epidemiological study. J Clin Epidemiol 2018; 109:30-41. [PMID: 30590190 PMCID: PMC6524137 DOI: 10.1016/j.jclinepi.2018.12.015] [Citation(s) in RCA: 50] [Impact Index Per Article: 8.3] [Received: 10/05/2018] [Revised: 12/07/2018] [Accepted: 12/19/2018] [Indexed: 11/17/2022]
Abstract
Objective To simulate possible changes in systematic review results if rapid review methods were used. Study Design and Setting We recalculated meta-analyses for binary primary outcomes in Cochrane systematic reviews, simulating rapid review methods. We simulated searching only PubMed, excluding older articles (5, 7, 10, 15, and 20 years before the search date), excluding smaller trials (<50, <100, and <200 participants), and using the largest trial only. We examined percentage changes in pooled odds ratios (ORs) (classed as no important change [<5%], small [<20%], moderate [<30%], or large [≥30%]), statistical significance, and biases observed using rapid methods. Results Two thousand five hundred and twelve systematic reviews (16,088 studies) were included. Rapid methods resulted in the loss of all data in 3.7–44.7% of meta-analyses. Searching only PubMed had the smallest risk of changed ORs (19% [477/2,512] were small changes or greater; 10% [260/2,512] were moderate or greater). Changes in ORs varied substantially with each rapid review method; 8.4–21.3% were small, 1.9–8.8% were moderate, and 4.7–34.1% were large. Changes in statistical significance occurred in 6.5–38.6% of meta-analyses. Changes from significant to nonsignificant were most common (2.1–13.7% meta-analyses). We found no evidence of bias with any rapid review method. Conclusion Searching PubMed only might be considered where a ∼10% risk of the primary outcome OR changing by >20% could be tolerated. This could be the case in scoping reviews, resource limitation, or where syntheses are needed urgently. Other situations, such as clinical guidelines and regulatory decisions, favor more comprehensive systematic review methods.
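The change categories described in this abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names are hypothetical, and treating the change as an absolute relative difference on the OR scale is an assumption, since the abstract does not state how the percentage was derived.

```python
def percent_change(original_or: float, rapid_or: float) -> float:
    """Absolute percentage change in a pooled odds ratio.

    Assumption: change is measured relative to the original (full review) OR.
    """
    if original_or <= 0:
        raise ValueError("odds ratios must be positive")
    return abs(rapid_or - original_or) / original_or * 100.0


def classify_change(pct: float) -> str:
    """Map a percentage change onto the categories named in the abstract."""
    if pct < 5:
        return "no important change"
    if pct < 20:
        return "small"
    if pct < 30:
        return "moderate"
    return "large"


# Hypothetical example: full-review OR of 2.0, rapid-method OR of 2.5.
category = classify_change(percent_change(2.0, 2.5))
```

Under these assumptions, the hypothetical example is a 25% change and falls in the "moderate" band.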
|
31
|
'People like you?': how people with hypertension make sense of future cardiovascular risk-a qualitative study. BMJ Open 2018; 8:e023726. [PMID: 30413514 PMCID: PMC6231586 DOI: 10.1136/bmjopen-2018-023726] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/24/2018] [Revised: 07/24/2018] [Accepted: 09/14/2018] [Indexed: 01/08/2023] Open
Abstract
OBJECTIVES Cardiovascular disease (CVD) prevention guidelines recommend that patients' future CVD risk (as a percentage) is estimated and used to inform shared treatment decisions. We sought to understand the perspectives of patients with hypertension on their future risk of CVD. DESIGN Qualitative, semistructured interviews and thematic analysis. PARTICIPANTS People with hypertension who had not experienced a cardiovascular event recruited from primary care. SETTING Participants were purposively sampled from two primary care practices in South London. Interviews were transcribed, and a thematic analysis was conducted. RESULTS 24 people participated; participants were diverse in age, sex, ethnicity and socioeconomic status. Younger working-aged people were under-represented. Contrasting with probabilistic risk, many participants understood future CVD as binary and unknowable. Roughly half of participants avoided contemplating future CVD risk; for some, lifestyle change and medication obviated the need to think about CVD risk. Some participants identified with one portion of the probability fraction ('I'd be one of those ones.'). Comparison with peers (typically partners, siblings and friends of a similar age, including both 'healthy' and 'unhealthy' people) was most frequently used to describe risk, both among those who engaged with and avoided risk discussion. This contrasts with current risk scores, which describe probabilities in people with similar risk factors; many participants did not identify with such a group, and hence did not find these probabilities meaningful, even where correctly understood. CONCLUSIONS Risk as typically calculated and communicated (eg, the risk of '100 people like you') may not be meaningful for patients who do not identify with the denominator. Comparing an individual's risk with their peers could be more meaningful.
|
32
|
A Corpus with Multi-Level Annotations of Patients, Interventions and Outcomes to Support Language Processing for Medical Literature. PROCEEDINGS OF THE CONFERENCE. ASSOCIATION FOR COMPUTATIONAL LINGUISTICS. MEETING 2018; 2018:197-207. [PMID: 30305770 PMCID: PMC6174533] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
We present a corpus of 5,000 richly annotated abstracts of medical articles describing clinical randomized controlled trials. Annotations include demarcations of text spans that describe the Patient population enrolled, the Interventions studied and to what they were Compared, and the Outcomes measured (the 'PICO' elements). These spans are further annotated at a more granular level, e.g., individual interventions within them are marked and mapped onto a structured medical vocabulary. We acquired annotations from a diverse set of workers with varying levels of expertise and cost. We describe our data collection process and the corpus itself in detail. We then outline a set of challenging NLP tasks that would aid searching of the medical literature and the practice of evidence-based medicine.
|
33
|
Identifying reports of randomized controlled trials (RCTs) via a hybrid machine learning and crowdsourcing approach. J Am Med Inform Assoc 2018; 24:1165-1168. [PMID: 28541493 PMCID: PMC5975623 DOI: 10.1093/jamia/ocx053] [Citation(s) in RCA: 89] [Impact Index Per Article: 14.8] [Received: 01/10/2017] [Accepted: 05/18/2017] [Indexed: 11/14/2022] Open
Abstract
Objectives Identifying all published reports of randomized controlled trials (RCTs) is an important aim, but it requires extensive manual effort to separate RCTs from non-RCTs, even using current machine learning (ML) approaches. We aimed to make this process more efficient via a hybrid approach using both crowdsourcing and ML. Methods We trained a classifier to discriminate between citations that describe RCTs and those that do not. We then adopted a simple strategy of automatically excluding citations deemed very unlikely to be RCTs by the classifier and deferring to crowdworkers otherwise. Results Combining ML and crowdsourcing provides a highly sensitive RCT identification strategy (our estimates suggest 95%-99% recall) with substantially less effort (we observed a reduction of around 60%-80%) than relying on manual screening alone. Conclusions Hybrid crowd-ML strategies warrant further exploration for biomedical curation/annotation tasks.
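The triage strategy described in the Methods (automatically exclude citations the classifier considers very unlikely to be RCTs, and defer the rest to crowdworkers) might look roughly like the sketch below. The function names, the citation identifiers, and the 0.1 exclusion threshold are all hypothetical; the abstract does not report the cutoff that was actually used.

```python
from typing import Iterable, List, Tuple


def triage(
    scored_citations: Iterable[Tuple[str, float]],
    exclude_below: float = 0.1,  # hypothetical cutoff, not taken from the paper
) -> Tuple[List[str], List[str]]:
    """Split citations into auto-excluded and crowd-screened groups.

    Each input item is (citation_id, predicted probability of being an RCT).
    """
    auto_excluded: List[str] = []
    to_crowd: List[str] = []
    for citation_id, p_rct in scored_citations:
        if p_rct < exclude_below:
            auto_excluded.append(citation_id)  # classifier is confident: not an RCT
        else:
            to_crowd.append(citation_id)  # uncertain or likely RCT: human check
    return auto_excluded, to_crowd


def workload_reduction(n_auto_excluded: int, n_total: int) -> float:
    """Fraction of citations that no longer need manual screening."""
    return n_auto_excluded / n_total


# Hypothetical classifier outputs for three citations.
excluded, crowd = triage([("c1", 0.01), ("c2", 0.50), ("c3", 0.09)])
saved = workload_reduction(len(excluded), 3)
```

Lowering the cutoff sends more citations to the crowd, which protects recall at the cost of a smaller workload reduction; raising it does the opposite.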
|
34
|
Machine learning for identifying Randomized Controlled Trials: An evaluation and practitioner's guide. Res Synth Methods 2018; 9:602-614. [PMID: 29314757 PMCID: PMC6030513 DOI: 10.1002/jrsm.1287] [Citation(s) in RCA: 241] [Impact Index Per Article: 40.2] [Received: 02/26/2017] [Revised: 10/31/2017] [Accepted: 12/05/2017] [Indexed: 12/03/2022]
Abstract
Machine learning (ML) algorithms have proven highly accurate for identifying Randomized Controlled Trials (RCTs) but are not used much in practice, in part because the best way to make use of the technology in a typical workflow is unclear. In this work, we evaluate ML models for RCT classification (support vector machines, convolutional neural networks, and ensemble approaches). We trained and optimized support vector machine and convolutional neural network models on the titles and abstracts of the Cochrane Crowd RCT set. We evaluated the models on an external dataset (Clinical Hedges), allowing direct comparison with traditional database search filters. We estimated area under receiver operating characteristics (AUROC) using the Clinical Hedges dataset. We demonstrate that ML approaches better discriminate between RCTs and non‐RCTs than widely used traditional database search filters at all sensitivity levels; our best‐performing model also achieved the best results to date for ML in this task (AUROC 0.987, 95% CI, 0.984‐0.989). We provide practical guidance on the role of ML in (1) systematic reviews (high‐sensitivity strategies) and (2) rapid reviews and clinical question answering (high‐precision strategies) together with recommended probability cutoffs for each use case. Finally, we provide open‐source software to enable these approaches to be used in practice.
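To make the two evaluation ideas in this abstract concrete, the sketch below computes AUROC from its pairwise definition (the probability that a randomly chosen RCT is scored above a randomly chosen non-RCT) and shows how moving a probability cutoff trades sensitivity against precision. It is a toy illustration in plain Python, not the authors' released software; the labels, scores, and cutoffs are invented for the example.

```python
from typing import Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via pairwise comparison of positives and negatives (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_precision(
    labels: Sequence[int], scores: Sequence[float], cutoff: float
) -> Tuple[float, float]:
    """Sensitivity and precision when scores >= cutoff are called RCTs.

    Precision is reported as NaN if nothing is predicted positive.
    """
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cutoff)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= cutoff)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < cutoff)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, precision


# Invented toy data: 1 = RCT, 0 = not an RCT.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
area = auroc(toy_labels, toy_scores)
low_cutoff = sensitivity_precision(toy_labels, toy_scores, 0.3)   # sensitivity-oriented
high_cutoff = sensitivity_precision(toy_labels, toy_scores, 0.6)  # precision-oriented
```

On this toy data the low cutoff catches every RCT but admits a false positive, while the high cutoff returns only true RCTs but misses one, which mirrors the systematic-review versus rapid-review use cases the abstract distinguishes.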
|
35
|
A Neural Candidate-Selector Architecture for Automatic Structured Clinical Text Annotation. PROCEEDINGS OF THE ... ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT. ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT 2017; 2017:1519-1528. [PMID: 29308293 PMCID: PMC5752318 DOI: 10.1145/3132847.3132989] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 10/18/2022]
Abstract
We consider the task of automatically annotating free texts describing clinical trials with concepts from a controlled, structured medical vocabulary. Specifically we aim to build a model to infer distinct sets of (ontological) concepts describing complementary clinically salient aspects of the underlying trials: the populations enrolled, the interventions administered and the outcomes measured, i.e., the PICO elements. This important practical problem poses a few key challenges. One issue is that the output space is vast, because the vocabulary comprises many unique concepts. Compounding this problem, annotated data in this domain is expensive to collect and hence sparse. Furthermore, the outputs (sets of concepts for each PICO element) are correlated: specific populations (e.g., diabetics) will render certain intervention concepts likely (insulin therapy) while effectively precluding others (radiation therapy). Such correlations should be exploited. We propose a novel neural model that addresses these challenges. We introduce a Candidate-Selector architecture in which the model considers sets of candidate concepts for PICO elements, and assesses their plausibility conditioned on the input text to be annotated. This relies on a 'candidate set' generator, which may be learned or rely on heuristics. A conditional discriminative neural model then jointly selects candidate concepts, given the input text. We compare the predictive performance of our approach to strong baselines, and show that it outperforms them. Finally, we perform a qualitative evaluation of the generated annotations by asking domain experts to assess their quality.
|
36
|
Automating Biomedical Evidence Synthesis: RobotReviewer. PROCEEDINGS OF THE CONFERENCE. ASSOCIATION FOR COMPUTATIONAL LINGUISTICS. MEETING 2017; 2017:7-12. [PMID: 29093610 PMCID: PMC5662138 DOI: 10.18653/v1/p17-4002] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Indexed: 11/25/2022]
Abstract
We present RobotReviewer, an open-source web-based system that uses machine learning and NLP to semi-automate biomedical evidence synthesis, to aid the practice of Evidence-Based Medicine. RobotReviewer processes full-text journal articles (PDFs) describing randomized controlled trials (RCTs). It appraises the reliability of RCTs and extracts text describing key trial characteristics (e.g., descriptions of the population) using novel NLP methods. RobotReviewer then automatically generates a report synthesising this information. Our goal is for RobotReviewer to automatically extract and synthesise the full-range of structured data needed to inform evidence-based practice.
|
37
|
Trends in the prevalence and management of pre-stroke atrial fibrillation, the South London Stroke Register, 1995-2014. PLoS One 2017; 12:e0175980. [PMID: 28410424 PMCID: PMC5391932 DOI: 10.1371/journal.pone.0175980] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 01/22/2017] [Accepted: 04/03/2017] [Indexed: 11/18/2022] Open
Abstract
Background Previous studies have found low use of anticoagulation prior to stroke in people with atrial fibrillation (AF). This study examined data on patients with AF-related stroke from a population-based stroke register, and sought to examine changes in management of AF prior to stroke, and reasons for suboptimal treatment, in those who were known to be at a high risk of stroke. Methods The South London Stroke Register (SLSR) is an ongoing population-based register recording first-in-a-lifetime stroke. Trends in the prevalence of AF, and antithrombotic medication prescribed before the stroke, were investigated from 1995 to 2014. Multivariable logistic regression analyses were conducted to assess the factors associated with appropriate management. Results Of the 5041 patients on the register, 816 (16.2%) were diagnosed with AF before their stroke. AF-related stroke increased substantially among Black Caribbean and Black African patients, comprising 5% of the overall cohort in 1995–1998, increasing to 25% by 2011–2014 (p<0.001). Anticoagulant prescription in AF patients at high risk of stroke (CHADS2 score ≥2) increased from 9% (1995–1998) to 30% (2011–2014) (p<0.001). Antiplatelet drugs were more commonly prescribed throughout all time periods (43% to 64% of high-risk patients). Elderly patients (>65) were significantly less likely to be prescribed an anticoagulant, with ethnicity, gender and deprivation showing no association with anticoagulation. Conclusions Most AF-related strokes occurred in people who could have been predicted to be at high risk before their stroke, yet were not prescribed optimal preventative treatment. The elderly, despite being at highest stroke risk, were rarely prescribed anticoagulants.
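For readers unfamiliar with the risk stratification used in this abstract, the standard CHADS2 score assigns one point each for congestive heart failure, hypertension, age 75 or over, and diabetes, and two points for a prior stroke or TIA. The sketch below encodes that scoring and the high-risk threshold of 2 or more quoted in the abstract; the function and parameter names are hypothetical, and this is not the register's own analysis code.

```python
def chads2(
    chf: bool,
    hypertension: bool,
    age: int,
    diabetes: bool,
    prior_stroke_or_tia: bool,
) -> int:
    """Standard CHADS2 stroke-risk score for a patient with atrial fibrillation."""
    score = int(chf) + int(hypertension) + int(age >= 75) + int(diabetes)
    if prior_stroke_or_tia:
        score += 2  # prior stroke or TIA carries double weight
    return score


def is_high_risk(score: int, threshold: int = 2) -> bool:
    """High risk as defined in the abstract: CHADS2 score of 2 or more."""
    return score >= threshold


# Hypothetical patient: 80 years old with hypertension and no other risk factors.
example_score = chads2(
    chf=False, hypertension=True, age=80, diabetes=False, prior_stroke_or_tia=False
)
example_high_risk = is_high_risk(example_score)
```

In this hypothetical case, age and hypertension alone are enough to reach the high-risk threshold, which illustrates why most older patients with AF qualify.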
|
38
|
The effects of socioeconomic status on stroke risk and outcomes. Lancet Neurol 2016; 14:1206-18. [PMID: 26581971 DOI: 10.1016/s1474-4422(15)00200-8] [Citation(s) in RCA: 210] [Impact Index Per Article: 26.3] [Received: 04/14/2015] [Revised: 07/23/2015] [Accepted: 07/29/2015] [Indexed: 11/19/2022]
Abstract
The latest evidence on socioeconomic status and stroke shows that stroke not only disproportionately affects low-income and middle-income countries, but also socioeconomically deprived populations within high-income countries. These disparities are reflected not only in risk of stroke but also in short-term and long-term outcomes after stroke. Increased average levels of conventional risk factors (eg, hypertension, hyperlipidaemia, excessive alcohol intake, smoking, obesity, and sedentary lifestyle) in populations with low socioeconomic status account for about half of these effects. In many countries, evidence shows that people with lower socioeconomic status are less likely to receive good-quality acute hospital and rehabilitation care than people with higher socioeconomic status. For clinical practice, better implementation of well established treatments, effective management of risk factors, and equity of access to high-quality acute stroke care and rehabilitation will probably reduce inequality substantially. Overcoming barriers and adapting evidence-based interventions to different countries and health-care settings remains a research priority.
|
39
|
Extracting PICO Sentences from Clinical Trial Reports using Supervised Distant Supervision. JOURNAL OF MACHINE LEARNING RESEARCH : JMLR 2016; 17:132. [PMID: 27746703 PMCID: PMC5065023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]
Abstract
Systematic reviews underpin Evidence Based Medicine (EBM) by addressing precise clinical questions via comprehensive synthesis of all relevant published evidence. Authors of systematic reviews typically define a Population/Problem, Intervention, Comparator, and Outcome (the PICO criteria) of interest, and then retrieve, appraise and synthesize results from all reports of clinical trials that meet these criteria. Identifying PICO elements in the full-texts of trial reports is thus a critical yet time-consuming step in the systematic review process. We seek to expedite evidence synthesis by developing machine learning models to automatically extract sentences from articles relevant to PICO elements. Collecting a large corpus of training data for this task would be prohibitively expensive. Therefore, we derive distant supervision (DS) with which to train models using previously conducted reviews. DS entails heuristically deriving 'soft' labels from an available structured resource. However, we have access only to unstructured, free-text summaries of PICO elements for corresponding articles; we must derive from these the desired sentence-level annotations. To this end, we propose a novel method, supervised distant supervision (SDS), that uses a small amount of direct supervision to better exploit a large corpus of distantly labeled instances by learning to pseudo-annotate articles using the available DS. We show that this approach tends to outperform existing methods with respect to automated PICO extraction.
|
40
|
RobotReviewer: evaluation of a system for automatically assessing bias in clinical trials. J Am Med Inform Assoc 2016; 23:193-201. [PMID: 26104742 PMCID: PMC4713900 DOI: 10.1093/jamia/ocv044] [Citation(s) in RCA: 117] [Impact Index Per Article: 14.6] [Received: 11/03/2014] [Revised: 04/16/2015] [Accepted: 04/18/2015] [Indexed: 11/13/2022] Open
Abstract
OBJECTIVE To develop and evaluate RobotReviewer, a machine learning (ML) system that automatically assesses bias in clinical trials. From a (PDF-formatted) trial report, the system should determine risks of bias for the domains defined by the Cochrane Risk of Bias (RoB) tool, and extract supporting text for these judgments. METHODS We algorithmically annotated 12,808 trial PDFs using data from the Cochrane Database of Systematic Reviews (CDSR). Trials were labeled as being at low or high/unclear risk of bias for each domain, and sentences were labeled as being informative or not. This dataset was used to train a multi-task ML model. We estimated the accuracy of ML judgments versus humans by comparing trials with two or more independent RoB assessments in the CDSR. Twenty blinded experienced reviewers rated the relevance of supporting text, comparing ML output with equivalent (human-extracted) text from the CDSR. RESULTS By retrieving the top 3 candidate sentences per document (top3 recall), the best ML text was rated more relevant than text from the CDSR, but not significantly (60.4% ML text rated 'highly relevant' v 56.5% of text from reviews; difference +3.9%, [-3.2% to +10.9%]). Model RoB judgments were less accurate than those from published reviews, though the difference was <10% (overall accuracy 71.0% with ML v 78.3% with CDSR). CONCLUSION Risk of bias assessment may be automated with reasonable accuracy. Automatically identified text supporting bias assessment is of equal quality to the manually identified text in the CDSR. This technology could substantially reduce reviewer workload and expedite evidence syntheses.
|
41
|
Abstract
Systematic reviews, which summarize the entirety of the evidence pertaining to a specific clinical question, have become critical for evidence-based decision making in healthcare. But such reviews have become increasingly onerous to produce due to the exponentially expanding biomedical literature base. This study proposes a step toward mitigating this problem by automating risk of bias assessment in systematic reviews, in which reviewers determine whether study results may be affected by biases (e.g., poor randomization or blinding). Conducting risk of bias assessment is an important but onerous task. We thus describe a machine learning approach to automate this assessment, using the standard Cochrane Risk of Bias Tool which assesses seven common types of bias. Training such a system would typically require a large labeled corpus, which would be prohibitively expensive to collect here. Instead, we use distant supervision, using data from the Cochrane Database of Systematic Reviews (a large repository of systematic reviews), to pseudoannotate a corpus of 2200 clinical trial reports in PDF format. We then develop a joint model which, using the full text of a clinical trial report as input, predicts the risks of bias while simultaneously extracting the text fragments supporting these assessments. This study represents a step toward automating or semiautomating extraction of data necessary for the synthesis of clinical trials.
|
42
|
Trends in risk factor prevalence and management before first stroke: data from the South London Stroke Register 1995-2011. Stroke 2013; 44:1809-16. [PMID: 23660847 DOI: 10.1161/strokeaha.111.000655] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 11/16/2022]
Abstract
BACKGROUND AND PURPOSE Vascular risk factors are suboptimally managed internationally. This study investigated time trends in risk factors diagnosed before stroke and their treatment, and factors associated with appropriate medication use. METHODS A total of 4416 patients with a first stroke were registered in the population-based South London Stroke Register from 1995 to 2011. Previously diagnosed risk factors and usual medications were collected from patients' primary care and hospital records. Trends and associations were assessed using multivariate logistic regression. RESULTS Seventy-two percent of patients were diagnosed previously with 1 or more risk factors; 30% had diagnosed risk factors that were untreated. Hypercholesterolemia increased significantly during the study period; myocardial infarction and transient ischemic attack prevalences decreased. Antiplatelet prescription increased in atrial fibrillation (AF), myocardial infarction, and transient ischemic attack (AF, 37%-51%, P<0.001; myocardial infarction, 48%-69%, P<0.001; transient ischemic attack, 49%-61%, P=0.015). Anticoagulant prescription for AF showed a nonsignificant increase (12%-23%; P=0.059). Fewer older patients with AF were prescribed anticoagulants (age, >85 versus <65 years; adjusted relative risk, 0.19; 95% confidence interval, 0.08-0.41). Black ethnicity (adjusted relative risk, 1.17; 95% confidence interval, 1.10-1.23) and female sex (adjusted relative risk, 1.09; 95% confidence interval, 1.03-1.15) were associated with increased antihypertensive drug prescription; other medications did not vary by ethnicity or sex. CONCLUSIONS Antiplatelet and cholesterol-lowering treatment prescribing have improved significantly over time; however, only a minority with AF received anticoagulants, and this did not improve significantly. Overall, 30% of strokes occurred in patients with previously diagnosed but untreated risk factors.
|
43
|
Abstract
OBJECTIVE To synthesise the findings from individual qualitative studies on patients' understanding and experiences of hypertension and drug taking; to investigate whether views differ internationally by culture or ethnic group and whether the research could inform interventions to improve adherence. DESIGN Systematic review and narrative synthesis of qualitative research using the 2006 UK Economic and Social Research Council research methods programme guidance. DATA SOURCES Medline, Embase, the British Nursing Index, Social Policy and Practice, and PsycInfo from inception to October 2011. STUDY SELECTION Qualitative interviews or focus groups among people with uncomplicated hypertension (studies principally in people with diabetes, established cardiovascular disease, or pregnancy related hypertension were excluded). RESULTS 59 papers reporting on 53 qualitative studies were included in the synthesis. These studies came from 16 countries (United States, United Kingdom, Brazil, Sweden, Canada, New Zealand, Denmark, Finland, Ghana, Iran, Israel, Netherlands, South Korea, Spain, Tanzania, and Thailand). A large proportion of participants thought hypertension was principally caused by stress and produced symptoms, particularly headache, dizziness, and sweating. Participants widely intentionally reduced or stopped treatment without consulting their doctor. Participants commonly perceived that their blood pressure improved when symptoms abated or when they were not stressed, and that treatment was not needed at these times. Participants disliked treatment and its side effects and feared addiction. These findings were consistent across countries and ethnic groups. Participants also reported various external factors that prevented adherence, including being unable to find time to take the drugs or to see the doctor; having insufficient money to pay for treatment; the cost of appointments and healthy food; a lack of health insurance; and forgetfulness. 
CONCLUSIONS Non-adherence to hypertension treatment often resulted from patients' understanding of the causes and effects of hypertension; particularly relying on the presence of stress or symptoms to determine if blood pressure was raised. These beliefs were remarkably similar across ethnic and geographical groups; calls for culturally specific education for individual ethnic groups may therefore not be justified. To improve adherence, clinicians and educational interventions must better understand and engage with patients' ideas about causality, experiences of symptoms, and concerns about drug side effects.
|