1
Socias-Morales C, Konda S, Bell JL, Wurzelbacher SJ, Naber SJ, Scott Earnest G, Garza EP, Meyers AR, Scharf T. Construction industry workers' compensation injury claims due to slips, trips, and falls - Ohio, 2010-2017. JOURNAL OF SAFETY RESEARCH 2023; 86:80-91. [PMID: 37718072 PMCID: PMC10772999 DOI: 10.1016/j.jsr.2023.06.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Revised: 04/20/2023] [Accepted: 06/29/2023] [Indexed: 09/19/2023]
Abstract
PROBLEM Compared to other industries, construction workers have higher risks for serious fall injuries. This study describes the burden and circumstances surrounding injuries related to compensable slip, trip, and fall (STF) claims from private construction industries covered by the Ohio Bureau of Workers' Compensation. METHODS STF injury claims in the Ohio construction industry from 2010-2017 were manually reviewed. Claims were classified as: slips or trips without a fall (STWOF), falls on the same level (FSL), falls to a lower level (FLL), and other. Claim narratives were categorized by work-related risk and contributing factors. Demographic, employer, and injury characteristics were examined by fall type and claim type (medical-only (MO, 0-7 days away from work, DAFW) or lost-time (LT, ≥8 DAFW)). Claim rates per 10,000 estimated full-time equivalent employees (FTEs) were calculated. RESULTS 9,517 Ohio construction industry STF claims occurred during the 8-year period, with an average annual rate of 75 claims per 10,000 FTEs. The rate of STFs decreased by 37% from 2010 to 2017. About half of the claims were FLL (51%), 29% were FSL, 17% were STWOF, and 3% were "other." Nearly 40% of all STF claims were LT; mostly among males (96%). The top three contributing factors for STWOF and FSL were: slip/trip hazards, floor irregularities, and ice/snow; and ladders, vehicles, and stairs/steps for FLL. FLL injury rates per 10,000 FTE were highest in these industries: Foundation, Structure, and Building Exterior Contractors (52); Building Finishing Contractors (45); and Residential Building Construction (45). The highest rate of FLL LT claims occurred in the smallest firms, and the FLL rate decreased as construction firm size increased. Discussion and Practical Applications: STF rates declined over time, yet remain common, requiring prevention activities. 
Safety professionals should focus on contributing factors when developing prevention strategies, especially high-risk subsectors and small firms.
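The claim-type split and the rate calculation described in this abstract can be illustrated with a minimal sketch. The function names and the example counts below are hypothetical; only the 0-7 / ≥8 days-away-from-work split and the per-10,000-FTE denominator come from the abstract.

```python
def claim_type(days_away_from_work: int) -> str:
    """Classify a claim as medical-only (0-7 DAFW) or lost-time (>=8 DAFW)."""
    if days_away_from_work < 0:
        raise ValueError("days away from work cannot be negative")
    return "LT" if days_away_from_work >= 8 else "MO"


def claim_rate_per_10k(claims: int, ftes: float) -> float:
    """Claims per 10,000 full-time equivalent employees."""
    if ftes <= 0:
        raise ValueError("FTE estimate must be positive")
    return claims / ftes * 10_000


def percent_change(start_rate: float, end_rate: float) -> float:
    """Relative change between two rates, in percent (negative = decline)."""
    return (end_rate - start_rate) / start_rate * 100


# Hypothetical single-year figures, not taken from the study.
example_rate = claim_rate_per_10k(claims=1_500, ftes=160_000)
```

A rate falling from 100 to 63 claims per 10,000 FTEs would give `percent_change(100, 63)` of about -37, the size of decline the abstract reports for 2010 to 2017.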
Affiliation(s)
- Steven J Wurzelbacher
- NIOSH, Division of Field Studies and Engineering, Center for Workers' Compensation Studies, United States
- G Scott Earnest
- NIOSH, Office of Construction Safety and Health, United States
- Alysha R Meyers
- NIOSH, Division of Field Studies and Engineering, Center for Workers' Compensation Studies, United States
- Ted Scharf
- NIOSH, Division of Science Integration, United States
2
Macedo JB, Ramos PMS, Maior CBS, Moura MJC, Lins ID, Vilela RFT. Identifying low-quality patterns in accident reports from textual data. INTERNATIONAL JOURNAL OF OCCUPATIONAL SAFETY AND ERGONOMICS 2022:1-13. [PMID: 35980110 DOI: 10.1080/10803548.2022.2111847] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2022]
Abstract
Accident investigation reports provide useful knowledge to support companies to propose preventive and mitigative measures. However, the information presented in accident report databases is normally large, complex, filled with errors and has missing and/or redundant data. In this article, we propose text mining and natural language processing techniques to investigate low-quality accident reports. We adopted machine learning (ML) to detect and investigate inconsistencies on accident reports. The methodology was applied to 626 documents collected from an actual hydroelectric power company. The initial ML performances indicated data divergences and concerns related to the report structure. Then, the accident database was restructured to a more proper form confirming the supposition about the quality of the reports investigated. The proposed approach can be used as a diagnostic tool to improve the design of accident investigation reports to provide a more useful source of knowledge to support decisions in the safety context.
Affiliation(s)
- July B Macedo
- CEERMA - Center for Risk Analysis, Reliability Engineering and Environmental Modeling, Federal University of Pernambuco, Brazil; Department of Production Engineering, Federal University of Pernambuco, Brazil
- Plinio M S Ramos
- CEERMA - Center for Risk Analysis, Reliability Engineering and Environmental Modeling, Federal University of Pernambuco, Brazil; Department of Production Engineering, Federal University of Pernambuco, Brazil
- Caio B S Maior
- CEERMA - Center for Risk Analysis, Reliability Engineering and Environmental Modeling, Federal University of Pernambuco, Brazil; Technology Center, Universidade Federal de Pernambuco, Brazil
- Márcio J C Moura
- CEERMA - Center for Risk Analysis, Reliability Engineering and Environmental Modeling, Federal University of Pernambuco, Brazil; Department of Production Engineering, Federal University of Pernambuco, Brazil
- Isis D Lins
- CEERMA - Center for Risk Analysis, Reliability Engineering and Environmental Modeling, Federal University of Pernambuco, Brazil; Department of Production Engineering, Federal University of Pernambuco, Brazil
3
Lowe BD, Hayden M, Albers J, Naber S. Case Studies of Robots and Automation as Health/Safety Interventions in Small Manufacturing Enterprises. HUMAN FACTORS AND ERGONOMICS IN MANUFACTURING 2022; 33:69-103. [PMID: 37206917 PMCID: PMC10191138 DOI: 10.1002/hfm.20971] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2022] [Accepted: 08/01/2022] [Indexed: 05/21/2023]
Abstract
This paper reviews the experiences of 63 case studies of small businesses (< 250 employees) with manufacturing automation equipment acquired through a health/safety intervention grant program. The review scope included equipment technologies classified as industrial robots (n = 17), computer numerical control (CNC) machining (n = 29), or other programmable automation systems (n = 17). Descriptions of workers' compensation (WC) claim injuries and identified risk factors that motivated acquisition of the equipment were extracted from grant applications. Other aspects of the employer experiences, including qualitative and quantitative assessment of effects on risk factors for musculoskeletal disorders (MSD), effects on productivity, and employee acceptance of the intervention were summarized from the case study reports. Case studies associated with a combination of large reduction in risk factors, lower cost per affected employee, and reported increases in productivity were: CNC stone cutting system, CNC/vertical machining system, automated system for bottling, CNC/routing system for plastics products manufacturing, and a CNC/Cutting system for vinyl/carpet. Six case studies of industrial robots reported quantitative reductions in MSD risk factors in these diverse manufacturing industries: Snack Foods; Photographic Film, Paper, Plate, and Chemical; Machine Shops; Leather Good and Allied Products; Plastic Products; and Iron and Steel Forging. This review of health/safety intervention case studies indicates that advanced (programmable) manufacturing automation, including industrial robots, reduced workplace musculoskeletal risk factors and improved process productivity in most cases.
Affiliation(s)
- Brian D Lowe
- formerly, National Institute for Occupational Safety and Health, 1090 Tusculum Ave., Cincinnati, OH 45226
- Marie Hayden
- National Institute for Occupational Safety and Health, Cincinnati, OH 45226
- James Albers
- formerly, National Institute for Occupational Safety and Health, Cincinnati, OH 45226
- Steven Naber
- Ohio Bureau of Workers Compensation, 30 West Spring Street, 25th floor, Columbus, OH 43215
4
Association Mining of Near Misses in Hydropower Engineering Construction Based on Convolutional Neural Network Text Classification. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:4851615. [PMID: 35024045 PMCID: PMC8747904 DOI: 10.1155/2022/4851615] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/22/2021] [Revised: 12/05/2021] [Accepted: 12/08/2021] [Indexed: 11/18/2022]
Abstract
Accidents of various types in the construction of hydropower engineering projects occur frequently, which leads to significant numbers of casualties and economic losses. Identifying and eliminating near misses are a significant means of preventing accidents. Mining near-miss data can provide valuable information on how to mitigate and control hazards. However, most of the data generated in the construction of hydropower engineering projects are semi-structured text data without unified standard expression, so data association analysis is time-consuming and labor-intensive. Thus, an artificial intelligence (AI) automatic classification method based on a convolutional neural network (CNN) is adopted to obtain structured data on near-miss locations and near-miss types from safety records. The apriori algorithm is used to further mine the associations between “locations” and “types” by scanning structured data. The association results are visualized using a network diagram. A Sankey diagram is used to reveal the information flow of near-miss specific objects using the “location ⟶ type” strong association rule. The proposed method combines text classification, association rules, and the Sankey diagrams and provides a novel approach for mining semi-structured text. Moreover, the method is proven to be useful and efficient for exploring near-miss distribution laws in hydropower engineering construction to reduce the possibility of accidents and efficiently improve the safety level of hydropower engineering construction sites.
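The "location ⟶ type" rule mining described above can be sketched as follows. This is a simplified, single-antecedent version of association-rule scoring (support and confidence for each location/type pair), not the full Apriori algorithm, and not the authors' implementation. The thresholds and record values are hypothetical.

```python
from collections import Counter


def location_type_rules(records, min_support=0.2, min_confidence=0.6):
    """Score "location -> type" rules from (location, near_miss_type) pairs.

    support    = share of all records that contain the pair
    confidence = share of records at that location that have that type
    Returns rules passing both thresholds, strongest confidence first.
    """
    n = len(records)
    pair_counts = Counter(records)
    location_counts = Counter(location for location, _ in records)
    rules = []
    for (location, near_miss_type), count in pair_counts.items():
        support = count / n
        confidence = count / location_counts[location]
        if support >= min_support and confidence >= min_confidence:
            rules.append((location, near_miss_type, support, confidence))
    return sorted(rules, key=lambda rule: (-rule[3], -rule[2]))


# Hypothetical structured output of the text classifier.
records = (
    [("tunnel", "falling object")] * 3
    + [("tunnel", "electric shock")]
    + [("dam crest", "fall from height")] * 4
    + [("dam crest", "falling object")] * 2
)
strong_rules = location_type_rules(records)
```

With these ten records, only "tunnel -> falling object" and "dam crest -> fall from height" pass both thresholds; the other two pairs fail on support or on confidence.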
5
Chan VCH, Ross GB, Clouthier AL, Fischer SL, Graham RB. The role of machine learning in the primary prevention of work-related musculoskeletal disorders: A scoping review. APPLIED ERGONOMICS 2022; 98:103574. [PMID: 34547578 DOI: 10.1016/j.apergo.2021.103574] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/20/2021] [Revised: 08/22/2021] [Accepted: 08/24/2021] [Indexed: 06/13/2023]
Abstract
To determine the applications of machine learning (ML) techniques used for the primary prevention of work-related musculoskeletal disorders (WMSDs), a scoping review was conducted using seven literature databases. Of the 4,639 initial results, 130 primary research studies were deemed relevant for inclusion. Studies were reviewed and classified as a contribution to one of six steps within the primary WMSD prevention research framework by van der Beek et al. (2017). ML techniques provided the greatest contributions to the development of interventions (48 studies), followed by risk factor identification (33 studies), underlying mechanisms (29 studies), incidence of WMSDs (14 studies), evaluation of interventions (6 studies), and implementation of effective interventions (0 studies). Nearly a quarter (23.8%) of all included studies were published in 2020. These findings provide insight into the breadth of ML techniques used for primary WMSD prevention and can help identify areas for future research and development.
Affiliation(s)
- Victor C H Chan
- School of Human Kinetics, Faculty of Health Sciences, University of Ottawa, 200 Lees Avenue, Ottawa, Ontario, K1N 6N5, Canada
- Gwyneth B Ross
- School of Human Kinetics, Faculty of Health Sciences, University of Ottawa, 200 Lees Avenue, Ottawa, Ontario, K1N 6N5, Canada
- Allison L Clouthier
- School of Human Kinetics, Faculty of Health Sciences, University of Ottawa, 200 Lees Avenue, Ottawa, Ontario, K1N 6N5, Canada
- Steven L Fischer
- Department of Kinesiology, University of Waterloo, Waterloo, ON, Canada
- Ryan B Graham
- School of Human Kinetics, Faculty of Health Sciences, University of Ottawa, 200 Lees Avenue, Ottawa, Ontario, K1N 6N5, Canada; Department of Kinesiology, University of Waterloo, Waterloo, ON, Canada.
6
Wurzelbacher SJ, Meyers AR, Lampl MP, Timothy Bushnell P, Bertke SJ, Robins DC, Tseng CY, Naber SJ. Workers' compensation claim counts and rates by injury event/exposure among state-insured private employers in Ohio, 2007-2017. JOURNAL OF SAFETY RESEARCH 2021; 79:148-167. [PMID: 34847999 PMCID: PMC9026720 DOI: 10.1016/j.jsr.2021.08.015] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/12/2020] [Revised: 03/23/2021] [Accepted: 08/30/2021] [Indexed: 06/13/2023]
Abstract
INTRODUCTION This study analyzed workers' compensation (WC) claims among private employers insured by the Ohio state-based WC carrier to identify high-risk industries by detailed cause of injury. METHODS A machine learning algorithm was used to code each claim by U.S. Bureau of Labor Statistics (BLS) event/exposure. The codes assigned to lost-time (LT) claims with lower algorithm probabilities of accurate classification or those LT claims with high costs were manually reviewed. WC data were linked with the state's unemployment insurance (UI) data to identify the employer's industry and number of employees. BLS data on hours worked per employee were used to estimate full-time equivalents (FTE) and calculate rates of WC claims per 100 FTE. RESULTS 140,780 LT claims and 633,373 medical-only claims were analyzed. Although counts and rates of LT WC claims declined from 2007 to 2017, the shares of leading LT injury event/exposures remained largely unchanged. LT claims due to Overexertion and Bodily Reaction (33.0%) were most common, followed by Falls, Slips, and Trips (31.4%), Contact with Objects and Equipment (22.5%), Transportation Incidents (7.0%), Exposure to Harmful Substances or Environments (2.8%), Violence and Other Injuries by Persons or Animals (2.5%), and Fires and Explosions (0.4%). These findings are consistent with other reported data. The proportions of injury event/exposures varied by industry, and high-risk industries were identified. CONCLUSIONS Injuries have been reduced, but prevention challenges remain in certain industries. Available evidence on intervention effectiveness was summarized and mapped to the analysis results to demonstrate how the results can guide prevention efforts. 
Practical Applications: Employers, safety/health practitioners, researchers, WC insurers, and bureaus can use these data and machine learning methods to understand industry differences in the level and mix of risks, as well as industry trends, and to tailor safety, health, and disability prevention services and research.
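Two steps in the methods above lend themselves to a short sketch: routing auto-coded lost-time claims to manual review, and converting employment and hours into a rate per 100 FTE. The abstract gives no cutoffs, so the probability threshold, the cost threshold, and the 2,000-hours-per-FTE convention below are assumptions, and the `Claim` fields are hypothetical.

```python
from dataclasses import dataclass

HOURS_PER_FTE = 2_000  # assumed convention: 40 hours/week x 50 weeks


@dataclass
class Claim:
    claim_id: str
    lost_time: bool
    predicted_event: str   # event/exposure code assigned by the classifier
    probability: float     # classifier's probability for that code
    cost: float


def needs_manual_review(claim: Claim,
                        min_probability: float = 0.8,
                        high_cost: float = 100_000.0) -> bool:
    """Flag lost-time claims with a weak prediction or a high cost."""
    return claim.lost_time and (
        claim.probability < min_probability or claim.cost >= high_cost
    )


def estimate_fte(employees: float, annual_hours_per_employee: float) -> float:
    """Estimate full-time equivalents from headcount and hours worked."""
    return employees * annual_hours_per_employee / HOURS_PER_FTE


def rate_per_100_fte(claims: int, fte: float) -> float:
    """Claims per 100 full-time equivalents."""
    if fte <= 0:
        raise ValueError("FTE estimate must be positive")
    return claims / fte * 100
```

Under these assumed thresholds, a medical-only claim is never flagged, while a lost-time claim is flagged when either condition holds.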
Affiliation(s)
- Steven J Wurzelbacher
- National Institute for Occupational Safety and Health, 1090 Tusculum Ave, Cincinnati, OH 45226-1998, United States.
- Alysha R Meyers
- National Institute for Occupational Safety and Health, 1090 Tusculum Ave, Cincinnati, OH 45226-1998, United States.
- Michael P Lampl
- Ohio Bureau of Workers' Compensation, 30 W Spring St Ste L1, Columbus, OH 43215, United States.
- P Timothy Bushnell
- National Institute for Occupational Safety and Health, 1090 Tusculum Ave, Cincinnati, OH 45226-1998, United States.
- Stephen J Bertke
- National Institute for Occupational Safety and Health, 1090 Tusculum Ave, Cincinnati, OH 45226-1998, United States.
- David C Robins
- Ohio Bureau of Workers' Compensation, 30 W Spring St Ste L1, Columbus, OH 43215, United States.
- Chih-Yu Tseng
- National Institute for Occupational Safety and Health, 1090 Tusculum Ave, Cincinnati, OH 45226-1998, United States.
- Steven J Naber
- Ohio Bureau of Workers' Compensation, 30 W Spring St Ste L1, Columbus, OH 43215, United States.
7
Santiago-Colón A, Rocheleau CM, Bertke S, Christianson A, Collins DT, Trester-Wilson E, Sanderson W, Waters MA, Reefhuis J. Testing and Validating Semi-automated Approaches to the Occupational Exposure Assessment of Polycyclic Aromatic Hydrocarbons. Ann Work Expo Health 2021; 65:682-693. [PMID: 33889928 PMCID: PMC8435754 DOI: 10.1093/annweh/wxab002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/26/2020] [Revised: 11/12/2020] [Accepted: 01/07/2021] [Indexed: 11/14/2022]
Abstract
INTRODUCTION When it is not possible to capture direct measures of occupational exposure or conduct biomonitoring, retrospective exposure assessment methods are often used. Among the common retrospective assessment methods, assigning exposure estimates by multiple expert rater review of detailed job descriptions is typically the most valid, but also the most time-consuming and expensive. Development of screening protocols to prioritize a subset of jobs for expert rater review can reduce the exposure assessment cost and time requirement, but there is often little data with which to evaluate different screening approaches. We used existing job-by-job exposure assessment data (assigned by consensus between multiple expert raters) from a large, population-based study of women to create and test screening algorithms for polycyclic aromatic hydrocarbons (PAHs) that would be suitable for use in other population-based studies. METHODS We evaluated three approaches to creating a screening algorithm: a machine-learning algorithm, a set of a priori decision rules created by experts based on features (such as keywords) found in the job description, and a hybrid algorithm incorporating both sets of criteria. All coded jobs held by mothers of infants participating in National Birth Defects Prevention Study (NBDPS) (n = 35,424) were used in developing or testing the screening algorithms. The job narrative fields considered for all approaches included job title, type of product made by the company, main activities or duties, and chemicals or substances handled. Each screening approach was evaluated against the consensus rating of two or more expert raters. RESULTS The machine-learning algorithm considered over 30,000 keywords and industry/occupation codes (separate and in combination). Overall, the hybrid method had a similar sensitivity (87.1%) as the expert decision rules (85.5%) but was higher than the machine-learning algorithm (67.7%). 
Specificity was best in the machine-learning algorithm (98.1%), compared to the expert decision rules (89.2%) and hybrid approach (89.1%). Using different probability cutoffs in the hybrid approach resulted in improvements in sensitivity (24-30%), without the loss of much specificity (7-18%). CONCLUSION Both expert decision rules and the machine-learning algorithm performed reasonably well in identifying the majority of jobs with potential exposure to PAHs. The hybrid screening approach demonstrated that by reviewing approximately 20% of the total jobs, it could identify 87% of all jobs exposed to PAHs; sensitivity could be further increased, albeit with a decrease in specificity, by adjusting the algorithm. The resulting screening algorithm could be applied to other population-based studies of women. The process of developing the algorithm also provides a useful illustration of the strengths and potential pitfalls of these approaches to developing exposure assessment algorithms.
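The trade-off between sensitivity and specificity at different probability cutoffs, as reported above, can be illustrated with a minimal sketch. The scores and expert labels are hypothetical, and the function assumes a job is flagged for expert review when its score is at or above the cutoff.

```python
def sensitivity_specificity(scores, exposed, cutoff):
    """Compare screening flags (score >= cutoff) against expert consensus.

    scores  : screening probabilities, one per job
    exposed : expert consensus labels (True = exposed)
    Returns (sensitivity, specificity).
    """
    tp = fn = tn = fp = 0
    for score, is_exposed in zip(scores, exposed):
        flagged = score >= cutoff
        if is_exposed and flagged:
            tp += 1
        elif is_exposed:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one exposed and one unexposed job")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical screening scores and expert ratings for six jobs.
scores = [0.9, 0.7, 0.4, 0.35, 0.6, 0.1]
exposed = [True, True, True, False, False, False]

strict = sensitivity_specificity(scores, exposed, cutoff=0.5)
lenient = sensitivity_specificity(scores, exposed, cutoff=0.3)
```

In this toy data, lowering the cutoff from 0.5 to 0.3 raises sensitivity from 2/3 to 1 while specificity drops from 2/3 to 1/3, the same direction of trade-off the abstract describes.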
Affiliation(s)
- Albeliz Santiago-Colón
- Centers for Disease Control and Prevention, National Institute for Occupational Safety and Health, Cincinnati, OH, USA
- Carissa M Rocheleau
- Centers for Disease Control and Prevention, National Institute for Occupational Safety and Health, Cincinnati, OH, USA
- Stephen Bertke
- Centers for Disease Control and Prevention, National Institute for Occupational Safety and Health, Cincinnati, OH, USA
- Annette Christianson
- Centers for Disease Control and Prevention, National Institute for Occupational Safety and Health, Cincinnati, OH, USA; Department of Environmental and Public Health Sciences, University of Cincinnati, Cincinnati, OH, USA
- Devon T Collins
- Department of Epidemiology, University of Kentucky, College of Public Health, Lexington, KY, USA; Inova Fairfax Medical Campus, Falls Church, VA, USA
- Emma Trester-Wilson
- Department of Epidemiology, University of Kentucky, College of Public Health, Lexington, KY, USA
- Wayne Sanderson
- Department of Epidemiology, University of Kentucky, College of Public Health, Lexington, KY, USA
- Martha A Waters
- Centers for Disease Control and Prevention, National Institute for Occupational Safety and Health, Cincinnati, OH, USA
- Jennita Reefhuis
- Centers for Disease Control and Prevention, National Center on Birth Defects and Developmental Disabilities, Atlanta, GA, USA
8
Abstract
The construction sector is widely recognized as having the most hazardous working environment among the various business sectors, and many research studies have focused on injury prevention strategies for use on construction sites. The risk-based theory emphasizes the analysis of accident causes extracted from accident reports to understand, predict, and prevent the occurrence of construction accidents. The first step in the analysis is to classify the incidents from a massive number of reports into different cause categories, a task which is usually performed on a manual basis by domain experts. The research described in this paper proposes a convolutional bidirectional long short-term memory (C-BiLSTM)-based method to automatically classify construction accident reports. The proposed approach was applied on a dataset of construction accident narratives obtained from the Occupational Safety and Health Administration website, and the results indicate that this model performs better than some of the classic machine learning models commonly used in classification tasks, including support vector machine (SVM), naïve Bayes (NB), and logistic regression (LR). The results of this study can help safety managers to develop risk management strategies.
9
Applying Machine Learning to Workers' Compensation Data to Identify Industry-Specific Ergonomic and Safety Prevention Priorities: Ohio, 2001 to 2011. J Occup Environ Med 2019; 60:55-73. [PMID: 28953071 DOI: 10.1097/jom.0000000000001162] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 11/27/2022]
Abstract
OBJECTIVE This study leveraged a state workers' compensation claims database and machine learning techniques to target prevention efforts by injury causation and industry. METHODS Injury causation auto-coding methods were developed to code more than 1.2 million Ohio Bureau of Workers' Compensation claims for this study. Industry groups were ranked for soft-tissue musculoskeletal claims that may have been preventable with biomechanical ergonomic (ERGO) or slip/trip/fall (STF) interventions. RESULTS On the basis of the average of claim count and rate ranks for more than 200 industry groups, Skilled Nursing Facilities (ERGO) and General Freight Trucking (STF) were the highest risk for lost-time claims (>7 days). CONCLUSION This study created a third, major causation-specific U.S. occupational injury surveillance system. These findings are being used to focus prevention resources on specific occupational injury types in specific industry groups, especially in Ohio. Other state bureaus or insurers may use similar methods.
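The ranking step in the results, where industry groups are ordered by the average of their claim-count rank and claim-rate rank, can be sketched as below. The group names and figures are hypothetical, and this simple version does not handle ties specially (tied values keep their input order).

```python
def rank_descending(values: dict) -> dict:
    """Rank keys so the largest value gets rank 1."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}


def priority_ranking(claim_counts: dict, claim_rates: dict) -> list:
    """Order industry groups by the mean of their count rank and rate rank."""
    count_rank = rank_descending(claim_counts)
    rate_rank = rank_descending(claim_rates)
    average_rank = {
        name: (count_rank[name] + rate_rank[name]) / 2 for name in claim_counts
    }
    return sorted(average_rank.items(), key=lambda item: item[1])


# Hypothetical lost-time claim counts and rates per 100 FTE.
claim_counts = {"Group A": 500, "Group B": 300, "Group C": 100, "Group D": 50}
claim_rates = {"Group A": 1.5, "Group B": 3.5, "Group C": 1.0, "Group D": 3.0}

ranking = priority_ranking(claim_counts, claim_rates)
```

Averaging the two ranks keeps a small but high-rate group (Group D here) from being hidden by its low claim count, and keeps a large but low-rate group from dominating on volume alone.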
10
Scott E, Hirabayashi L, Krupa N, Jenkins P. Emergency Medical Services Pre-Hospital Care Reports as a Data Source for Logging Injury Surveillance. J Agromedicine 2019; 24:133-137. [DOI: 10.1080/1059924x.2019.1572558] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 10/27/2022]
Affiliation(s)
- Erika Scott
- Northeast Center for Occupational Health and Safety, Bassett Healthcare Network, Cooperstown, USA
- Liane Hirabayashi
- Northeast Center for Occupational Health and Safety, Bassett Healthcare Network, Cooperstown, USA
- Nicole Krupa
- Bassett Research Institute, Bassett Healthcare Network, Cooperstown, USA
- Paul Jenkins
- Bassett Research Institute, Bassett Healthcare Network, Cooperstown, USA
11
Song B, Suh Y. Narrative texts-based anomaly detection using accident report documents: The case of chemical process safety. J Loss Prev Process Ind 2019. [DOI: 10.1016/j.jlp.2018.08.010] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/26/2022]
12
Reichard AA, Al-Tarawneh IS, Konda S, Wei C, Wurzelbacher SJ, Meyers AR, Bertke SJ, Bushnell PT, Tseng CY, Lampl MP, Robins DC. Workers' compensation injury claims among workers in the private ambulance services industry-Ohio, 2001-2011. Am J Ind Med 2018; 61:986-996. [PMID: 30417397 DOI: 10.1002/ajim.22917] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Accepted: 10/22/2018] [Indexed: 11/10/2022]
Abstract
BACKGROUND Ambulance service workers frequently transfer and transport patients. These tasks involve occupational injury risks such as heavy lifting, awkward postures, and frequent motor vehicle travel. METHODS We examined Ohio workers' compensation injury claims among state-insured ambulance service workers working for private employers from 2001 to 2011. Injury claim counts and rates are presented by claim types, diagnoses, and injury events; only counts are available by worker characteristics. RESULTS We analyzed a total of 5882 claims. The majority were medical-only (<8 days away from work). The overall injury claim rate for medical-only and lost-time cases was 12.1 per 100 full-time equivalents. Sprains and strains accounted for 60% of all injury claims. Overexertion from patient handling was the leading injury event, followed by motor vehicle roadway incidents. CONCLUSIONS Study results can guide the development or improvement of injury prevention strategies. Focused efforts related to patient handling and vehicle incidents are needed.
Affiliation(s)
- Audrey A. Reichard
- Division of Safety Research; National Institute for Occupational Safety and Health; Morgantown West Virginia
- Srinivas Konda
- Division of Safety Research; National Institute for Occupational Safety and Health; Morgantown West Virginia
- Chia Wei
- Division of Surveillance, Hazard Evaluations, and Field Studies; National Institute for Occupational Safety and Health; Cincinnati Ohio
- Steven J. Wurzelbacher
- Division of Surveillance, Hazard Evaluations, and Field Studies; National Institute for Occupational Safety and Health; Cincinnati Ohio
- Alysha R. Meyers
- Division of Surveillance, Hazard Evaluations, and Field Studies; National Institute for Occupational Safety and Health; Cincinnati Ohio
- Stephen J. Bertke
- Division of Surveillance, Hazard Evaluations, and Field Studies; National Institute for Occupational Safety and Health; Cincinnati Ohio
- P. Timothy Bushnell
- Economic Research Support Office, Office of the Director; National Institute for Occupational Safety and Health; Cincinnati Ohio
- Chih-Yu Tseng
- Division of Surveillance, Hazard Evaluations, and Field Studies; National Institute for Occupational Safety and Health; Cincinnati Ohio
- Michael P. Lampl
- Division of Safety and Hygiene; Ohio Bureau of Workers’ Compensation; Columbus Ohio
- David C. Robins
- Division of Safety and Hygiene; Ohio Bureau of Workers’ Compensation; Columbus Ohio
13
Moore LL, Wurzelbacher SJ, Shockey TM. Workers' compensation insurer risk control systems: Opportunities for public health collaborations. JOURNAL OF SAFETY RESEARCH 2018; 66:141-150. [PMID: 30121100 PMCID: PMC8609819 DOI: 10.1016/j.jsr.2018.07.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/20/2017] [Revised: 04/20/2018] [Accepted: 07/10/2018] [Indexed: 06/08/2023]
Abstract
INTRODUCTION Workers' compensation (WC) insurers offer services and programs for prospective client selection and insured client risk control (RC) purposes. Toward these aims, insurers collect employer data that may include information on types of hazards present in the workplace, safety and health programs and controls in place to prevent injury/illness, and return-to-work programs to reduce injury/illness severity. Despite the potential impact of RC systems on workplace safety and health and the use of RC data in guiding prevention efforts, few research studies on the types of RC services provided to employers or the RC data collected have been published in the peer-reviewed literature. METHODS Researchers conducted voluntary interviews with nine private and state-fund WC insurers to collect qualitative information on RC data and systems. RESULTS Insurers provided information describing their RC data, tools, and practices. Unique practices as well as similarities including those related to RC services, policyholder goals, and databases were identified. CONCLUSIONS Insurers collect and store extensive RC data, which have utility for public health research for improving workplace safety and health. PRACTICAL APPLICATIONS Increased public health understanding of RC data and systems and an identification of key collaboration opportunities between insurers and researchers will facilitate increased use of RC data for public health purposes.
Affiliation(s)
- Libby L Moore
- Centers for Disease Control and Prevention, National Institute for Occupational Safety and Health, 1090 Tusculum Ave., Cincinnati, OH 45226, USA.
- Steven J Wurzelbacher
- Centers for Disease Control and Prevention, National Institute for Occupational Safety and Health, 1090 Tusculum Ave., Cincinnati, OH 45226, USA.
- Taylor M Shockey
- Centers for Disease Control and Prevention, National Institute for Occupational Safety and Health, 1090 Tusculum Ave., Cincinnati, OH 45226, USA.
14
Nanda G, Vallmuur K, Lehto M. Improving autocoding performance of rare categories in injury classification: Is more training data or filtering the solution? ACCIDENT; ANALYSIS AND PREVENTION 2018; 110:115-127. [PMID: 29127808 DOI: 10.1016/j.aap.2017.10.020] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 02/06/2017] [Revised: 08/13/2017] [Accepted: 10/21/2017] [Indexed: 06/07/2023]
Abstract
INTRODUCTION Classical Machine Learning (ML) models have been found to assign the external-cause-of-injury codes (E-codes) based on injury narratives with good overall accuracy but often struggle with rare categories, primarily due to a lack of enough training cases and the heavily skewed nature of injury data. In this paper, we have: a) studied the effect of increasing the size of training data on the prediction performance of three classical ML models: Multinomial Naïve Bayes (MNB), Support Vector Machine (SVM) and Logistic Regression (LR), and b) studied the effect of filtering based on prediction strength of the LR model when the model is trained on very small (10,000 cases) and very large (450,000 cases) training sets. METHOD Data from the Queensland Injury Surveillance Unit for the years 2002-2012, categorized into 20 broad E-codes, were used for this study. Eleven randomly chosen training sets of size ranging from 10,000 to 450,000 cases were used to train the ML models, and the prediction performance was analyzed on a prediction set of 50,150 cases. The filtering approach was tested on LR models trained on the smallest and largest training datasets. Sensitivity was used as the performance measure for individual categories. Weighted average sensitivity (WAvg) and Unweighted average sensitivity (UAvg) were used as the measures of overall performance. The filtering approach was also tested for estimating category counts and was compared with the approaches of summing prediction probabilities and counting direct predictions by the ML model. RESULTS The overall performance of all three ML models improved with an increase in the size of training data. The overall sensitivities with maximum training size for LR and SVM models were similar (∼82%), and higher than MNB (76%). For all the ML models, the sensitivities of rare categories improved with increasing training data but they were considerably less than sensitivities of larger categories. 
With increasing training data size, LR and SVM exhibited diminishing improvement in UAvg whereas the improvement was relatively steady in case of MNB. Filtering based on prediction strength of LR model (and manual review of filtered cases) helped in improving the sensitivities of rare categories. A sizeable portion of cases still needed to be filtered even when the LR model was trained on very large training set. For estimating category counts, filtering approach provided best estimates for most E-codes and summing prediction probabilities approach provided better estimates for rare categories. CONCLUSIONS Increasing the size of training data alone cannot solve the problem of poor classification performance on rare categories by ML models. Filtering could be an effective strategy to improve classification performance of rare categories when large training data is not available.
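The two overall measures named in this abstract can be illustrated with a minimal sketch. It assumes that WAvg weights each category's sensitivity by its share of cases and that UAvg is the plain mean across categories; the abstract does not spell out the formulas, and the category names and toy labels below are hypothetical, not data from the study.

```python
from collections import Counter

def sensitivity_by_category(y_true, y_pred):
    """Per-category sensitivity: correctly predicted cases / true cases."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}

def weighted_average(sens, y_true):
    """Assumed WAvg: each category weighted by its share of all cases."""
    totals = Counter(y_true)
    n = len(y_true)
    return sum(sens[c] * totals[c] / n for c in sens)

def unweighted_average(sens):
    """Assumed UAvg: plain mean, so rare categories count as much as common ones."""
    return sum(sens.values()) / len(sens)

# Hypothetical toy data: one common category and one rare category.
y_true = ["fall"] * 8 + ["fire"] * 2
y_pred = ["fall"] * 8 + ["fall", "fire"]

sens = sensitivity_by_category(y_true, y_pred)
wavg = weighted_average(sens, y_true)
uavg = unweighted_average(sens)
print(sens, wavg, uavg)
```

In this toy case the rare category is predicted poorly, which barely lowers the weighted figure but pulls the unweighted one down noticeably. That gap is why an unweighted average is a useful companion measure when categories are heavily skewed.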
Affiliation(s)
- Gaurav Nanda
- School of Industrial Engineering, Purdue University, USA.
- Kirsten Vallmuur
- Current: Australian Centre for Health Services Innovation, School of Public Health and Social Work, Queensland University of Technology, Australia; Formerly: Centre for Accident Research and Road Safety-Queensland, School of Psychology and Counselling, Queensland University of Technology, Australia
- Mark Lehto
- School of Industrial Engineering, Purdue University, USA
|
15
|
Goh YM, Ubeynarayana CU. Construction accident narrative classification: An evaluation of text mining techniques. ACCIDENT; ANALYSIS AND PREVENTION 2017; 108:122-130. [PMID: 28865927 DOI: 10.1016/j.aap.2017.08.026] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 05/15/2017] [Revised: 07/24/2017] [Accepted: 08/26/2017] [Indexed: 06/07/2023]
Abstract
Learning from past accidents is fundamental to accident prevention. Thus, accident and near miss reporting are encouraged by organizations and regulators. However, for organizations managing large safety databases, the time taken to accurately classify accident and near miss narratives will be very significant. This study aims to evaluate the utility of various text mining classification techniques in classifying 1000 publicly available construction accident narratives obtained from the US OSHA website. The study evaluated six machine learning algorithms, including support vector machine (SVM), linear regression (LR), random forest (RF), k-nearest neighbor (KNN), decision tree (DT) and Naive Bayes (NB), and found that SVM produced the best performance in classifying the test set of 251 cases. Further experimentation with tokenization of the processed text and non-linear SVM were also conducted. In addition, a grid search was conducted on the hyperparameters of the SVM models. It was found that the best performing classifiers were linear SVM with unigram tokenization and radial basis function (RBF) SVM with uni-gram tokenization. In view of its relative simplicity, the linear SVM is recommended. Across the 11 labels of accident causes or types, the precision of the linear SVM ranged from 0.5 to 1, recall ranged from 0.36 to 0.9 and F1 score was between 0.45 and 0.92. The reasons for misclassification were discussed and suggestions on ways to improve the performance were provided.
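The per-label precision, recall and F1 figures quoted in this abstract follow the standard definitions, which can be sketched as below. The helper name `precision_recall_f1` and the toy labels are hypothetical and only illustrate the arithmetic; they are not the study's code or data.

```python
def precision_recall_f1(y_true, y_pred, label):
    """Standard one-vs-rest precision, recall and F1 for a single label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical toy labels for five narratives.
y_true = ["falls", "falls", "falls", "struck", "struck"]
y_pred = ["falls", "falls", "struck", "struck", "falls"]

p, r, f1 = precision_recall_f1(y_true, y_pred, "falls")
print(p, r, f1)
```

Reporting the three values per label, rather than a single accuracy figure, shows which accident types a classifier confuses even when overall accuracy looks acceptable.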
Affiliation(s)
- Yang Miang Goh
- Safety and Resilience Research Unit (SaRRU), Dept. of Building, School of Design and Environment, National Univ. of Singapore, 4 Architecture Dr., 117566, Singapore.
- C U Ubeynarayana
- Safety and Resilience Research Unit (SaRRU), Dept. of Building, School of Design and Environment, National Univ. of Singapore, 4 Architecture Dr., 117566, Singapore
|
16
|
Wearable Devices for Classification of Inadequate Posture at Work Using Neural Networks. SENSORS 2017; 17:s17092003. [PMID: 28862665 PMCID: PMC5621084 DOI: 10.3390/s17092003] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 07/31/2017] [Revised: 08/20/2017] [Accepted: 08/30/2017] [Indexed: 11/17/2022]
Abstract
Inadequate postures adopted by an operator at work are among the most important risk factors in Work-related Musculoskeletal Disorders (WMSDs). Although several studies have focused on inadequate posture, there is limited information on its identification in a work context. The aim of this study is to automatically differentiate between adequate and inadequate postures using two wearable devices (helmet and instrumented insole) with an inertial measurement unit (IMU) and force sensors. From the force sensors located inside the insole, the center of pressure (COP) is computed since it is considered an important parameter in the analysis of posture. In a first step, a set of 60 features is computed with a direct approach, and later reduced to eight via a hybrid feature selection. A neural network is then employed to classify the current posture of a worker, yielding a recognition rate of 90%. In a second step, an innovative graphic approach is proposed to extract three additional features for the classification. This approach represents the main contribution of this study. Combining both approaches improves the recognition rate to 95%. Our results suggest that neural network could be applied successfully for the classification of adequate and inadequate posture.
|
17
|
|
18
|
Scott E, Bell E, Hirabayashi L, Krupa N, Jenkins P. Trends in Nonfatal Agricultural Injury in Maine and New Hampshire: Results From a Low-Cost Passive Surveillance System. J Agromedicine 2017; 22:109-117. [DOI: 10.1080/1059924x.2017.1282908] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 10/20/2022]
Affiliation(s)
- Erika Scott
- Northeast Center for Occupational Health and Safety in Agriculture, Forestry, and Fishing, New York Center for Agricultural Medicine and Health, Bassett Healthcare Network, Cooperstown, New York, USA
- Erin Bell
- Department of Environmental Health Sciences, School of Public Health, University at Albany, Rensselaer, New York, USA
- Liane Hirabayashi
- Northeast Center for Occupational Health and Safety in Agriculture, Forestry, and Fishing, New York Center for Agricultural Medicine and Health, Bassett Healthcare Network, Cooperstown, New York, USA
- Bassett Research Institute, Bassett Healthcare Network, Cooperstown, New York, USA
- Nicole Krupa
- Bassett Research Institute, Bassett Healthcare Network, Cooperstown, New York, USA
- Paul Jenkins
- Bassett Research Institute, Bassett Healthcare Network, Cooperstown, New York, USA
|
19
|
Jacinto C, Santos FP, Guedes Soares C, Silva SA. Assessing the coding reliability of work accidents statistical data: How coders make a difference. JOURNAL OF SAFETY RESEARCH 2016; 59:9-21. [PMID: 27847003 DOI: 10.1016/j.jsr.2016.09.005] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 02/25/2016] [Revised: 06/24/2016] [Accepted: 09/28/2016] [Indexed: 06/06/2023]
Abstract
INTRODUCTION This study assesses the reliability of the coding procedure for a set of variables belonging to the European Statistics of Accidents at Work (ESAW). The work focused on the Portuguese data and experience with the system. In Portugal, this task has been systematically carried out by GEP (the governmental Cabinet for Strategy and Planning), here defined as the "reference group" or "expert group." However, it is anticipated that this coding task will be performed by non-expert people, since paper-forms will be replaced by e-forms, similarly to what happened in a few EU countries. OBJECTIVE This study aims to: (a) assess the current situation, that is, to quantify reliability of data coded by GEP (reference group), and (b) assess the impact on the reliability level when the coding is carried out by non-experts (two different groups of coders). METHODS The study comprises the estimation of both intercoder and intracoder reliability for a set of 8 nominal variables. The assessment applies 3 reliability coefficients calculated by 3 software packages. RESULTS The results reveal that the expert group (GEP) holds good to excellent reliability (inter- and intracoder agreements), between 68-98%, while there is a considerable "loss of reliability" (-5% to -39%) when the coding process is transferred to other people, without special training or knowledge in this task. CONCLUSIONS This work gives quantified evidence that reliability of coding accident data is substantially affected by the coders' profile. Moreover, certain variables, regardless of the coder, systematically hold a higher level of coding reliability than others, suggesting that certain codes may need improvement. Future studies should assess coding quality across the EU countries using the ESAW protocol. PRACTICAL APPLICATIONS Directions for improving the quality of accident data and related statistics; data that is used by researchers and governmental decision-makers to derive prevention strategies.
Affiliation(s)
- Celeste Jacinto
- UNIDEMI, Department of Mechanical and Industrial Engineering, Faculty of Science and Technology, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal.
- Fernando P Santos
- Centre for Marine Technology and Ocean Engineering (CENTEC), Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais, 1049-001 Lisboa, Portugal.
- Carlos Guedes Soares
- Centre for Marine Technology and Ocean Engineering (CENTEC), Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais, 1049-001 Lisboa, Portugal.
- Sílvia A Silva
- Instituto Universitário de Lisboa (ISCTE-IUL), ISCTE Business School (IBS), Business Research Unit (BRU-IUL), Av. das Forças Armadas, Edifício ISCTE, 1649-026 Lisboa, Portugal.
|
20
|
Wurzelbacher SJ, Al-Tarawneh IS, Meyers AR, Bushnell P, Lampl MP, Robins DC, Tseng CY, Wei C, Bertke SJ, Raudabaugh JA, Haviland TM, Schnorr TM. Development of methods for using workers' compensation data for surveillance and prevention of occupational injuries among State-insured private employers in Ohio. Am J Ind Med 2016; 59:1087-1104. [PMID: 27667651 DOI: 10.1002/ajim.22653] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Accepted: 08/16/2016] [Indexed: 11/09/2022]
Abstract
BACKGROUND Workers' compensation (WC) claims data may be useful for identifying high-risk industries and developing prevention strategies. METHODS WC claims data from private-industry employers insured by the Ohio state-based workers' compensation carrier from 2001 to 2011 were linked with the state's unemployment insurance (UI) data on the employer's industry and number of employees. National Labor Productivity and Costs survey data were used to adjust UI data and estimate full-time equivalents (FTE). Rates of WC claims per 100 FTE were computed and Poisson regression was used to evaluate differences in rates. RESULTS Most industries showed substantial claim count and rate reductions from 2001 to 2008, followed by a leveling or slight increase in claim count and rate from 2009 to 2011. Despite reductions, there were industry groups that had consistently higher rates. CONCLUSION WC claims data linked to employment data could be used to prioritize industries for injury research and prevention activities among State-insured private employers. Am. J. Ind. Med. 59:1087-1104, 2016. © 2016 Wiley Periodicals, Inc.
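The rate measure described here, claims per 100 FTE, is a simple ratio. The sketch below shows only that final step and assumes the FTE estimate is already available; how the study adjusted UI employment counts into FTE is not detailed in the abstract and is not reproduced. The function name `claim_rate` and the numbers are hypothetical.

```python
def claim_rate(claims, fte, per=100):
    """Claims per `per` full-time equivalents (FTE)."""
    if fte <= 0:
        raise ValueError("fte must be positive")
    return claims / fte * per

# Hypothetical industry group: 150 claims among an estimated 5,000 FTE.
rate_per_100 = claim_rate(150, 5000)
rate_per_10k = claim_rate(150, 5000, per=10_000)
print(rate_per_100, rate_per_10k)
```

The `per` parameter is there because studies in this list report rates on different bases, for example per 100 FTE here and per 10,000 FTE elsewhere, and figures must be put on the same base before they are compared.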
Affiliation(s)
- Steven J. Wurzelbacher
- Division of Surveillance, Hazard Evaluations, and Field Studies, Center for Workers’ Compensation Studies; National Institute for Occupational Safety and Health; Cincinnati Ohio
- Alysha R. Meyers
- Division of Surveillance, Hazard Evaluations, and Field Studies, Center for Workers’ Compensation Studies; National Institute for Occupational Safety and Health; Cincinnati Ohio
- P.Timothy Bushnell
- Division of Surveillance, Hazard Evaluations, and Field Studies, Center for Workers’ Compensation Studies; National Institute for Occupational Safety and Health; Cincinnati Ohio
- Michael P. Lampl
- Division of Safety and Hygiene; Ohio Bureau of Workers’ Compensation; Columbus Ohio
- David C. Robins
- Division of Safety and Hygiene; Ohio Bureau of Workers’ Compensation; Columbus Ohio
- Chih-Yu Tseng
- Division of Surveillance, Hazard Evaluations, and Field Studies, Center for Workers’ Compensation Studies; National Institute for Occupational Safety and Health; Cincinnati Ohio
- Chia Wei
- Division of Surveillance, Hazard Evaluations, and Field Studies, Center for Workers’ Compensation Studies; National Institute for Occupational Safety and Health; Cincinnati Ohio
- Stephen J. Bertke
- Division of Surveillance, Hazard Evaluations, and Field Studies, Center for Workers’ Compensation Studies; National Institute for Occupational Safety and Health; Cincinnati Ohio
- Thomas M. Haviland
- Division of Surveillance, Hazard Evaluations, and Field Studies, Center for Workers’ Compensation Studies; National Institute for Occupational Safety and Health; Cincinnati Ohio
- Teresa M. Schnorr
- Division of Surveillance, Hazard Evaluations, and Field Studies, Center for Workers’ Compensation Studies; National Institute for Occupational Safety and Health; Cincinnati Ohio
|
21
|
Yamin SC, Bejan A, Parker DL, Xi M, Brosseau LM. Analysis of workers' compensation claims data for machine-related injuries in metal fabrication businesses. Am J Ind Med 2016; 59:656-64. [PMID: 27195962 DOI: 10.1002/ajim.22603] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Accepted: 04/21/2016] [Indexed: 01/09/2023]
Abstract
BACKGROUND Metal fabrication workers are at high risk for machine-related injury. Apart from amputations, data on factors contributing to this problem are generally absent. METHODS Narrative text analysis was performed on workers' compensation claims in order to identify machine-related injuries and determine work tasks involved. Data were further evaluated on the basis of cost per claim, nature of injury, and part of body. RESULTS From an initial set of 4,268 claims, 1,053 were classified as machine-related. Frequently identified tasks included machine operation (31%), workpiece handling (20%), setup/adjustment (15%), and removing chips (12%). Lacerations to finger(s), hand, or thumb comprised 38% of machine-related injuries; foreign body in the eye accounted for 20%. Amputations were relatively rare but had highest costs per claim (mean $21,059; median $11,998). CONCLUSIONS Despite limitations, workers' compensation data were useful in characterizing machine-related injuries. Improving the quality of data collected by insurers would enhance occupational injury surveillance and prevention efforts. Am. J. Ind. Med. 59:656-664, 2016. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Anca Bejan
- HealthPartners Institute; Bloomington Minnesota
- Min Xi
- HealthPartners Institute; Bloomington Minnesota
- Lisa M. Brosseau
- Division of Environmental and Occupational Health Sciences, School of Public Health; University of Illinois, Chicago; Chicago Illinois
|
22
|
Chen W, Wheeler KK, Lin S, Huang Y, Xiang H. Computerized "Learn-As-You-Go" classification of traumatic brain injuries using NEISS narrative data. ACCIDENT; ANALYSIS AND PREVENTION 2016; 89:111-117. [PMID: 26851618 PMCID: PMC5119271 DOI: 10.1016/j.aap.2016.01.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2015] [Revised: 01/08/2016] [Accepted: 01/21/2016] [Indexed: 06/05/2023]
Abstract
One important routine task in injury research is to effectively classify injury circumstances into user-defined categories when using narrative text. However, traditional manual processes can be time consuming, and existing batch learning systems can be difficult to utilize by novice users. This study evaluates a "Learn-As-You-Go" machine-learning program. When using this program, the user trains classification models and interactively checks on accuracy until a desired threshold is reached. We examined the narrative text of traumatic brain injuries (TBIs) in the National Electronic Injury Surveillance System (NEISS) and classified TBIs into sport and non-sport categories. Our results suggest that the DUALIST "Learn-As-You-Go" program, which features a user-friendly online interface, is effective in injury narrative classification. In our study, the time frame to classify tens of thousands of narratives was reduced from a few days to minutes after approximately sixty minutes of training.
Affiliation(s)
- Wei Chen
- Research Information Solutions and Innovation, The Research Institute at Nationwide Children's Hospital, Columbus, OH, USA
- Krista K Wheeler
- Center for Injury Research and Policy, The Research Institute at Nationwide Children's Hospital, Columbus, OH, USA; Center for Pediatric Trauma Research, Nationwide Children's Hospital, Columbus, OH, USA
- Simon Lin
- Research Information Solutions and Innovation, The Research Institute at Nationwide Children's Hospital, Columbus, OH, USA
- Yungui Huang
- Research Information Solutions and Innovation, The Research Institute at Nationwide Children's Hospital, Columbus, OH, USA
- Huiyun Xiang
- Center for Injury Research and Policy, The Research Institute at Nationwide Children's Hospital, Columbus, OH, USA; Center for Pediatric Trauma Research, Nationwide Children's Hospital, Columbus, OH, USA.
|
23
|
Bertke SJ, Meyers AR, Wurzelbacher SJ, Measure A, Lampl MP, Robins D. Comparison of methods for auto-coding causation of injury narratives. ACCIDENT; ANALYSIS AND PREVENTION 2016; 88:117-123. [PMID: 26745274 PMCID: PMC4915551 DOI: 10.1016/j.aap.2015.12.006] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 08/27/2015] [Revised: 11/13/2015] [Accepted: 12/07/2015] [Indexed: 05/30/2023]
Abstract
Manually reading free-text narratives in large databases to identify the cause of an injury can be very time consuming and recently, there has been much work in automating this process. In particular, the variations of the naïve Bayes model have been used to successfully auto-code free text narratives describing the event/exposure leading to the injury of a workers' compensation claim. This paper compares the naïve Bayes model with an alternative logistic model and found that this new model outperformed the naïve Bayesian model. Further modest improvements were found through the addition of sequences of keywords in the models as opposed to consideration of only single keywords. The programs and weights used in this paper are available upon request to researchers without a training set wishing to automatically assign event codes to large data-sets of text narratives. The utility of sharing this program was tested on an outside set of injury narratives provided by the Bureau of Labor Statistics with promising results.
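The idea of adding sequences of keywords alongside single keywords can be sketched as a small feature extractor that emits unigrams plus short word sequences. This is an illustrative assumption about what such features might look like, not the authors' program; the function name `keyword_features`, the whitespace tokenization and the example narrative are hypothetical.

```python
def keyword_features(narrative, max_n=2):
    """Return single keywords plus contiguous keyword sequences up to max_n words."""
    tokens = narrative.lower().split()
    features = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            features.append(" ".join(tokens[i:i + n]))
    return features

# Hypothetical narrative text.
feats = keyword_features("Fell off ladder")
print(feats)
```

A sequence such as "off ladder" carries context that the separate words "off" and "ladder" do not, which is one plausible reason sequence features could give the modest gains the abstract reports.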
Affiliation(s)
- S J Bertke
- National Institute for Occupational Safety and Health, Division of Surveillance, Hazard Evaluations, and Field Studies, Industrywide Studies Branch, 1090 Tusculum Ave, Cincinnati, OH 45226, United States.
- A R Meyers
- National Institute for Occupational Safety and Health, Division of Surveillance, Hazard Evaluations, and Field Studies, Industrywide Studies Branch, Center for Workers' Compensation Studies, 1090 Tusculum Ave, Cincinnati, OH 45226, United States
- S J Wurzelbacher
- National Institute for Occupational Safety and Health, Division of Surveillance, Hazard Evaluations, and Field Studies, Industrywide Studies Branch, Center for Workers' Compensation Studies, 1090 Tusculum Ave, Cincinnati, OH 45226, United States
- A Measure
- Bureau of Labor Statistics, Occupational Safety and Health Statistics, 2 Massachusetts Avenue, Washington, DC 20212, United States
- M P Lampl
- Ohio Bureau of Workers' Compensation, Division of Safety & Hygiene, 13430 Yarmouth Drive, Pickerington, OH 43147, United States
- D Robins
- Ohio Bureau of Workers' Compensation, Division of Safety & Hygiene, 13430 Yarmouth Drive, Pickerington, OH 43147, United States
|
24
|
Vallmuur K, Marucci-Wellman HR, Taylor JA, Lehto M, Corns HL, Smith GS. Harnessing information from injury narratives in the 'big data' era: understanding and applying machine learning for injury surveillance. Inj Prev 2016; 22 Suppl 1:i34-42. [PMID: 26728004 DOI: 10.1136/injuryprev-2015-041813] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 08/31/2015] [Accepted: 12/08/2015] [Indexed: 11/03/2022]
Abstract
OBJECTIVE Vast amounts of injury narratives are collected daily and are available electronically in real time and have great potential for use in injury surveillance and evaluation. Machine learning algorithms have been developed to assist in identifying cases and classifying mechanisms leading to injury in a much timelier manner than is possible when relying on manual coding of narratives. The aim of this paper is to describe the background, growth, value, challenges and future directions of machine learning as applied to injury surveillance. METHODS This paper reviews key aspects of machine learning using injury narratives, providing a case study to demonstrate an application to an established human-machine learning approach. RESULTS The range of applications and utility of narrative text has increased greatly with advancements in computing techniques over time. Practical and feasible methods exist for semiautomatic classification of injury narratives which are accurate, efficient and meaningful. The human-machine learning approach described in the case study achieved high sensitivity and PPV and reduced the need for human coding to less than a third of cases in one large occupational injury database. CONCLUSIONS The last 20 years have seen a dramatic change in the potential for technological advancements in injury surveillance. Machine learning of 'big injury narrative data' opens up many possibilities for expanded sources of data which can provide more comprehensive, ongoing and timely surveillance to inform future injury prevention policy and practice.
Affiliation(s)
- Kirsten Vallmuur
- Queensland University of Technology, Centre for Accident Research and Road Safety-Queensland, Brisbane, Queensland, Australia
- Helen R Marucci-Wellman
- Center for Injury Epidemiology, Liberty Mutual Research Institute for Safety, Hopkinton, Massachusetts, USA
- Jennifer A Taylor
- Department of Environmental & Occupational Health, School of Public Health, Drexel University, Philadelphia, Pennsylvania, USA
- Mark Lehto
- School of Industrial Engineering, Purdue University, West Lafayette, Indiana, USA
- Helen L Corns
- Center for Injury Epidemiology, Liberty Mutual Research Institute for Safety, Hopkinton, Massachusetts, USA
- Gordon S Smith
- National Center for Trauma and EMS, University of Maryland School of Medicine, Baltimore, Maryland, USA
|
25
|
Marucci-Wellman HR, Lehto MR, Corns HL. A practical tool for public health surveillance: Semi-automated coding of short injury narratives from large administrative databases using Naïve Bayes algorithms. ACCIDENT; ANALYSIS AND PREVENTION 2015; 84:165-176. [PMID: 26412196 DOI: 10.1016/j.aap.2015.06.014] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 04/29/2015] [Accepted: 06/30/2015] [Indexed: 06/05/2023]
Abstract
Public health surveillance programs in the U.S. are undergoing landmark changes with the availability of electronic health records and advancements in information technology. Injury narratives gathered from hospital records, workers compensation claims or national surveys can be very useful for identifying antecedents to injury or emerging risks. However, classifying narratives manually can become prohibitive for large datasets. The purpose of this study was to develop a human-machine system that could be relatively easily tailored to routinely and accurately classify injury narratives from large administrative databases such as workers compensation. We used a semi-automated approach based on two Naïve Bayesian algorithms to classify 15,000 workers compensation narratives into two-digit Bureau of Labor Statistics (BLS) event (leading to injury) codes. Narratives were filtered out for manual review if the algorithms disagreed or made weak predictions. This approach resulted in an overall accuracy of 87%, with consistently high positive predictive values across all two-digit BLS event categories including the very small categories (e.g., exposure to noise, needle sticks). The Naïve Bayes algorithms were able to identify and accurately machine code most narratives leaving only 32% (4853) for manual review. This strategy substantially reduces the need for resources compared with manual review alone.
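The filtering rule described in this abstract (send a narrative to manual review when the two algorithms disagree or when a prediction is weak) can be sketched as a routing function. This is a minimal sketch, not the study's implementation: the function name `route_narrative`, the 0.9 strength threshold and the example codes are hypothetical, since the abstract does not state how "weak" was defined.

```python
def route_narrative(pred_a, strength_a, pred_b, strength_b, threshold=0.9):
    """Route one narrative to machine coding or manual review.

    pred_a / pred_b: event codes predicted by two separate classifiers.
    strength_a / strength_b: each classifier's prediction strength in [0, 1].
    threshold: hypothetical cut-off below which a prediction counts as weak.
    """
    if pred_a != pred_b:
        return "manual"   # the two algorithms disagree
    if min(strength_a, strength_b) < threshold:
        return "manual"   # agreement, but at least one prediction is weak
    return "machine"

# Hypothetical predictions for three narratives.
routes = [
    route_narrative("42", 0.97, "42", 0.95),  # agree, both strong
    route_narrative("42", 0.97, "43", 0.96),  # disagree
    route_narrative("42", 0.97, "42", 0.60),  # agree, one weak
]
print(routes)

# Share left for manual review, using the counts reported in the abstract.
manual_share = 4853 / 15000
print(round(manual_share * 100))
```

Raising the threshold sends more narratives to human coders and should leave fewer machine-coded errors; lowering it does the opposite, so the cut-off is a trade-off between review workload and accuracy.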
Affiliation(s)
- Helen R Marucci-Wellman
- Center for Injury Epidemiology, Liberty Mutual Research Institute for Safety, Hopkinton, MA, USA.
- Mark R Lehto
- School of Industrial Engineering, Purdue University, West Lafayette, IN, USA
- Helen L Corns
- Center for Injury Epidemiology, Liberty Mutual Research Institute for Safety, Hopkinton, MA, USA
|
26
|
Vallmuur K. Machine learning approaches to analysing textual injury surveillance data: a systematic review. ACCIDENT; ANALYSIS AND PREVENTION 2015; 79:41-49. [PMID: 25795924 DOI: 10.1016/j.aap.2015.03.018] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 05/14/2014] [Revised: 12/01/2014] [Accepted: 03/12/2015] [Indexed: 06/04/2023]
Abstract
OBJECTIVE To synthesise recent research on the use of machine learning approaches to mining textual injury surveillance data. DESIGN Systematic review. DATA SOURCES The electronic databases which were searched included PubMed, Cinahl, Medline, Google Scholar, and Proquest. The bibliography of all relevant articles was examined and associated articles were identified using a snowballing technique. SELECTION CRITERIA For inclusion, articles were required to meet the following criteria: (a) used a health-related database, (b) focused on injury-related cases, AND used machine learning approaches to analyse textual data. METHODS The papers identified through the search were screened resulting in 16 papers selected for review. Articles were reviewed to describe the databases and methodology used, the strength and limitations of different techniques, and quality assurance approaches used. Due to heterogeneity between studies meta-analysis was not performed. RESULTS Occupational injuries were the focus of half of the machine learning studies and the most common methods described were Bayesian probability or Bayesian network based methods to either predict injury categories or extract common injury scenarios. Models were evaluated through either comparison with gold standard data or content expert evaluation or statistical measures of quality. Machine learning was found to provide high precision and accuracy when predicting a small number of categories, was valuable for visualisation of injury patterns and prediction of future outcomes. However, difficulties related to generalizability, source data quality, complexity of models and integration of content and technical knowledge were discussed. CONCLUSIONS The use of narrative text for injury surveillance has grown in popularity, complexity and quality over recent years. 
With advances in data mining techniques, increased capacity for analysis of large databases, and involvement of computer scientists in the injury prevention field, along with more comprehensive use and description of quality assurance methods in text mining approaches, it is likely that we will see a continued growth and advancement in knowledge of text mining in the injury field.
Affiliation(s)
- Kirsten Vallmuur
- Centre for Accident Research and Road Safety - Queensland, School of Psychology and Counselling, Faculty of Health, Queensland University of Technology, Kelvin Grove 4059, Brisbane, Queensland, Australia.
|
27
|
Abstract
Narrative text is a useful way of identifying injury circumstances from the routine emergency department data collections. Automatically classifying narratives based on machine learning techniques is a promising technique, which can consequently reduce the tedious manual classification process. Existing works focus on using Naive Bayes which does not always offer the best performance. This paper proposes the Matrix Factorization approaches along with a learning enhancement process for this task. The results are compared with the performance of various other classification approaches. The impact on the classification results from the parameters setting during the classification of a medical text dataset is discussed. With the selection of right dimension k, Non Negative Matrix Factorization-model method achieves 10 CV accuracy of 0.93.
|
28
|
Abstract
The objective of this study was to examine the relationship between slip, trip and fall injuries and obesity in a population of workers at the Idaho National Laboratory (INL) in Idaho Falls, Idaho. INL is an applied engineering facility dedicated to supporting the US Department of Energy's mission. An analysis was performed on injuries reported to the INL Medical Clinic to determine whether obesity was related to an increase in slip, trip and fall injuries. Records were analysed that spanned a 6-year period (2005-2010), and included 8581 employees (mean age, 47 ± 11 years and body mass index [BMI], 29 ± 5 kg/m²; 34% obesity rate). Of the 189 people who reported slip, trip and fall injuries (mean age, 48 ± 11 years), 51% were obese (P < 0.001 compared with uninjured employees), and their mean BMI was 31 ± 6 kg/m² (P < 0.001). Obesity in this population was associated with a greater rate of slip, trip and fall injuries.
Affiliation(s)
- Gabriel A Koepp
- Idaho National Laboratory, Department of Occupational Medicine, Idaho Falls, ID, USA
|
29
|
Beery L, Harris JR, Collins JW, Current RS, Amendola AA, Meyers AR, Wurzelbacher SJ, Lampl M, Bertke SJ. Occupational injuries in Ohio wood product manufacturing: a descriptive analysis with emphasis on saw-related injuries and associated causes. Am J Ind Med 2014; 57:1265-75. [PMID: 25123487 DOI: 10.1002/ajim.22360] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Accepted: 05/21/2014] [Indexed: 11/05/2022]
Abstract
BACKGROUND Stationary sawing machinery is often a basic tool in the wood product manufacturing industry and was the source for over 2,500 injury/illness events that resulted in days away from work in 2010. METHODS We examined 9 years of workers' compensation claims for the state of Ohio in wood product manufacturing with specific attention to saw-related claims. For the study period, 8,547 claims were evaluated; from this group, 716 saw-related cases were examined. RESULTS The sawmills and wood preservation sub-sector experienced a 71% reduction in average incidence rate and an 87% reduction in average lost-time incidence rate from 2001 to 2009. The top three injury category descriptions for lost-time incidents within saw-related claims were fracture (35.8%), open wounds (29.6%), and amputation (14.8%). CONCLUSIONS For saw-related injuries, preventing blade contact remains important but securing the work piece to prevent kickback is also important.
Affiliation(s)
- Lindsay Beery
- Emory Department of Emergency Medicine; Emory Center for Injury Control; Atlanta Georgia
- James R. Harris
- Division of Safety Research; National Institute for Occupational Safety and Health
- James W. Collins
- Division of Safety Research; National Institute for Occupational Safety and Health
- Richard S. Current
- Division of Safety Research; National Institute for Occupational Safety and Health
- Alfred A. Amendola
- Division of Safety Research; National Institute for Occupational Safety and Health
- Alysha R. Meyers
- Division of Surveillance; Hazard Evaluations and Field Studies; National Institute for Occupational Safety and Health
- Steven J. Wurzelbacher
- Division of Surveillance; Hazard Evaluations and Field Studies; National Institute for Occupational Safety and Health
- Mike Lampl
- Division of Safety and Hygiene; Ohio Bureau of Workers' Compensation
- Stephen J. Bertke
- Division of Surveillance; Hazard Evaluations and Field Studies; National Institute for Occupational Safety and Health
|
30
|
Oleinick A. The association of the original OSHA chemical hazard communication standard with reductions in acute work injuries/illnesses in private industry and the industrial releases of chemical carcinogens. Am J Ind Med 2014; 57:138-52. [PMID: 24243005 DOI: 10.1002/ajim.22269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/23/2013] [Indexed: 11/09/2022]
Abstract
BACKGROUND OSHA predicted the original chemical Hazard Communication Standard (HCS) would cumulatively reduce the lost workday acute injury/illness rate for exposure events by 20% over 20 years and reduce exposure to chemical carcinogens. METHODS JoinPoint trend software identified changes in the rate of change of BLS rates for days away from work for acute injuries/illnesses during 1992-2009 for manufacturing and nonmanufacturing industries for both chemical, noxious or allergenic injury exposure events and All other exposure events. The annual percent change in the rates was used to adjust observed numbers of cases to estimate their association with the standard. A case-control study of EPA's Toxic Release Inventory 1988-2009 data compared carcinogen and non-carcinogens' releases. RESULTS The study estimates that the HCS was associated with a reduction in the number of acute injuries/illnesses due to chemical injury exposure events over the background rate in the range 107,569-459,395 (Hudson method/modified BIC model) depending on whether the HCS is treated as a marginal or sole factor in the decrease. Carcinogen releases have declined at a substantially faster rate than control non-carcinogens. DISCUSSION The previous HCS standard was associated with significant reductions in chemical event acute injuries/illnesses and chemical carcinogen exposures.
Affiliation(s)
- Arthur Oleinick
- Department of Environmental Health Sciences; School of Public Health; University of Michigan; Ann Arbor Michigan
|
31
|
Taylor JA, Lacovara AV, Smith GS, Pandian R, Lehto M. Near-miss narratives from the fire service: a Bayesian analysis. ACCIDENT; ANALYSIS AND PREVENTION 2014; 62:119-129. [PMID: 24144497 DOI: 10.1016/j.aap.2013.09.012] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 06/11/2013] [Revised: 09/04/2013] [Accepted: 09/17/2013] [Indexed: 06/02/2023]
Abstract
BACKGROUND In occupational safety research, narrative text analysis has been combined with coded surveillance data to improve identification and understanding of injuries and their circumstances. Injury data give information about incidence and the direct cause of an injury, while near-miss data enable the identification of various hazards within an organization or industry. Further, near-miss data provide an opportunity for surveillance and risk reduction. The National Firefighter Near-Miss Reporting System (NFFNMRS) is a voluntary reporting system that collects narrative text data on near-miss and injurious events within the fire and emergency services industry. In recent research, autocoding techniques using Bayesian models have been used to categorize/code injury narratives with up to 90% accuracy, thereby reducing the amount of human effort required to manually code large datasets. Autocoding techniques have not yet been applied to near-miss narrative data. METHODS We manually assigned mechanism of injury codes to previously un-coded narratives from the NFFNMRS and used this as a training set to develop two Bayesian autocoding models, Fuzzy and Naïve. We calculated sensitivity, specificity and positive predictive value for both models. We also evaluated the effect of training set size on prediction sensitivity and compared the models' predictive ability as related to injury outcome. We cross-validated a subset of the prediction set for accuracy of the model predictions. RESULTS Overall, the Fuzzy model performed better than Naïve, with a sensitivity of 0.74 compared to 0.678. Where Fuzzy and Naïve shared the same prediction, the cross-validation showed a sensitivity of 0.602. As the number of records in the training set increased, the models performed at a higher sensitivity, suggesting that both the Fuzzy and Naïve models were essentially "learning". Injury records were predicted with greater sensitivity than near-miss records.
CONCLUSION We conclude that the application of Bayesian autocoding methods can successfully code both near misses and injuries in longer-than-average narratives with non-specific prompts regarding injury. Such coding allowed for the creation of two new quantitative data elements for injury outcome and injury mechanism.
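To make the autocoding idea in this abstract concrete, the sketch below trains a generic multinomial Naive Bayes classifier on a few manually coded narratives and then computes sensitivity, specificity and positive predictive value for one code. It is an illustration only and is not the authors' Fuzzy or Naïve model. The narratives, the mechanism codes ("fall", "struck-by") and all function names are invented for the example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(narratives, codes):
    """Count word frequencies per code from manually coded narratives."""
    word_counts = defaultdict(Counter)
    code_counts = Counter(codes)
    vocab = set()
    for text, code in zip(narratives, codes):
        words = text.lower().split()
        word_counts[code].update(words)
        vocab.update(words)
    return word_counts, code_counts, vocab


def predict(text, model):
    """Return the code with the highest log posterior for a narrative."""
    word_counts, code_counts, vocab = model
    total_docs = sum(code_counts.values())
    best_code, best_score = None, -math.inf
    for code, n_docs in code_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[code].values())
        for word in text.lower().split():
            if word in vocab:  # skip words never seen in training
                # Laplace (add-one) smoothing avoids zero probabilities
                score += math.log(
                    (word_counts[code][word] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_code, best_score = code, score
    return best_code


def metrics_for_code(actual, predicted, code):
    """Sensitivity, specificity and PPV for one code (None if undefined)."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == code and p == code for a, p in pairs)
    fn = sum(a == code and p != code for a, p in pairs)
    fp = sum(a != code and p == code for a, p in pairs)
    tn = sum(a != code and p != code for a, p in pairs)
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else None,
        "specificity": tn / (tn + fp) if tn + fp else None,
        "ppv": tp / (tp + fp) if tp + fp else None,
    }


# Hypothetical training set: invented narratives with invented codes.
train_texts = [
    "firefighter slipped on ice near engine",
    "crew member tripped over hose and fell",
    "roof collapsed during interior attack",
    "ceiling collapsed onto crew inside structure",
]
train_codes = ["fall", "fall", "struck-by", "struck-by"]
model = train_naive_bayes(train_texts, train_codes)

# Hypothetical prediction set with manual codes for comparison.
test_texts = ["firefighter slipped on wet stairs", "roof collapsed on crew"]
test_codes = ["fall", "struck-by"]
predictions = [predict(text, model) for text in test_texts]
fall_metrics = metrics_for_code(test_codes, predictions, "fall")
print(predictions, fall_metrics)
```

Sensitivity here is the share of narratives manually assigned a code that the model also assigns to that code, which is the quantity the abstract compares across models and training set sizes. A real application would need a far larger training set and proper tokenization than this toy example uses.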
Affiliation(s)
- Jennifer A Taylor
- Department of Environmental & Occupational Health, Drexel University School of Public Health, 1505 Race Street, MS 1034, Philadelphia, PA 19102, United States.
|