Development of a scoring system to quantify errors from semantic characteristics in incident reports

BMJ Health & Care Informatics. IF 4.1; Q1 (Health Care Sciences & Services). Pub Date: 2024-04-01. DOI: 10.1136/bmjhci-2023-100935

Haruhiro Uematsu, Masakazu Uemura, Masaru Kurihara, Hiroo Yamamoto, Tomomi Umemura, Fumimasa Kitano, Mariko Hiramatsu, Yoshimasa Nagao



Abstract

Objectives: Incident reporting systems are widely used to identify risks and enable organisational learning. Free-text descriptions contain important information about factors associated with incidents. This study aimed to develop error scores by extracting information about the presence of error factors in incidents using an original decision-making model that partly relies on natural language processing techniques.

Methods: We retrospectively analysed free-text data from reports of incidents between January 2012 and December 2022 from Nagoya University Hospital, Japan. The sample data were randomly allocated to equal-sized training and validation datasets. We conducted morphological analysis on free text to segment terms from sentences in the training dataset. We calculated error scores for terms, individual reports and reports from staff groups according to report volume size and compared these with conventional classifications by patient safety experts. We also calculated accuracy, recall, precision and F-score values from the proposed ‘report error score’.

Results: Overall, 114 013 reports were included. We calculated 36 131 ‘term error scores’ from the 57 006 reports in the training dataset. There was a significant difference in error scores between reports of incidents categorised by experts as arising from errors (p < 0.001, d = 0.73 (large)) and other incidents. The accuracy, recall, precision and F-score values were 0.8, 0.82, 0.85 and 0.84, respectively. Group error scores were positively associated with expert ratings (correlation coefficient, 0.66; 95% CI 0.54 to 0.75, p < 0.001) for all departments.

Conclusion: Our error scoring system could provide insights to improve patient safety using aggregated incident report data.

The data that support the findings of this study are available from the corresponding author upon reasonable request.
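The abstract reports accuracy, recall, precision and F-score for the ‘report error score’ but does not say how a score is turned into a label. As a minimal sketch only, the Python below assumes a simple cut-off: a report counts as error-related when its score reaches a threshold, and the resulting labels are compared with expert classifications. The names `classify` and `evaluate`, the threshold of 0.5, and all data values are hypothetical illustrations, not the authors' method or results.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    recall: float
    precision: float
    f_score: float


def classify(scores, threshold):
    """Label a report as error-related (True) when its score reaches the threshold.

    The thresholding rule is an assumption made for this sketch.
    """
    return [s >= threshold for s in scores]


def evaluate(predicted, expert):
    """Compare predicted labels with expert labels using standard definitions."""
    pairs = list(zip(predicted, expert))
    tp = sum(1 for p, e in pairs if p and e)          # true positives
    tn = sum(1 for p, e in pairs if not p and not e)  # true negatives
    fp = sum(1 for p, e in pairs if p and not e)      # false positives
    fn = sum(1 for p, e in pairs if not p and e)      # false negatives
    total = tp + tn + fp + fn

    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F-score here is the harmonic mean of precision and recall (F1).
    f_score = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return Metrics(accuracy, recall, precision, f_score)


# Hypothetical toy data, not taken from the study.
report_scores = [0.9, 0.7, 0.4, 0.2, 0.8, 0.1]
expert_labels = [True, True, True, False, False, False]

metrics = evaluate(classify(report_scores, threshold=0.5), expert_labels)
print(metrics)
```

In a setting like the one described, the threshold would presumably be chosen on the training dataset and the metrics computed on the held-out validation dataset, so that the reported values are not inflated by tuning on the same reports.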


