Development of a scoring system to quantify errors from semantic characteristics in incident reports

BMJ Health & Care Informatics (IF 4.1, Q1, Health Care Sciences & Services) | Pub Date: 2024-04-01 | DOI: 10.1136/bmjhci-2023-100935
Haruhiro Uematsu, Masakazu Uemura, Masaru Kurihara, Hiroo Yamamoto, Tomomi Umemura, Fumimasa Kitano, Mariko Hiramatsu, Yoshimasa Nagao

Abstract

Objectives: Incident reporting systems are widely used to identify risks and enable organisational learning. Free-text descriptions contain important information about factors associated with incidents. This study aimed to develop error scores by extracting information about the presence of error factors in incidents, using an original decision-making model that partly relies on natural language processing techniques.

Methods: We retrospectively analysed free-text data from reports of incidents between January 2012 and December 2022 from Nagoya University Hospital, Japan. The sample data were randomly allocated to equal-sized training and validation datasets. We conducted morphological analysis on free text to segment terms from sentences in the training dataset. We calculated error scores for terms, individual reports and reports from staff groups according to report volume size, and compared these with conventional classifications by patient safety experts. We also calculated accuracy, recall, precision and F-score values from the proposed 'report error score'.

Results: Overall, 114 013 reports were included. We calculated 36 131 'term error scores' from the 57 006 reports in the training dataset. There was a significant difference in error scores between reports of incidents categorised by experts as arising from errors and other incidents (p<0.001, d = 0.73 (large)). The accuracy, recall, precision and F-score values were 0.8, 0.82, 0.85 and 0.84, respectively. Group error scores were positively associated with expert ratings (correlation coefficient, 0.66; 95% CI 0.54 to 0.75, p<0.001) for all departments.

Conclusion: Our error scoring system could provide insights to improve patient safety using aggregated incident report data.

Data availability: The data that support the findings of this study are available from the corresponding author upon reasonable request.
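The four evaluation metrics named in the Methods follow standard definitions, and a minimal Python sketch may help readers relate them to one another. The abstract does not give the confusion-matrix counts or the threshold applied to the report error score, so the counts below are made up for illustration and the function names are hypothetical. The F-score is assumed to be the usual F1 (the harmonic mean of precision and recall).

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives, where "positive" would mean
    "report classified as arising from an error".
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f_score": f_score(precision, recall),
    }


def f_score(precision: float, recall: float) -> float:
    """F1 score: harmonic mean of precision and recall (assumed definition)."""
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only; these are NOT taken from the study.
example = classification_metrics(tp=40, fp=10, fn=10, tn=40)

# F1 recomputed from the rounded precision and recall reported in the abstract.
f1_from_reported = f_score(0.85, 0.82)
```

Recomputing F1 from the rounded precision (0.85) and recall (0.82) gives roughly 0.835, slightly below the reported 0.84. A gap of that size is what rounding of the inputs alone could produce, so the reported figures are mutually consistent under the F1 assumption.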