Development of a scoring system to quantify errors from semantic characteristics in incident reports
Haruhiro Uematsu, Masakazu Uemura, Masaru Kurihara, Hiroo Yamamoto, Tomomi Umemura, Fumimasa Kitano, Mariko Hiramatsu, Yoshimasa Nagao
BMJ Health & Care Informatics, published 2024-04-01
DOI: 10.1136/bmjhci-2023-100935 (https://doi.org/10.1136/bmjhci-2023-100935)
Abstract
Objectives
Incident reporting systems are widely used to identify risks and enable organisational learning. Free-text descriptions contain important information about factors associated with incidents. This study aimed to develop error scores by extracting information about the presence of error factors in incidents using an original decision-making model that partly relies on natural language processing techniques.

Methods
We retrospectively analysed free-text data from reports of incidents between January 2012 and December 2022 from Nagoya University Hospital, Japan. The sample data were randomly allocated to equal-sized training and validation datasets. We conducted morphological analysis on free text to segment terms from sentences in the training dataset. We calculated error scores for terms, individual reports and reports from staff groups according to report volume size and compared these with conventional classifications by patient safety experts. We also calculated accuracy, recall, precision and F-score values from the proposed ‘report error score’.

Results
Overall, 114 013 reports were included. We calculated 36 131 ‘term error scores’ from the 57 006 reports in the training dataset. There was a significant difference in error scores between reports of incidents categorised by experts as arising from errors and other incidents (p<0.001, d=0.73 (large)). The accuracy, recall, precision and F-score values were 0.8, 0.82, 0.85 and 0.84, respectively. Group error scores were positively associated with expert ratings (correlation coefficient, 0.66; 95% CI 0.54 to 0.75, p<0.001) for all departments.

Conclusion
Our error scoring system could provide insights to improve patient safety using aggregated incident report data.

Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
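The four evaluation measures named in the abstract (accuracy, recall, precision and F-score) can all be derived from a 2x2 confusion table that compares a score-based classification with the expert classification. The following is a minimal sketch of those standard definitions, not the authors' code. The class name `ConfusionCounts`, the function `classification_metrics` and the counts used are hypothetical and chosen only for illustration; they are not data from the study. The sketch assumes that the F-score is the balanced F1 measure and that no denominator is zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical 2x2 table: score-based label vs expert label."""
    tp: int  # flagged as error by the score and by the experts
    fp: int  # flagged as error by the score, not by the experts
    fn: int  # not flagged by the score, rated as error by the experts
    tn: int  # not flagged by either


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return accuracy, recall, precision and F1 from confusion counts."""
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total
    recall = c.tp / (c.tp + c.fn)
    precision = c.tp / (c.tp + c.fp)
    # F1 is the harmonic mean of precision and recall.
    f_score = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f_score": f_score,
    }


# Illustrative counts only (assumed, not taken from the paper).
counts = ConfusionCounts(tp=90, fp=10, fn=30, tn=70)
metrics = classification_metrics(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because the F-score is a harmonic mean, it always lies between precision and recall, which is a quick sanity check for any reported set of these values.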