A Transformer-Based Pipeline for German Clinical Document De-Identification.

Applied Clinical Informatics · IF 2.1 · JCR Q4 (Medical Informatics) · CAS Tier 2 (Medicine) · Pub Date: 2025-01-01 · Epub Date: 2025-01-08 · DOI: 10.1055/a-2424-1989
Kamyar Arzideh, Giulia Baldini, Philipp Winnekens, Christoph M Friedrich, Felix Nensa, Ahmad Idrissi-Yaghir, René Hosch
Applied Clinical Informatics, vol. 16, no. 1, pp. 31-43 (2025). Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11710903/pdf/

Abstract

Objective: Commercially available large language models such as Chat Generative Pre-Trained Transformer (ChatGPT) cannot be applied to real patient data for data-protection reasons. At the same time, de-identifying unstructured clinical data is a tedious and time-consuming task when done manually. Since transformer models can efficiently process and analyze large amounts of text, our study explores the impact of a large training dataset on the performance of this task.

Methods: We used a substantial training dataset of 10,240 German hospital documents from 1,130 patients, created as part of the investigating hospital's routine documentation. Our approach involved fine-tuning and training an ensemble of two transformer-based language models simultaneously to identify sensitive data within our documents. Annotation guidelines with specific annotation categories and types were created for annotator training.

Results: Evaluation on a test dataset of 100 manually annotated documents showed that our fine-tuned German ELECTRA (gELECTRA) model achieved an F1 macro-average score of 0.95, surpassing the human annotators, who scored 0.93.

Conclusion: We trained and evaluated transformer models to detect sensitive information in German real-world pathology reports and progress notes. After defining an annotation scheme tailored to the investigating hospital's documents and creating annotation guidelines for staff training, we conducted a further experimental study comparing the models with humans. The best-performing model achieved better overall results than two experienced annotators who manually labeled 100 clinical documents.
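As a loose illustration of the de-identification step the abstract describes (this is not the authors' code, and the model checkpoint is hypothetical), the output of a fine-tuned token-classification model can be turned into a redacted document by replacing each detected span with a category placeholder. The entity dictionaries below mirror the shape produced by Hugging Face's `transformers.pipeline("ner", model=..., aggregation_strategy="simple")`, one common way to run an ELECTRA-style NER model:

```python
def redact(text: str, entities: list[dict]) -> str:
    """Replace each detected sensitive span with a [CATEGORY] placeholder.

    Spans are applied right-to-left so that earlier character offsets stay
    valid while later parts of the string are rewritten.
    """
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        text = text[:ent["start"]] + f"[{ent['entity_group']}]" + text[ent["end"]:]
    return text


# Hand-written example spans; in practice a fine-tuned NER pipeline
# (e.g., a gELECTRA checkpoint) would produce these offsets.
note = "Patient Max Mustermann wurde am 01.02.2020 aufgenommen."
spans = [
    {"entity_group": "PER", "start": 8, "end": 22},
    {"entity_group": "DATE", "start": 32, "end": 42},
]
print(redact(note, spans))  # Patient [PER] wurde am [DATE] aufgenommen.
```

Applying spans from right to left is a small but important design choice: replacing a span changes the string length, so processing in descending start order keeps the remaining (earlier) offsets valid without any bookkeeping.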

Source journal: Applied Clinical Informatics (Medical Informatics). CiteScore: 4.60 · Self-citation rate: 24.10% · Articles per year: 132
Journal description: ACI is the third Schattauer journal dealing with biomedical and health informatics. It complements our other journals, Methods of Information in Medicine and the Yearbook of Medical Informatics. With the Yearbook of Medical Informatics being the "Milestone" or state-of-the-art journal and Methods of Information in Medicine being the "Science and Research" journal of IMIA, ACI intends to be the "Practical" journal of IMIA.
Latest articles from this journal:
- Sharing a Hybrid EHR + FHIR CDS Tool Across Health Systems: Automating Smoking Cessation for Pediatric Caregivers.
- Application of an Externally Developed Algorithm to Identify Research Cases and Controls from Electronic Health Record Data: Failures and Successes.
- Association of an HIV-Prediction Model with Uptake of Pre-Exposure Prophylaxis (PrEP).
- Pediatric Predictive Artificial Intelligence Implemented in Clinical Practice from 2010-2021: A Systematic Review.
- Exploring Mixed Reality for Patient Education in Cerebral Angiograms: A Pilot Study.