Equivalence of pathologists' and rule-based parser's annotations of Dutch pathology reports

Gerard TN. Burger, Ameen Abu-Hanna, Nicolette F. de Keizer, Huibert Burger, Ronald Cornet
Intelligence-based medicine, volume 7, Article 100083. Published 2023-01-01.
DOI: 10.1016/j.ibmed.2022.100083
https://www.sciencedirect.com/science/article/pii/S2666521222000369

Abstract

Introduction

In the Netherlands, pathology reports are annotated using a nationwide pathology network (PALGA) thesaurus. Annotations must address topography, procedure, and diagnosis.

The Pathology Report Annotation Module (PRAM) can be used to annotate the report conclusion with PALGA-compliant code series. The equivalence of these generated annotations to manual annotations is unknown. We assess the equivalence of annotations by authoring pathologists, pathologists participating in this study, and PRAM.

Methods

PRAM and a pathologist panel created new annotations for one thousand histopathology reports. We calculated the dissimilarity of annotations using a semantic distance measure, Minimal Transition Cost (MTC). In the absence of a gold standard, we compared pairs of dissimilarity scores that shared one common annotator. The resulting comparisons yielded a measure of the coding dissimilarity between PRAM, the pathologist panel and the authoring pathologist. To compare the comprehensiveness of the coding methods, we assessed the number and length of the annotations.
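The idea of comparing dissimilarity scores that share one common annotator can be sketched as follows. This is only an illustration: the abstract does not define MTC, so the sketch substitutes a plain Jaccard distance over sets of codes, and the function names and the example codes are hypothetical.

```python
def jaccard_distance(codes_a, codes_b):
    """Stand-in dissimilarity between two code series (NOT the MTC measure).

    Returns 0.0 for identical code sets and 1.0 for disjoint ones.
    """
    a, b = set(codes_a), set(codes_b)
    if not a and not b:
        return 0.0  # two empty annotations are treated as identical
    return 1.0 - len(a & b) / len(a | b)


def paired_differences(common, first, second):
    """Per-report difference d(common, first) - d(common, second).

    Each argument is a list with one code series per report, in the same
    report order. The 'common' annotator appears in both distances, so the
    differences show whether 'first' or 'second' codes closer to 'common'.
    """
    return [
        jaccard_distance(c, f) - jaccard_distance(c, s)
        for c, f, s in zip(common, first, second)
    ]


# Hypothetical code series for two reports (labels are illustrative only).
panel = [["colon", "biopsy", "adenoma"], ["skin", "excision", "naevus"]]
pram = [["colon", "biopsy", "adenoma"], ["skin", "excision", "melanoma"]]
author = [["colon", "biopsy", "carcinoma"], ["skin", "excision", "naevus"]]

diffs = paired_differences(panel, pram, author)
```

A difference near zero for a report means that, for that report, the two annotators are about equally far from the common annotator.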

Results

Eight of the twelve comparisons of dissimilarity scores were significantly equivalent. Non-equivalent score pairs involved dissimilarity between the code series by the original pathologist and the panel pathologists.
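A statement that two dissimilarity scores are "significantly equivalent" is typically the outcome of an equivalence test. The sketch below shows one common form, two one-sided tests (TOST) on paired differences with a normal approximation. The abstract does not say which test, equivalence margin or significance level the authors used, so the procedure, the `margin` and `alpha` values and the function name are all assumptions.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def tost_paired(differences, margin, alpha=0.05):
    """Two one-sided tests for equivalence of paired scores.

    'differences' holds per-report differences between two dissimilarity
    scores. Equivalence is claimed when the mean difference is shown to lie
    inside (-margin, +margin). Uses a normal approximation, which is only
    reasonable for large samples. 'margin' and 'alpha' are assumed values.
    """
    n = len(differences)
    m = mean(differences)
    se = stdev(differences) / sqrt(n)
    if se == 0:
        # No variation at all: decide directly on the observed mean.
        equivalent = abs(m) < margin
        return (0.0 if equivalent else 1.0), equivalent
    norm = NormalDist()
    # H0 (lower): mean <= -margin; rejected when the statistic is large.
    p_lower = 1.0 - norm.cdf((m + margin) / se)
    # H0 (upper): mean >= +margin; rejected when the statistic is small.
    p_upper = norm.cdf((m - margin) / se)
    p_value = max(p_lower, p_upper)
    return p_value, p_value < alpha


# Hypothetical paired differences, centred on zero.
example = [0.01, -0.02, 0.0, 0.015, -0.005] * 40
p_value, equivalent = tost_paired(example, margin=0.05)
```

Both one-sided null hypotheses must be rejected for equivalence to be claimed, which is why the larger of the two p-values is reported.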

Coding dissimilarity was lowest for procedures and highest for diagnoses: MTC overall = 0.30, topographies = 0.22, procedures = 0.13, diagnoses = 0.33.

Both the number and the length of annotations per report increased with report conclusion length, particularly in PRAM-annotated conclusions. Conclusion length ranged from 2 to 373 words. The number of annotations ranged from 1 to 10 for pathologists and from 1 to 19 for PRAM; annotation length ranged from 3 to 43 codes for pathologists and from 4 to 123 codes for PRAM.

Conclusions

We measured annotation similarity among PRAM, authoring pathologists and panel pathologists. Annotations by PRAM and by the panel pathologists were equivalent, as were, to a lesser extent, those by the authoring pathologist. Therefore, the use of annotations by PRAM in a practical setting is justified. PRAM annotations are equivalent to study-setting annotations, and more comprehensive than routine coding. Further research on annotation quality is needed.
