Comparison of active learning algorithms in classifying head computed tomography reports using bidirectional encoder representations from transformers.

IF 2.3 · Zone 3 (Medicine) · Q3 ENGINEERING, BIOMEDICAL · International Journal of Computer Assisted Radiology and Surgery · Pub Date: 2025-01-08 · DOI: 10.1007/s11548-024-03316-7

Tomohiro Wataya, Azusa Miura, Takahisa Sakisuka, Masahiro Fujiwara, Hisashi Tanaka, Yu Hiraoka, Junya Sato, Miyuki Tomiyama, Daiki Nishigaki, Kosuke Kita, Yuki Suzuki, Shoji Kido, Noriyuki Tomiyama

Citations: 0

Abstract

Purpose: Systems equipped with natural language processing (NLP) can reduce the number of radiological findings missed by physicians, but annotation costs are a burden in their development. This study aimed to compare the effects of active learning (AL) algorithms in NLP for estimating the significance of head computed tomography (CT) reports using bidirectional encoder representations from transformers (BERT).

Methods: A total of 3728 head CT reports annotated with five categories of importance were used, and UTH-BERT was adopted as the pre-trained BERT model. We assumed that 64% (2385 reports) of the data were initially in the unlabeled data pool (UDP), while the labeled data set (LD) used to train the model was empty. Twenty-five reports were repeatedly selected from the UDP and added to the LD, based on seven metrics: random sampling (RS: control), four uncertainty sampling (US) methods (least confidence (LC), margin sampling (MS), ratio of confidence (RC), and entropy sampling (ES)), and two distance-based sampling (DS) methods (cosine distance (CD) and Euclidean distance (ED)). The change in model accuracy as the LD grew was evaluated using the test dataset.
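The four uncertainty sampling metrics named above can be sketched in a few lines. The abstract does not give formulas, so the block below uses the common textbook definitions, and it assumes that each round takes the 25 highest-scoring reports; both are assumptions rather than the authors' confirmed procedure. The names `uncertainty_score` and `select_batch` are hypothetical, and the DS methods are omitted because the abstract does not state which reference point the distances are measured from.

```python
import math
import random
from typing import List, Optional, Sequence


def uncertainty_score(probs: Sequence[float], method: str) -> float:
    """Score one report's predicted class probabilities; higher = more uncertain.

    Textbook definitions (assumed, not taken from the paper):
      LC: 1 - p1          (least confidence)
      MS: p2 - p1         (negated margin between the top two classes)
      RC: p2 / p1         (ratio of the top two classes; near 1 = uncertain)
      ES: -sum(p log p)   (Shannon entropy)
    """
    top = sorted(probs, reverse=True)
    p1, p2 = top[0], top[1]
    if method == "LC":
        return 1.0 - p1
    if method == "MS":
        return p2 - p1
    if method == "RC":
        return p2 / p1
    if method == "ES":
        # Skip zero probabilities, whose contribution to the entropy is 0.
        return -sum(p * math.log(p) for p in probs if p > 0.0)
    raise ValueError(f"unknown method: {method}")


def select_batch(
    pool_probs: Sequence[Sequence[float]],
    method: str,
    batch_size: int = 25,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Return indices of the pool items to label next."""
    indices = list(range(len(pool_probs)))
    k = min(batch_size, len(indices))
    if method == "RS":
        # Random sampling: the control condition.
        rng = rng if rng is not None else random.Random()
        return rng.sample(indices, k)
    scores = [uncertainty_score(p, method) for p in pool_probs]
    # Most uncertain first; the sort is stable, so ties keep pool order.
    indices.sort(key=lambda i: scores[i], reverse=True)
    return indices[:k]
```

In an active learning loop, `pool_probs` would be the current model's softmax outputs over the UDP; the selected reports are labeled, moved to the LD, and the model is fine-tuned again before the next round.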

Results: The accuracy of the models with US was significantly higher than that with RS when the LD contained fewer than 1800 reports, whereas the accuracy with the DS methods was significantly lower than that with RS. Among the US methods, MS and RC performed better than the others. With the US methods, the required labeled data decreased by 15.4-40.5%, with RC being the most efficient. In addition, with the US methods, data for minor categories tended to be added to the LD earlier than with RS and DS.
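One plausible way to arrive at a "required labeled data decreased by X%" figure is to find, for each method, the smallest LD size at which test accuracy first reaches a target, and compare that with RS. The abstract does not define the measure, so this is only a sketch under that assumption; the function names are hypothetical, and the numbers in the usage lines are invented toy values, not results from the study.

```python
from typing import Optional, Sequence


def labels_to_reach(
    sizes: Sequence[int], accuracies: Sequence[float], target: float
) -> Optional[int]:
    """Smallest labeled-set size whose accuracy reaches `target`, else None."""
    for n, acc in zip(sizes, accuracies):
        if acc >= target:
            return n
    return None


def relative_reduction(n_baseline: int, n_method: int) -> float:
    """Percentage of labels saved relative to the baseline (e.g. RS)."""
    return 100.0 * (n_baseline - n_method) / n_baseline


# Toy learning curves (invented for illustration only).
sizes = [25, 50, 75, 100]
rs_acc = [0.50, 0.60, 0.70, 0.80]
al_acc = [0.55, 0.70, 0.80, 0.85]

n_rs = labels_to_reach(sizes, rs_acc, target=0.80)
n_al = labels_to_reach(sizes, al_acc, target=0.80)
saving = relative_reduction(n_rs, n_al)
```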

Conclusions: In classifying the importance of head CT reports, US methods, especially RC and MS, can lead to effective fine-tuning of BERT models and reduce category imbalance. AL can contribute to other studies on larger datasets by making annotation more efficient.

Source journal

International Journal of Computer Assisted Radiology and Surgery (ENGINEERING, BIOMEDICAL; RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING)

CiteScore: 5.90

Self-citation rate: 6.70%

Articles published: 243

Review time: 6-12 weeks

Journal description: The International Journal for Computer Assisted Radiology and Surgery (IJCARS) is a peer-reviewed journal that provides a platform for closing the gap between medical and technical disciplines, and encourages interdisciplinary research and development activities in an international environment.