Automatic Detection of Distant Metastasis Mentions in Radiology Reports in Spanish.

JCO Clinical Cancer Informatics (IF 3.3, Q2, Oncology). Pub Date: 2024-01-01. DOI: 10.1200/CCI.23.00130
Ricardo Ahumada, Jocelyn Dunstan, Matías Rojas, Sergio Peñafiel, Inti Paredes, Pablo Báez
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10793975/pdf/

Abstract

Purpose: A critical task in oncology is extracting information related to cancer metastasis from electronic health records. Metastasis-related information is crucial for planning treatment, evaluating patient prognoses, and cancer research. However, the unstructured way in which findings of distant metastasis are often written in radiology reports makes it difficult to extract information automatically. The main aim of this study was to extract distant metastasis findings from free-text imaging and nuclear medicine reports to classify the patient status according to the presence or absence of distant metastasis.

Materials and methods: We created a corpus annotated for distant metastasis from positron emission tomography-computed tomography (PET-CT) and computed tomography (CT) reports of patients with prostate, colorectal, and breast cancers. Entities were labeled M1 or M0 according to whether the description affirmed or negated metastasis. To identify entities, we used a named entity recognition model based on a bidirectional long short-term memory (BiLSTM) network with conditional random fields (CRF). The detected mentions were then used to classify whole reports as M1 or M0.
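
The abstract does not state the rule used to turn mention-level labels into a report-level label. As a minimal sketch, assuming that a single affirmative (M1) mention is enough to mark the whole report M1, and that a report with no mentions defaults to M0, the aggregation step could look like the following. The function name `classify_report` and both assumptions are illustrative and are not taken from the paper.

```python
from typing import Iterable


def classify_report(mention_labels: Iterable[str]) -> str:
    """Aggregate mention-level labels into one report-level label.

    Assumed rule (hypothetical, not confirmed by the abstract):
    any 'M1' mention makes the report 'M1'; otherwise the report is 'M0',
    including the case where no mention was detected at all.
    """
    labels = list(mention_labels)
    return "M1" if "M1" in labels else "M0"


if __name__ == "__main__":
    # Labels as an NER model might emit them for three example reports.
    print(classify_report(["M0", "M1"]))
    print(classify_report(["M0", "M0"]))
    print(classify_report([]))
```

Other rules are possible, for example a majority vote over mentions; the choice affects how a report containing both negated and affirmed findings is handled.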

Results: The model detected distant metastasis mentions with a weighted average F1 score of 0.84. Whole reports were classified with an F1 score of 0.92 for M0 documents and 0.90 for M1 documents.
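
For readers unfamiliar with the metric, a weighted average F1 score is the per-class F1 averaged with each class weighted by its support (the number of true instances of that class). The sketch below shows the arithmetic under that standard definition. The counts in the usage example are hypothetical and are not the study's data, and the helper names `f1` and `weighted_f1` are illustrative.

```python
from typing import Dict, Tuple


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def weighted_f1(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """Support-weighted mean of per-class F1.

    counts maps each label to (true positives, false positives, false negatives).
    The support of a class is TP + FN, i.e. its number of true instances.
    """
    total_support = sum(tp + fn for tp, _fp, fn in counts.values())
    if total_support == 0:
        return 0.0
    weighted_sum = sum(f1(tp, fp, fn) * (tp + fn) for tp, fp, fn in counts.values())
    return weighted_sum / total_support


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not from the paper).
    example = {"M1": (8, 2, 2), "M0": (9, 1, 1)}
    print(round(weighted_f1(example), 2))
```

With these made-up counts the per-class F1 values are 0.80 and 0.90 with equal support, so the weighted average is their plain mean.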

Conclusion: These results show that the model is useful for detecting distant metastasis findings in three different types of cancer and for the subsequent classification of reports. The contribution of this study is the generation of structured distant metastasis information from free-text imaging reports in Spanish. In addition, the manually annotated corpus, annotation guidelines, and code are freely released to the research community.
