DenseSeg: joint learning for semantic segmentation and landmark detection using dense image-to-shape representation
Ron Keuth, Lasse Hansen, Maren Balks, Ronja Jäger, Anne-Nele Schröder, Ludger Tüshaus, Mattias Heinrich
International Journal of Computer Assisted Radiology and Surgery, pp. 441-451. Published 2025-03-01 (Epub 2025-01-23).
DOI: https://doi.org/10.1007/s11548-024-03315-8
Abstract
Purpose: Semantic segmentation and landmark detection are fundamental tasks of medical image processing, facilitating further analysis of anatomical objects. Although deep learning-based pixel-wise classification has set a new state of the art for segmentation, it falls short in landmark detection, which is a strength of shape-based approaches.
Methods: We propose a dense image-to-shape representation that enables joint learning of landmarks and semantic segmentation with a fully convolutional architecture. Because the representation encodes anatomical correspondences, our method naturally allows the extraction of arbitrary landmarks. We benchmark our method against three baselines: the state of the art for semantic segmentation (nnUNet), a shape-based approach employing geometric deep learning, and a convolutional neural network-based method for landmark detection.
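The abstract does not spell out how a landmark is read off the dense representation. One plausible reading, offered only as a minimal sketch and not as the authors' implementation, is that the network predicts a coordinate in a canonical shape space for every pixel, and a landmark is found by looking up the foreground pixel whose predicted coordinate is closest to the landmark's canonical position. The function name `locate_landmark`, the `(u, v)` coordinate convention, and the toy data are all assumptions made for illustration.

```python
from math import hypot


def locate_landmark(coord_map, mask, target_uv):
    """Return the (row, col) of the foreground pixel whose predicted
    canonical coordinate is closest to target_uv, or None if the
    mask contains no foreground pixel.

    coord_map: 2-D grid of hypothetical per-pixel (u, v) predictions.
    mask:      2-D grid of booleans marking the segmented structure.
    """
    best_pixel, best_dist = None, float("inf")
    for r, row in enumerate(coord_map):
        for c, (u, v) in enumerate(row):
            if not mask[r][c]:
                continue  # ignore pixels outside the segmented structure
            dist = hypot(u - target_uv[0], v - target_uv[1])
            if dist < best_dist:
                best_pixel, best_dist = (r, c), dist
    return best_pixel


# Toy 3x3 "prediction": each pixel stores an assumed canonical (u, v).
coord_map = [
    [(0.0, 0.0), (0.0, 0.5), (0.0, 1.0)],
    [(0.5, 0.0), (0.5, 0.5), (0.5, 1.0)],
    [(1.0, 0.0), (1.0, 0.5), (1.0, 1.0)],
]
mask = [
    [False, True, True],
    [True, True, True],
    [True, True, False],
]

# A landmark is defined once in canonical space and looked up per image.
landmark_pixel = locate_landmark(coord_map, mask, target_uv=(0.9, 0.4))
```

Under this reading, adding a new landmark only means choosing another `target_uv`; nothing about the trained network changes, which is consistent with the claim that new landmarks need no retraining.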
Results: We evaluate our method on two medical datasets: a common benchmark featuring the lungs, heart, and clavicle in thorax X-rays, and a second dataset with 17 different bones of the paediatric wrist. While our method is on par with the landmark detection baseline in the thorax setting (error in mm of 2.6 ± 0.9 vs. 2.7 ± 0.9), it substantially surpasses the baseline in the more complex wrist setting (1.1 ± 0.6 vs. 1.9 ± 0.5).
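The abstract reports landmark error in millimetres as mean ± spread but does not define the computation. A common convention, assumed here purely for illustration, is the Euclidean distance between predicted and reference landmark positions, scaled by an isotropic pixel spacing and summarized by mean and standard deviation. The helper names, the spacing of 0.5 mm, and the coordinates below are hypothetical and are not values from the paper.

```python
from math import hypot
from statistics import mean, pstdev


def landmark_errors_mm(predicted, reference, spacing_mm):
    """Euclidean distance per landmark, converted from pixels to mm.

    Assumes isotropic pixel spacing; predicted and reference are
    equally long lists of (row, col) positions.
    """
    return [
        hypot(pr - rr, pc - rc) * spacing_mm
        for (pr, pc), (rr, rc) in zip(predicted, reference)
    ]


def summarize(errors):
    """Return (mean, population standard deviation) of the errors."""
    return mean(errors), pstdev(errors)


# Made-up pixel coordinates for three landmarks.
predicted = [(10, 10), (20, 24), (33, 40)]
reference = [(10, 13), (20, 20), (30, 44)]

errors = landmark_errors_mm(predicted, reference, spacing_mm=0.5)
mean_error, std_error = summarize(errors)
print(f"{mean_error:.1f} ± {std_error:.1f} mm")
```

Whether the reported spread is taken over landmarks, over images, or over cross-validation folds is not stated in the abstract, so the aggregation level in this sketch is also an assumption.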
Conclusion: We demonstrate that a dense geometric shape representation is beneficial for challenging landmark detection tasks and outperforms the previous state of the art based on heatmap regression. Moreover, it does not require explicit training on the landmarks themselves, so new landmarks can be added without retraining.
About the journal:
The International Journal for Computer Assisted Radiology and Surgery (IJCARS) is a peer-reviewed journal that provides a platform for closing the gap between medical and technical disciplines, and encourages interdisciplinary research and development activities in an international environment.