
arXiv - EE - Image and Video Processing: Latest Publications

From FDG to PSMA: A Hitchhiker's Guide to Multitracer, Multicenter Lesion Segmentation in PET/CT Imaging
Pub Date : 2024-09-14 DOI: arxiv-2409.09478
Maximilian Rokuss, Balint Kovacs, Yannick Kirchhoff, Shuhan Xiao, Constantin Ulrich, Klaus H. Maier-Hein, Fabian Isensee
Automated lesion segmentation in PET/CT scans is crucial for improving clinical workflows and advancing cancer diagnostics. However, the task is challenging due to physiological variability, different tracers used in PET imaging, and diverse imaging protocols across medical centers. To address this, the autoPET series was created to challenge researchers to develop algorithms that generalize across diverse PET/CT environments. This paper presents our solution for the autoPET III challenge, targeting multitracer, multicenter generalization using the nnU-Net framework with the ResEncL architecture. Key techniques include misalignment data augmentation and multi-modal pretraining across CT, MR, and PET datasets to provide an initial anatomical understanding. We incorporate organ supervision as a multitask approach, enabling the model to distinguish between physiological uptake and tracer-specific patterns, which is particularly beneficial in cases where no lesions are present. Compared to the default nnU-Net, which achieved a Dice score of 57.61, or the larger ResEncL (65.31), our model significantly improved performance with a Dice score of 68.40, alongside a reduction in false positive (FPvol: 7.82) and false negative (FNvol: 10.35) volumes. These results underscore the effectiveness of combining advanced network design, augmentation, pretraining, and multitask learning for PET/CT lesion segmentation. Code is publicly available at https://github.com/MIC-DKFZ/autopet-3-submission.
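The organ-supervision multitask idea described above can be sketched as a segmentation network with two output heads whose losses are summed. The sketch below is a minimal, hypothetical PyTorch illustration, not the authors' nnU-Net-based code; channel counts, class counts, and the loss weighting are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoHeadSegNet(nn.Module):
    """Toy backbone with a lesion head and an organ head (multitask supervision)."""
    def __init__(self, in_ch=2, organ_classes=128):   # e.g. PET + CT channels; 127 organs + background
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv3d(in_ch, 16, 3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 16, 3, padding=1), nn.ReLU(),
        )
        self.lesion_head = nn.Conv3d(16, 2, 1)             # background / lesion
        self.organ_head = nn.Conv3d(16, organ_classes, 1)  # physiological-uptake organs

    def forward(self, x):
        feats = self.backbone(x)
        return self.lesion_head(feats), self.organ_head(feats)

def multitask_loss(lesion_logits, organ_logits, lesion_gt, organ_gt, organ_weight=1.0):
    # The organ term teaches the model which high-uptake regions are physiological,
    # which helps suppress false positives when no lesion is present.
    return (F.cross_entropy(lesion_logits, lesion_gt)
            + organ_weight * F.cross_entropy(organ_logits, organ_gt))

net = TwoHeadSegNet()
x = torch.randn(1, 2, 32, 32, 32)                      # (batch, PET+CT, D, H, W)
lesion_gt = torch.randint(0, 2, (1, 32, 32, 32))
organ_gt = torch.randint(0, 128, (1, 32, 32, 32))
lesion_logits, organ_logits = net(x)
print(multitask_loss(lesion_logits, organ_logits, lesion_gt, organ_gt))
```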
Citations: 0
Estimating Neural Orientation Distribution Fields on High Resolution Diffusion MRI Scans
Pub Date : 2024-09-14 DOI: arxiv-2409.09387
Mohammed Munzer Dwedari, William Consagra, Philip Müller, Özgün Turgut, Daniel Rueckert, Yogesh Rathi
The Orientation Distribution Function (ODF) characterizes key brain microstructural properties and plays an important role in understanding brain structural connectivity. Recent works introduced Implicit Neural Representation (INR) based approaches to form a spatially aware continuous estimate of the ODF field and demonstrated promising results in key tasks of interest when compared to conventional discrete approaches. However, traditional INR methods face difficulties when scaling to large-scale images, such as modern ultra-high-resolution MRI scans, posing challenges in learning fine structures as well as inefficiencies in training and inference speed. In this work, we propose HashEnc, a grid-hash-encoding-based estimation of the ODF field, and demonstrate its effectiveness in retaining structural and textural features. We show that HashEnc achieves a 10% enhancement in image quality while requiring 3x less computational resources than current methods. Our code can be found at https://github.com/MunzerDw/NODF-HashEnc.
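A rough sketch of grid-hash encoding, in the spirit of (but not identical to) HashEnc: integer grid corners are hashed into a learnable feature table, the corner features are trilinearly interpolated, and a small MLP maps the encoded feature to ODF coefficients. Table size, feature dimension, hash primes, and the coefficient count are assumptions.

```python
import torch
import torch.nn as nn

class HashGridEncoder(nn.Module):
    """Single-level hash-grid encoding: 3D coordinates -> interpolated learnable features."""
    def __init__(self, table_size=2**16, feat_dim=4, resolution=64):
        super().__init__()
        self.table = nn.Parameter(1e-4 * torch.randn(table_size, feat_dim))
        self.table_size = table_size
        self.resolution = resolution

    def _hash(self, idx):
        # Spatial hash of integer corner indices (Instant-NGP-style primes), modulo the table size.
        return ((idx[..., 0] * 1) ^ (idx[..., 1] * 2654435761) ^ (idx[..., 2] * 805459861)) % self.table_size

    def forward(self, x):                        # x: (N, 3) coordinates in [0, 1]
        g = x * (self.resolution - 1)
        g0 = g.floor().long()                    # lower corner of the enclosing voxel
        w = (g - g0.float()).unsqueeze(-1)       # trilinear weights, (N, 3, 1)
        out = 0.0
        for dx in (0, 1):
            for dy in (0, 1):
                for dz in (0, 1):
                    corner = g0 + torch.tensor([dx, dy, dz])
                    feat = self.table[self._hash(corner)]
                    wx = w[:, 0] if dx else 1 - w[:, 0]
                    wy = w[:, 1] if dy else 1 - w[:, 1]
                    wz = w[:, 2] if dz else 1 - w[:, 2]
                    out = out + wx * wy * wz * feat
        return out                               # (N, feat_dim)

enc = HashGridEncoder()
odf_head = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 45))  # e.g. 45 SH coefficients
coords = torch.rand(10, 3)
print(odf_head(enc(coords)).shape)               # torch.Size([10, 45])
```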
Citations: 0
MotionTTT: 2D Test-Time-Training Motion Estimation for 3D Motion Corrected MRI
Pub Date : 2024-09-14 DOI: arxiv-2409.09370
Tobit Klug, Kun Wang, Stefan Ruschke, Reinhard Heckel
A major challenge of the long measurement times in magnetic resonance imaging (MRI), an important medical imaging technology, is that patients may move during data acquisition. This leads to severe motion artifacts in the reconstructed images and volumes. In this paper, we propose a deep learning-based test-time-training method for accurate motion estimation. The key idea is that a neural network trained for motion-free reconstruction has a small loss if there is no motion; thus, optimizing over motion parameters passed through the reconstruction network enables accurate estimation of motion. The estimated motion parameters make it possible to correct for the motion and to reconstruct accurate motion-corrected images. Our method uses 2D reconstruction networks to estimate rigid motion in 3D, and constitutes the first deep learning based method for 3D rigid motion estimation towards 3D-motion-corrected MRI. We show that our method can provably reconstruct motion parameters for a simple signal and neural network model. We demonstrate the effectiveness of our method for both retrospectively simulated motion and prospectively collected real motion-corrupted data.
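The test-time-training mechanics — motion parameters are the only variables optimized, with gradients flowing through a differentiable motion model into a frozen reconstruction network's loss — can be sketched as follows. This toy example uses 2D translations, random data, and a randomly initialized stand-in network, so it only illustrates the optimization structure, not the actual method or its accuracy.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Frozen stand-in for a network pretrained on motion-free reconstruction.
recon_net = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(8, 1, 3, padding=1))
for p in recon_net.parameters():
    p.requires_grad_(False)

def translate(img, shift):
    """Differentiable 2D translation of img (1, 1, H, W) by shift = (dy, dx) pixels."""
    _, _, H, W = img.shape
    one, zero = torch.ones(1), torch.zeros(1)
    tx = (-2.0 * shift[1] / W).reshape(1)        # pixel shift in normalized [-1, 1] coordinates
    ty = (-2.0 * shift[0] / H).reshape(1)
    theta = torch.stack([torch.cat([one, zero, tx]),
                         torch.cat([zero, one, ty])]).unsqueeze(0)
    grid = F.affine_grid(theta, img.shape, align_corners=False)
    return F.grid_sample(img, grid, align_corners=False)

y = torch.randn(1, 1, 64, 64)                    # toy "motion-corrupted" observation
motion = torch.zeros(2, requires_grad=True)      # the only test-time trainable parameters
opt = torch.optim.Adam([motion], lr=0.1)

for _ in range(50):
    corrected = translate(y, motion)             # undo the hypothesized motion
    loss = F.mse_loss(recon_net(corrected), corrected)   # frozen network's reconstruction loss
    opt.zero_grad()
    loss.backward()
    opt.step()

print("estimated shift:", motion.detach())
```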
Citations: 0
Integrating Deep Unfolding with Direct Diffusion Bridges for Computed Tomography Reconstruction
Pub Date : 2024-09-14 DOI: arxiv-2409.09477
Herman Verinaz-Jadan, Su Yan
Computed Tomography (CT) is widely used in healthcare for detailed imaging. However, low-dose CT, despite reducing radiation exposure, often results in images with compromised quality due to increased noise. Traditional methods, including preprocessing, post-processing, and model-based approaches that leverage physical principles, are employed to improve the quality of image reconstructions from noisy projections or sinograms. Recently, deep learning has significantly advanced the field, with diffusion models outperforming both traditional methods and other deep learning approaches. These models effectively merge deep learning with physics, serving as robust priors for the inverse problem in CT. However, they typically require prolonged computation times during sampling. This paper introduces the first approach to merge deep unfolding with Direct Diffusion Bridges (DDBs) for CT, integrating the physics into the network architecture and facilitating the transition from degraded to clean images by bypassing excessively noisy intermediate stages commonly encountered in diffusion models. Moreover, this approach includes a tailored training procedure that eliminates errors typically accumulated during sampling. The proposed approach requires fewer sampling steps and demonstrates improved fidelity metrics, outperforming many existing state-of-the-art techniques.
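A generic deep-unfolding block (not the paper's Direct Diffusion Bridge formulation) alternates a physics-based data-consistency gradient step with a small learned refinement network for a fixed number of unrolled iterations. The operator, step sizes, and network below are illustrative placeholders.

```python
import torch
import torch.nn as nn

class UnrolledRecon(nn.Module):
    """Generic deep unfolding: alternate a data-consistency gradient step with a learned refiner."""
    def __init__(self, forward_op, adjoint_op, n_iters=5):
        super().__init__()
        self.A, self.At = forward_op, adjoint_op
        self.step = nn.Parameter(torch.full((n_iters,), 0.1))      # learned step sizes
        self.refiners = nn.ModuleList([
            nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(16, 1, 3, padding=1))
            for _ in range(n_iters)])

    def forward(self, y, x0):
        x = x0
        for k, refiner in enumerate(self.refiners):
            x = x - self.step[k] * self.At(self.A(x) - y)          # data-consistency step
            x = x + refiner(x)                                     # learned residual refinement
        return x

# Toy usage: a symmetric blur stands in for the CT projector and its adjoint.
blur = nn.Conv2d(1, 1, 5, padding=2, bias=False)
blur.weight.data.fill_(1.0 / 25)
net = UnrolledRecon(forward_op=blur, adjoint_op=blur, n_iters=5)
y = torch.randn(1, 1, 64, 64)
print(net(y, torch.zeros_like(y)).shape)                           # torch.Size([1, 1, 64, 64])
```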
Citations: 0
Adversarial Deep-Unfolding Network for MA-XRF Super-Resolution on Old Master Paintings Using Minimal Training Data
Pub Date : 2024-09-14 DOI: arxiv-2409.09483
Herman Verinaz-Jadan, Su Yan, Catherine Higgitt, Pier Luigi Dragotti
High-quality element distribution maps enable precise analysis of the material composition and condition of Old Master paintings. These maps are typically produced from data acquired through Macro X-ray fluorescence (MA-XRF) scanning, a non-invasive technique that collects spectral information. However, MA-XRF is often limited by a trade-off between acquisition time and resolution. Achieving higher resolution requires longer scanning times, which can be impractical for detailed analysis of large artworks. Super-resolution MA-XRF provides an alternative solution by enhancing the quality of MA-XRF scans while reducing the need for extended scanning sessions. This paper introduces a tailored super-resolution approach to improve MA-XRF analysis of Old Master paintings. Our method proposes a novel adversarial neural network architecture for MA-XRF, inspired by the Learned Iterative Shrinkage-Thresholding Algorithm. It is specifically designed to work in an unsupervised manner, making efficient use of the limited available data. This design avoids the need for extensive datasets or pre-trained networks, allowing it to be trained using just a single high-resolution RGB image alongside low-resolution MA-XRF data. Numerical results demonstrate that our method outperforms existing state-of-the-art super-resolution techniques for MA-XRF scans of Old Master paintings.
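The Learned Iterative Shrinkage-Thresholding Algorithm (LISTA) that inspires this architecture unrolls ISTA with learned matrices, z_{k+1} = soft(W_e y + S z_k, theta_k). A minimal dense-matrix sketch follows; dimensions are made up, and the paper's actual network is convolutional, adversarially trained, and tailored to MA-XRF.

```python
import torch
import torch.nn as nn

def soft_threshold(x, lam):
    """Soft-thresholding (proximal operator of the L1 norm)."""
    return torch.sign(x) * torch.clamp(x.abs() - lam, min=0.0)

class LISTA(nn.Module):
    """Unrolled ISTA with learned weights: z_{k+1} = soft(W_e y + S z_k, theta_k)."""
    def __init__(self, meas_dim=64, code_dim=256, n_iters=8):
        super().__init__()
        self.We = nn.Linear(meas_dim, code_dim, bias=False)
        self.S = nn.Linear(code_dim, code_dim, bias=False)
        self.theta = nn.Parameter(0.1 * torch.ones(n_iters))
        self.n_iters = n_iters

    def forward(self, y):
        b = self.We(y)
        z = soft_threshold(b, self.theta[0])
        for k in range(1, self.n_iters):
            z = soft_threshold(b + self.S(z), self.theta[k])
        return z

model = LISTA()
y = torch.randn(4, 64)          # e.g. low-resolution measurements
codes = model(y)                # sparse codes; a decoder would map these to the HR estimate
print(codes.shape)              # torch.Size([4, 256])
```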
Citations: 0
Real-Time Stochastic Terrain Mapping and Processing for Autonomous Safe Landing
Pub Date : 2024-09-14 DOI: arxiv-2409.09309
Kento Tomita, Koki Ho
Onboard terrain sensing and mapping for safe planetary landings often suffer from missed hazardous features, e.g., small rocks, due to the large observational range and the limited resolution of the obtained terrain data. To this end, this paper develops a novel real-time stochastic terrain mapping algorithm that accounts for topographic uncertainty between the sampled points, or the uncertainty due to the sparse 3D terrain measurements. We introduce a Gaussian digital elevation map that is efficiently constructed using the combination of Delaunay triangulation and local Gaussian process regression. The geometric investigation of the lander-terrain interaction is exploited to efficiently evaluate the marginally conservative local slope and roughness while avoiding the costly computation of the local plane. The conservativeness is proved in the paper. The developed real-time uncertainty quantification pipeline enables stochastic landing safety evaluation under challenging operational conditions, such as a large observational range or limited sensor capability, which is a critical stepping stone for the development of predictive guidance algorithms for safe autonomous planetary landing. Detailed reviews on background and related works are also presented.
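The two ingredients named above — a Delaunay triangulation over sparse terrain samples and local Gaussian-process regression giving an elevation mean and standard deviation per query — can be illustrated roughly as below. The kernel, its hyperparameters, and the k-nearest-neighbor rule are assumptions rather than the paper's algorithm.

```python
import numpy as np
from scipy.spatial import Delaunay
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Sparse terrain samples: (x, y) positions with measured elevations z.
rng = np.random.default_rng(0)
xy = rng.uniform(0.0, 10.0, size=(200, 2))
z = np.sin(xy[:, 0]) + 0.1 * rng.normal(size=200)

tri = Delaunay(xy)                       # triangulated irregular network over the samples

def local_gp_elevation(query, k=25):
    """Gaussian elevation estimate (mean, std) from the k nearest samples."""
    d = np.linalg.norm(xy - query, axis=1)
    idx = np.argsort(d)[:k]
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0) + WhiteKernel(0.01),
                                  normalize_y=True)
    gp.fit(xy[idx], z[idx])
    mean, std = gp.predict(query[None, :], return_std=True)
    return mean[0], std[0]

q = np.array([5.0, 5.0])
simplex = tri.find_simplex(q)            # which triangle of the TIN contains the query point
mean, std = local_gp_elevation(q)
print(f"triangle {simplex}, elevation {mean:.2f} +/- {std:.2f}")
```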
Citations: 0
MANGO: Disentangled Image Transformation Manifolds with Grouped Operators
Pub Date : 2024-09-14 DOI: arxiv-2409.09542
Brighton Ancelin, Yenho Chen, Peimeng Guan, Chiraag Kaushik, Belen Martin-Urcelay, Alex Saad-Falcon, Nakul Singh
Learning semantically meaningful image transformations (i.e., rotation, thickness, blur) directly from examples can be a challenging task. Recently, the Manifold Autoencoder (MAE) proposed using a set of Lie group operators to learn image transformations directly from examples. However, this approach has limitations, as the learned operators are not guaranteed to be disentangled and the training routine is prohibitively expensive when scaling up the model. To address these limitations, we propose MANGO (transformation Manifolds with Grouped Operators) for learning disentangled operators that describe image transformations in distinct latent subspaces. Moreover, our approach allows practitioners to define which transformations they aim to model, thus improving the semantic meaning of the learned operators. Through our experiments, we demonstrate that MANGO enables composition of image transformations and introduces a one-phase training routine that leads to a 100x speedup over prior works.
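The Lie-group-operator idea underlying the MAE line of work applies a latent transformation as a matrix exponential of a weighted sum of learned generators, z' = exp(sum_i c_i A_i) z. A minimal sketch with made-up dimensions and random generators (not the MANGO training procedure) is given below.

```python
import torch
import torch.nn as nn

class LieGroupTransform(nn.Module):
    """Apply z' = expm(sum_i c_i * A_i) @ z with learnable generator matrices A_i."""
    def __init__(self, latent_dim=16, n_operators=3):
        super().__init__()
        self.generators = nn.Parameter(0.01 * torch.randn(n_operators, latent_dim, latent_dim))

    def forward(self, z, coeffs):
        # coeffs: (batch, n_operators) transformation magnitudes, e.g. rotation angle, blur amount.
        M = torch.einsum("bk,kij->bij", coeffs, self.generators)   # weighted sum of generators
        T = torch.matrix_exp(M)                                    # (batch, latent_dim, latent_dim)
        return torch.einsum("bij,bj->bi", T, z)

op = LieGroupTransform()
z = torch.randn(8, 16)          # latent codes from an encoder
c = torch.randn(8, 3)           # per-sample coefficients for three transformations
print(op(z, c).shape)           # torch.Size([8, 16])
```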
Citations: 0
Self-Prompting Polyp Segmentation in Colonoscopy using Hybrid Yolo-SAM 2 Model
Pub Date : 2024-09-14 DOI: arxiv-2409.09484
Mobina Mansoori, Sajjad Shahabodini, Jamshid Abouei, Konstantinos N. Plataniotis, Arash Mohammadi
Early diagnosis and treatment of polyps during colonoscopy are essential for reducing the incidence and mortality of Colorectal Cancer (CRC). However, the variability in polyp characteristics and the presence of artifacts in colonoscopy images and videos pose significant challenges for accurate and efficient polyp detection and segmentation. This paper presents a novel approach to polyp segmentation by integrating the Segment Anything Model (SAM 2) with the YOLOv8 model. Our method leverages YOLOv8's bounding box predictions to autonomously generate input prompts for SAM 2, thereby reducing the need for manual annotations. We conducted exhaustive tests on five benchmark colonoscopy image datasets and two colonoscopy video datasets, demonstrating that our method exceeds state-of-the-art models in both image and video segmentation tasks. Notably, our approach achieves high segmentation accuracy using only bounding box annotations, significantly reducing annotation time and effort. This advancement holds promise for enhancing the efficiency and scalability of polyp detection in clinical settings: https://github.com/sajjad-sh33/YOLO_SAM2.
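The self-prompting pipeline — detector boxes become segmentation prompts — might look roughly like the sketch below. The ultralytics YOLO and SAM 2 predictor interfaces shown are assumptions based on their public usage patterns, not the authors' code, and the checkpoint names are hypothetical.

```python
# Sketch of the YOLO -> SAM 2 self-prompting idea. Library interfaces here are
# assumptions (ultralytics YOLO and the SAM 2 image predictor); check the actual
# packages before running.
import numpy as np
from PIL import Image
from ultralytics import YOLO                                  # assumed API
from sam2.sam2_image_predictor import SAM2ImagePredictor      # assumed API

detector = YOLO("polyp_yolov8.pt")                            # hypothetical fine-tuned weights
predictor = SAM2ImagePredictor.from_pretrained("facebook/sam2-hiera-large")

image = np.array(Image.open("frame.png").convert("RGB"))
predictor.set_image(image)

masks = []
for box in detector(image)[0].boxes.xyxy.cpu().numpy():       # YOLO boxes as (x1, y1, x2, y2)
    # Each detection box is passed to SAM 2 as a prompt -- no manual annotation needed.
    mask, _, _ = predictor.predict(box=box, multimask_output=False)
    masks.append(mask[0])
```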
Citations: 0
MAISI: Medical AI for Synthetic Imaging
Pub Date : 2024-09-13 DOI: arxiv-2409.11169
Pengfei Guo, Can Zhao, Dong Yang, Ziyue Xu, Vishwesh Nath, Yucheng Tang, Benjamin Simon, Mason Belue, Stephanie Harmon, Baris Turkbey, Daguang Xu
Medical imaging analysis faces challenges such as data scarcity, high annotation costs, and privacy concerns. This paper introduces the Medical AI for Synthetic Imaging (MAISI), an innovative approach using the diffusion model to generate synthetic 3D computed tomography (CT) images to address those challenges. MAISI leverages the foundation volume compression network and the latent diffusion model to produce high-resolution CT images (up to a landmark volume dimension of 512 x 512 x 768) with flexible volume dimensions and voxel spacing. By incorporating ControlNet, MAISI can process organ segmentation, including 127 anatomical structures, as additional conditions and enables the generation of accurately annotated synthetic images that can be used for various downstream tasks. Our experiment results show that MAISI's capabilities in generating realistic, anatomically accurate images for diverse regions and conditions reveal its promising potential to mitigate challenges using synthetic data.
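As a generic illustration of mask-conditioned latent diffusion sampling (not MAISI's actual networks, schedule, or ControlNet mechanism), the loop below denoises a latent volume step by step while a conditioning tensor such as an organ mask is supplied to the noise predictor; a decoder would then map the final latent to a CT volume. All shapes and networks are placeholders.

```python
import torch
import torch.nn as nn

T = 50
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bar = torch.cumprod(alphas, dim=0)

# Placeholder noise predictor; the conditioning mask enters as an extra channel.
# (A real model would also be conditioned on the timestep t.)
eps_net = nn.Conv3d(2, 1, 3, padding=1)

@torch.no_grad()
def sample(cond):
    z = torch.randn(1, 1, 16, 16, 24)                  # latent volume (toy size)
    for t in reversed(range(T)):
        eps = eps_net(torch.cat([z, cond], dim=1))
        # Standard DDPM ancestral update.
        z = (z - betas[t] / torch.sqrt(1.0 - alpha_bar[t]) * eps) / torch.sqrt(alphas[t])
        if t > 0:
            z = z + torch.sqrt(betas[t]) * torch.randn_like(z)
    return z                                            # a VAE decoder would map z to a CT volume

organ_mask = torch.zeros(1, 1, 16, 16, 24)              # toy conditioning input
print(sample(organ_mask).shape)
```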
Citations: 0
Spectral U-Net: Enhancing Medical Image Segmentation via Spectral Decomposition
Pub Date : 2024-09-13 DOI: arxiv-2409.09216
Yaopeng Peng, Milan Sonka, Danny Z. Chen
This paper introduces Spectral U-Net, a novel deep learning network based on spectral decomposition, by exploiting the Dual Tree Complex Wavelet Transform (DTCWT) for down-sampling and the inverse Dual Tree Complex Wavelet Transform (iDTCWT) for up-sampling. We devise the corresponding Wave-Block and iWave-Block, integrated into the U-Net architecture, aiming at mitigating information loss during down-sampling and enhancing detail reconstruction during up-sampling. In the encoder, we first decompose the feature map into high and low-frequency components using DTCWT, enabling down-sampling while mitigating information loss. In the decoder, we utilize iDTCWT to reconstruct higher-resolution feature maps from down-sampled features. Evaluations on the Retina Fluid, Brain Tumor, and Liver Tumor segmentation datasets with the nnU-Net framework demonstrate the superiority of the proposed Spectral U-Net.
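The decompose-then-downsample and reconstruct-then-upsample idea can be illustrated with an ordinary real 2D DWT, used here as a simplified stand-in for the dual-tree complex wavelet transform in the paper:

```python
import numpy as np
import pywt

x = np.random.rand(64, 64)                        # a feature map (toy data)

# "Wave-Block" direction: one DWT level halves the spatial size while keeping
# the high-frequency detail sub-bands instead of discarding them.
low, (lh, hl, hh) = pywt.dwt2(x, "haar")
print(low.shape)                                  # (32, 32): the down-sampled path
# lh, hl, hh carry horizontal / vertical / diagonal detail for the decoder.

# "iWave-Block" direction: the inverse transform restores the full resolution
# from the low-frequency map plus the retained detail sub-bands.
x_rec = pywt.idwt2((low, (lh, hl, hh)), "haar")
print(np.allclose(x_rec, x))                      # True: lossless round trip
```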
Citations: 0