
Latest articles: Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention

MUSCLE: Multi-task Self-supervised Continual Learning to Pre-train Deep Models for X-Ray Images of Multiple Body Parts
Weibin Liao, H. Xiong, Qingzhong Wang, Yan Mo, Xuhong Li, Yi Liu, Zeyu Chen, Siyu Huang, D. Dou
DOI: 10.1007/978-3-031-16452-1_15 · Published: 2023-10-03 · Pages: 151-161
Citations: 6
Self-pruning Graph Neural Network for Predicting Inflammatory Disease Activity in Multiple Sclerosis from Brain MR Images
Chinmay Prabhakar, Hongwei Li, J. Paetzold, T. Loehr, Chen Niu, M. Muhlau, D. Rueckert, B. Wiestler, Bjoern H Menze
Multiple Sclerosis (MS) is a severe neurological disease characterized by inflammatory lesions in the central nervous system. Hence, predicting inflammatory disease activity is crucial for disease assessment and treatment. However, MS lesions can occur throughout the brain and vary in shape, size and total count among patients. The high variance in lesion load and locations makes it challenging for machine learning methods to learn a globally effective representation of whole-brain MRI scans to assess and predict disease. Technically it is non-trivial to incorporate essential biomarkers such as lesion load or spatial proximity. Our work represents the first attempt to utilize graph neural networks (GNN) to aggregate these biomarkers for a novel global representation. We propose a two-stage MS inflammatory disease activity prediction approach. First, a 3D segmentation network detects lesions, and a self-supervised algorithm extracts their image features. Second, the detected lesions are used to build a patient graph. The lesions act as nodes in the graph and are initialized with image features extracted in the first stage. Finally, the lesions are connected based on their spatial proximity and the inflammatory disease activity prediction is formulated as a graph classification task. Furthermore, we propose a self-pruning strategy to auto-select the most critical lesions for prediction. Our proposed method outperforms the existing baseline by a large margin (AUCs of 0.67 vs. 0.61 and 0.66 vs. 0.60 for one-year and two-year inflammatory disease activity, respectively). Finally, our proposed method enjoys inherent explainability by assigning an importance score to each lesion for the overall prediction. Code is available at https://github.com/chinmay5/ms_ida.git
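The two core graph operations described above — connecting lesions by spatial proximity and self-pruning to the most important ones — can be sketched in a few lines. This is a generic illustration, not the authors' implementation; `radius` and `keep_ratio` are assumed hyperparameters:

```python
import math

def build_lesion_graph(centroids, radius):
    """Connect lesion nodes whose 3D centroids lie within `radius` of each other
    (the spatial-proximity edges of the patient graph)."""
    edges = []
    for i in range(len(centroids)):
        for j in range(i + 1, len(centroids)):
            if math.dist(centroids[i], centroids[j]) <= radius:
                edges.append((i, j))
    return edges

def self_prune(importance, keep_ratio=0.5):
    """Keep only the indices of the highest-importance lesions (self-pruning)."""
    k = max(1, int(len(importance) * keep_ratio))
    ranked = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    return sorted(ranked[:k])
```

In the paper the importance scores are learned jointly with the graph classifier; here they are simply given as input.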
DOI: 10.48550/arXiv.2308.16863 · Published: 2023-08-31 · Pages: 226-236
Citations: 1
Self-Supervised Learning for Endoscopic Video Analysis
Roy Hirsch, Mathilde Caron, Regev Cohen, Amir Livne, Ron Shapiro, Tomer Golany, Roman Goldenberg, Daniel Freedman, E. Rivlin
Self-supervised learning (SSL) has led to important breakthroughs in computer vision by allowing learning from large amounts of unlabeled data. As such, it might have a pivotal role to play in biomedicine, where annotating data requires highly specialized expertise. Yet there are many healthcare domains in which SSL has not been extensively explored. One such domain is endoscopy, a family of minimally invasive procedures commonly used to detect and treat infections, chronic inflammatory diseases, or cancer. In this work, we study the use of a leading SSL framework, namely Masked Siamese Networks (MSNs), for endoscopic video analysis such as colonoscopy and laparoscopy. To fully exploit the power of SSL, we create sizable unlabeled endoscopic video datasets for training MSNs. The resulting strong image representations serve as a foundation for secondary training with limited annotated datasets, yielding state-of-the-art performance on endoscopic benchmarks such as surgical phase recognition during laparoscopy and colonoscopic polyp characterization. Additionally, we achieve a 50% reduction in annotated data size without sacrificing performance. Thus, our work provides evidence that SSL can dramatically reduce the need for annotated data in endoscopy.
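To make the "secondary training on frozen representations" idea concrete, here is a deliberately minimal stand-in: a nearest-centroid classifier over fixed feature vectors. The paper fine-tunes real MSN embeddings; this toy (with made-up 2D features and labels) only illustrates how little labeled data such a second stage can need:

```python
import math

def nearest_centroid_predict(feats, labels, query):
    """Assign `query` to the class whose mean feature vector (centroid) is closest.

    A minimal sketch of label-efficient classification on frozen SSL embeddings.
    """
    groups = {}
    for f, y in zip(feats, labels):
        groups.setdefault(y, []).append(f)
    best, best_d = None, float("inf")
    for y, fs in groups.items():
        centroid = [sum(dim) / len(fs) for dim in zip(*fs)]
        d = math.dist(centroid, query)
        if d < best_d:
            best, best_d = y, d
    return best
```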
DOI: 10.48550/arXiv.2308.12394 · Published: 2023-08-23 · Pages: 569-578
Citations: 0
Exploring Unsupervised Cell Recognition with Prior Self-activation Maps
Pingyi Chen, Chenglu Zhu, Zhongyi Shui, Jiatong Cai, S. Zheng, Shichuan Zhang, Lin Yang
The success of supervised deep learning models on cell recognition tasks relies on detailed annotations. Many previous works have managed to reduce the dependency on labels. However, given the large number of cells contained in a patch, costly and inefficient labeling is still inevitable. To this end, we explored label-free methods for cell recognition. Prior self-activation maps (PSMs) are proposed to generate pseudo masks as training targets. Specifically, an activation network is trained with self-supervised learning, and the gradient information in its shallow layers is aggregated to generate the prior self-activation maps. Afterward, a semantic clustering module is introduced as a pipeline to transform PSMs into pixel-level semantic pseudo masks for downstream tasks. We evaluated our method on two histological datasets: MoNuSeg (cell segmentation) and BCData (multi-class cell detection). Compared with other fully-supervised and weakly-supervised methods, our method achieves competitive performance without any manual annotations. Our simple but effective framework can also perform multi-class cell detection, which cannot be done by existing unsupervised methods. The results show the potential of PSMs and may inspire other research to address the hunger for labels in the medical field.
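The step from an activation map to a pixel-level pseudo mask can be sketched with a 1-D two-means clustering over activation values — a toy stand-in for the paper's semantic clustering module, not its actual algorithm:

```python
def pseudo_mask(activations, iters=20):
    """Split a flattened activation map into background (0) / foreground (1)
    by two-means clustering of the activation values."""
    c0, c1 = min(activations), max(activations)  # initialize the two cluster centers
    for _ in range(iters):
        g0 = [a for a in activations if abs(a - c0) <= abs(a - c1)]
        g1 = [a for a in activations if abs(a - c0) > abs(a - c1)]
        if g0:
            c0 = sum(g0) / len(g0)
        if g1:
            c1 = sum(g1) / len(g1)
    # pixels closer to the high-activation center become foreground
    return [int(abs(a - c1) < abs(a - c0)) for a in activations]
```

Reshaping the returned list back to the map's height and width yields the binary pseudo mask used as a training target.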
DOI: 10.48550/arXiv.2308.11144 · Published: 2023-08-22 · Pages: 559-568
Citations: 1
DMCVR: Morphology-Guided Diffusion Model for 3D Cardiac Volume Reconstruction
Xiaoxiao He, Chaowei Tan, Ligong Han, Bo Liu, L. Axel, Kang Li, Dimitris N. Metaxas
Accurate 3D cardiac reconstruction from cine magnetic resonance imaging (cMRI) is crucial for improved cardiovascular disease diagnosis and understanding of the heart's motion. However, current cardiac MRI-based reconstruction technology used in clinical settings is 2D with limited through-plane resolution, resulting in low-quality reconstructed cardiac volumes. To better reconstruct 3D cardiac volumes from sparse 2D image stacks, we propose a morphology-guided diffusion model for 3D cardiac volume reconstruction, DMCVR, that synthesizes high-resolution 2D images and corresponding 3D reconstructed volumes. Our method outperforms previous approaches by conditioning the cardiac morphology on the generative model, eliminating the time-consuming iterative optimization process of the latent code, and improving generation quality. The learned latent spaces provide global semantics, local cardiac morphology and details of each 2D cMRI slice with highly interpretable value to reconstruct 3D cardiac shape. Our experiments show that DMCVR is highly effective in several aspects, such as 2D generation and 3D reconstruction performance. With DMCVR, we can produce high-resolution 3D cardiac MRI reconstructions, surpassing current techniques. Our proposed framework has great potential for improving the accuracy of cardiac disease diagnosis and treatment planning. Code can be accessed at https://github.com/hexiaoxiao-cs/DMCVR.
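To see why the limited through-plane resolution is a problem, consider the naive baseline: linearly interpolating between the sparse acquired slices. DMCVR instead *generates* the missing anatomy with a diffusion model; the sketch below only illustrates the problem setup it improves upon:

```python
import numpy as np

def upsample_through_plane(stack, factor):
    """Naive baseline: linear interpolation between adjacent 2D slices
    along the through-plane axis (axis 0) of a cMRI stack."""
    n = stack.shape[0]
    zq = np.linspace(0, n - 1, (n - 1) * factor + 1)  # query slice positions
    out = np.empty((len(zq),) + stack.shape[1:], dtype=float)
    for k, z in enumerate(zq):
        i = int(np.floor(z))
        j = min(i + 1, n - 1)
        t = z - i  # fractional distance between the two bracketing slices
        out[k] = (1 - t) * stack[i] + t * stack[j]
    return out
```

Linear blending of this kind smears cardiac boundaries, which is exactly the low-quality reconstruction the abstract refers to.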
DOI: 10.48550/arXiv.2308.09223 · Published: 2023-08-18 · Pages: 132-142
Citations: 0
Revolutionizing Space Health (Swin-FSR): Advancing Super-Resolution of Fundus Images for SANS Visual Assessment Technology
Khondker Fariha Hossain, S. Kamran, Joshua Ong, Andrew Lee, A. Tavakkoli
The rapid accessibility of portable and affordable retinal imaging devices has made early differential diagnosis easier. For example, color funduscopy imaging is readily available in remote villages, where it can help to identify diseases like age-related macular degeneration (AMD), glaucoma, or pathological myopia (PM). Astronauts at the International Space Station likewise use such cameras for identifying spaceflight-associated neuro-ocular syndrome (SANS). However, due to the unavailability of experts in these locations, the data must be transferred to an urban healthcare facility (AMD and glaucoma) or a terrestrial station (e.g., SANS) for more precise disease identification. Moreover, because of low bandwidth limits, the imaging data must be compressed for transfer between these two places. Different super-resolution algorithms have been proposed over the years to address this. Furthermore, with the advent of deep learning, the field has advanced so much that 2x- and 4x-compressed images can be decompressed to their original form without losing spatial information. In this paper, we introduce a novel model called Swin-FSR that utilizes a Swin Transformer with spatial and depth-wise attention for fundus image super-resolution. Our architecture achieves peak signal-to-noise ratios (PSNR) of 47.89, 49.00, and 45.32 on three public datasets, namely iChallenge-AMD, iChallenge-PM, and G1020. Additionally, we tested the model's effectiveness on a privately held SANS dataset provided by NASA and achieved results comparable to previous architectures.
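The PSNR figures quoted above are computed with the standard definition, shown here for reference (this is the generic metric, not code from the paper):

```python
import numpy as np

def psnr(reference, reconstructed, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 10*log10(data_range^2 / MSE).

    `data_range` is the maximum possible pixel value (1.0 for normalized
    images, 255 for 8-bit images)."""
    ref = np.asarray(reference, dtype=float)
    rec = np.asarray(reconstructed, dtype=float)
    mse = np.mean((ref - rec) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)
```

A PSNR near 48 dB, as reported for Swin-FSR, corresponds to a mean squared error on the order of 1e-5 for images normalized to [0, 1].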
DOI: 10.48550/arXiv.2308.06332 · Published: 2023-08-11 · Pages: 693-703
Citations: 0
M&M: Tackling False Positives in Mammography with a Multi-view and Multi-instance Learning Sparse Detector
Yen Nhi Truong Vu, Dan Guo, Ahmed Taha, Jason Su, Thomas P. Matthews
Deep-learning-based object detection methods show promise for improving screening mammography, but high rates of false positives can hinder their effectiveness in clinical practice. To reduce false positives, we identify three challenges: (1) unlike natural images, a malignant mammogram typically contains only one malignant finding; (2) mammography exams contain two views of each breast, and both views ought to be considered to make a correct assessment; (3) most mammograms are negative and do not contain any findings. In this work, we tackle the three aforementioned challenges by: (1) leveraging Sparse R-CNN and showing that sparse detectors are more appropriate than dense detectors for mammography; (2) including a multi-view cross-attention module to synthesize information from different views; (3) incorporating multi-instance learning (MIL) to train with unannotated images and perform breast-level classification. The resulting model, M&M, is a Multi-view and Multi-instance learning system that can both localize malignant findings and provide breast-level predictions. We validate M&M's detection and classification performance using five mammography datasets. In addition, we demonstrate the effectiveness of each proposed component through comprehensive ablation studies.
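The breast-level prediction from box-level detections is a classic multi-instance learning aggregation. A common MIL choice (used here as an assumed illustration, not necessarily the paper's exact pooling) is max-pooling over all box scores from both views:

```python
def breast_level_score(view_box_scores):
    """MIL-style aggregation: a breast's malignancy score is the maximum
    box score across all detections in both views; an exam with no
    detections is scored negative (0.0)."""
    return max((s for view in view_box_scores for s in view), default=0.0)
```

Max-pooling matches challenge (1) in the abstract: a malignant mammogram typically contains only one malignant finding, so a single confident box should dominate the exam-level score.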
DOI: 10.48550/arXiv.2308.06420 · Published: 2023-08-11 · Pages: 778-788
Citations: 0
TriDo-Former: A Triple-Domain Transformer for Direct PET Reconstruction from Low-Dose Sinograms
Jiaqi Cui, Pinxian Zeng, Xinyi Zeng, Peng Wang, Xi Wu, Jiliu Zhou, Yan Wang, Dinggang Shen
To obtain high-quality positron emission tomography (PET) images while minimizing radiation exposure, various methods have been proposed for reconstructing standard-dose PET (SPET) images directly from low-dose PET (LPET) sinograms. However, current methods often neglect boundaries during sinogram-to-image reconstruction, resulting in high-frequency distortion in the frequency domain and diminished or fuzzy edges in the reconstructed images. Furthermore, the commonly used convolutional architectures lack the ability to model long-range non-local interactions, potentially leading to inaccurate representations of global structures. To alleviate these problems, we propose a transformer-based model that unites the three domains of sinogram, image, and frequency for direct PET reconstruction, namely TriDo-Former. Specifically, the TriDo-Former consists of two cascaded networks: a sinogram enhancement transformer (SE-Former) for denoising the input LPET sinograms, and a spatial-spectral reconstruction transformer (SSR-Former) for reconstructing SPET images from the denoised sinograms. Unlike the vanilla transformer, which splits an image into 2D patches, our SE-Former builds on the PET imaging mechanism and divides the sinogram into 1D projection view angles to maintain its inner structure while denoising, preventing noise in the sinogram from propagating into the image domain. Moreover, to mitigate high-frequency distortion and improve reconstruction details, we integrate global frequency parsers (GFPs) into the SSR-Former. The GFP serves as a learnable frequency filter that globally adjusts the frequency components in the frequency domain, pushing the network to restore high-frequency details resembling real SPET images. Validations on a clinical dataset demonstrate that our TriDo-Former outperforms state-of-the-art methods both qualitatively and quantitatively.
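The mechanics of a global frequency filter are simple to sketch: transform the image to the Fourier domain, scale each coefficient by a per-frequency gain, and transform back. In the paper's GFP those gains are learned end-to-end; here they are just a fixed array:

```python
import numpy as np

def global_frequency_filter(img, gains):
    """Scale each 2D Fourier coefficient by a per-frequency gain, then invert.

    `gains` has the same shape as `img`; gains > 1 at high frequencies
    amplify fine detail, gains < 1 suppress it."""
    spec = np.fft.fft2(img)
    filtered = np.real(np.fft.ifft2(spec * gains))
    return filtered
```

With all gains equal to 1 the filter is the identity, which makes it easy to verify the round trip; a learned GFP departs from 1 exactly where the network needs to restore high-frequency detail.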
{"title":"TriDo-Former: A Triple-Domain Transformer for Direct PET Reconstruction from Low-Dose Sinograms","authors":"Jiaqi Cui, Pinxian Zeng, Xinyi Zeng, Peng Wang, Xi Wu, Jiliu Zhou, Yan Wang, Dinggang Shen","doi":"10.48550/arXiv.2308.05365","DOIUrl":"https://doi.org/10.48550/arXiv.2308.05365","url":null,"abstract":"To obtain high-quality positron emission tomography (PET) images while minimizing radiation exposure, various methods have been proposed for reconstructing standard-dose PET (SPET) images from low-dose PET (LPET) sinograms directly. However, current methods often neglect boundaries during sinogram-to-image reconstruction, resulting in high-frequency distortion in the frequency domain and diminished or fuzzy edges in the reconstructed images. Furthermore, the convolutional architectures, which are commonly used, lack the ability to model long-range non-local interactions, potentially leading to inaccurate representations of global structures. To alleviate these problems, we propose a transformer-based model that unites triple domains of sinogram, image, and frequency for direct PET reconstruction, namely TriDo-Former. Specifically, the TriDo-Former consists of two cascaded networks, i.e., a sinogram enhancement transformer (SE-Former) for denoising the input LPET sinograms and a spatial-spectral reconstruction transformer (SSR-Former) for reconstructing SPET images from the denoised sinograms. Different from the vanilla transformer that splits an image into 2D patches, based specifically on the PET imaging mechanism, our SE-Former divides the sinogram into 1D projection view angles to maintain its inner-structure while denoising, preventing the noise in the sinogram from prorogating into the image domain. Moreover, to mitigate high-frequency distortion and improve reconstruction details, we integrate global frequency parsers (GFPs) into SSR-Former. 
The GFP serves as a learnable frequency filter that globally adjusts the frequency components in the frequency domain, enforcing the network to restore high-frequency details resembling real SPET images. Validations on a clinical dataset demonstrate that our TriDo-Former outperforms the state-of-the-art methods qualitatively and quantitatively.","PeriodicalId":18289,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"91 1","pages":"184-194"},"PeriodicalIF":0.0,"publicationDate":"2023-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78035584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
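The global frequency parser described in this abstract amounts to an element-wise, learnable filter applied between a forward and an inverse Fourier transform. A minimal stdlib-only sketch of that mechanism follows — a naive DFT stands in for an FFT and hand-picked weights stand in for learned ones; this is an illustration of the idea, not the authors' implementation:

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse DFT, returning a complex sequence."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def global_frequency_filter(x, weights):
    """Frequency-domain filtering as in a GFP-style module: transform,
    reweight each frequency bin, transform back to the signal domain."""
    X = dft(x)
    X_filtered = [w * c for w, c in zip(weights, X)]
    return [c.real for c in idft(X_filtered)]

# Toy alternating signal: all of its energy sits at the Nyquist bin (k = n/2).
signal = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

# Identity weights give a round trip that recovers the signal.
identity = [1.0] * len(signal)
roundtrip = global_frequency_filter(signal, identity)
print([round(v, 6) for v in roundtrip])  # → [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

# Keeping only the Nyquist bin also recovers this particular signal,
# since it has no energy at any other frequency.
hi_pass = [1.0 if k == len(signal) // 2 else 0.0 for k in range(len(signal))]
print([round(v, 6) for v in global_frequency_filter(signal, hi_pass)])
```

In the paper's setting the per-bin weights would be trainable parameters applied to 2D feature-map spectra rather than a fixed 1D mask, but the transform–reweight–inverse-transform structure is the same.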
Cited by: 0
Discrepancy-Based Active Learning for Weakly Supervised Bleeding Segmentation in Wireless Capsule Endoscopy Images
Fan Bai, Xiaohan Xing, Yutian Shen, Han Ma, Max Q.-H. Meng
{"title":"Discrepancy-Based Active Learning for Weakly Supervised Bleeding Segmentation in Wireless Capsule Endoscopy Images","authors":"Fan Bai, Xiaohan Xing, Yutian Shen, Han Ma, Max Q.-H. Meng","doi":"10.1007/978-3-031-16452-1_3","DOIUrl":"https://doi.org/10.1007/978-3-031-16452-1_3","url":null,"abstract":"","PeriodicalId":18289,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"2 1","pages":"24-34"},"PeriodicalIF":0.0,"publicationDate":"2023-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86075772","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 6
An Interpretable and Attention-based Method for Gaze Estimation Using Electroencephalography
Nina Weng, M. Płomecka, Manuel Kaufmann, Ard Kastrati, Roger Wattenhofer, N. Langer
Eye movements can reveal valuable insights into various aspects of human mental processes, physical well-being, and actions. Recently, several datasets have been made available that simultaneously record EEG activity and eye movements. This has triggered the development of various methods to predict gaze direction based on brain activity. However, most of these methods lack interpretability, which limits their technology acceptance. In this paper, we leverage a large data set of simultaneously measured Electroencephalography (EEG) and Eye tracking, proposing an interpretable model for gaze estimation from EEG data. More specifically, we present a novel attention-based deep learning framework for EEG signal analysis, which allows the network to focus on the most relevant information in the signal and discard problematic channels. Additionally, we provide a comprehensive evaluation of the presented framework, demonstrating its superiority over current methods in terms of accuracy and robustness. Finally, the study presents visualizations that explain the results of the analysis and highlights the potential of the attention mechanism for improving the efficiency and effectiveness of EEG data analysis in a variety of applications.
{"title":"An Interpretable and Attention-based Method for Gaze Estimation Using Electroencephalography","authors":"Nina Weng, M. Płomecka, Manuel Kaufmann, Ard Kastrati, Roger Wattenhofer, N. Langer","doi":"10.48550/arXiv.2308.05768","DOIUrl":"https://doi.org/10.48550/arXiv.2308.05768","url":null,"abstract":"Eye movements can reveal valuable insights into various aspects of human mental processes, physical well-being, and actions. Recently, several datasets have been made available that simultaneously record EEG activity and eye movements. This has triggered the development of various methods to predict gaze direction based on brain activity. However, most of these methods lack interpretability, which limits their technology acceptance. In this paper, we leverage a large data set of simultaneously measured Electroencephalography (EEG) and Eye tracking, proposing an interpretable model for gaze estimation from EEG data. More specifically, we present a novel attention-based deep learning framework for EEG signal analysis, which allows the network to focus on the most relevant information in the signal and discard problematic channels. Additionally, we provide a comprehensive evaluation of the presented framework, demonstrating its superiority over current methods in terms of accuracy and robustness. Finally, the study presents visualizations that explain the results of the analysis and highlights the potential of attention mechanism for improving the efficiency and effectiveness of EEG data analysis in a variety of applications.","PeriodicalId":18289,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... 
International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"1 1","pages":"734-743"},"PeriodicalIF":0.0,"publicationDate":"2023-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89691423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
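The channel-attention idea in this abstract — score each EEG channel, normalize the scores, and down-weight or discard problematic channels before pooling — can be sketched as follows. The query vector, drop threshold, and pooling scheme here are illustrative assumptions, not the paper's architecture:

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend_channels(eeg, query, drop_below=0.1):
    """Attention-style pooling over EEG channels.

    eeg   : list of channels, each a list of time samples
    query : hypothetical learned vector, one weight per time sample
    Channels whose attention weight falls below `drop_below` are zeroed
    out, mimicking the 'discard problematic channels' behaviour."""
    scores = [sum(q * s for q, s in zip(query, ch)) for ch in eeg]
    weights = softmax(scores)
    kept = [w if w >= drop_below else 0.0 for w in weights]
    pooled = [sum(kept[c] * eeg[c][t] for c in range(len(eeg)))
              for t in range(len(eeg[0]))]
    return pooled, weights

# Three toy channels over four samples; channel 2 is a noisy outlier.
eeg = [[0.5, 0.6, 0.4, 0.5],
       [0.4, 0.5, 0.5, 0.6],
       [5.0, -5.0, 5.0, -5.0]]
query = [1.0, 1.0, 1.0, 1.0]
pooled, weights = attend_channels(eeg, query)
print(weights)   # the noisy alternating channel receives the lowest weight
print(pooled)    # pooled signal built from the two kept channels only
```

Because the attention weights sum to one and are computed per channel, they double as the interpretability signal the abstract mentions: inspecting `weights` shows which channels the model relied on.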
Cited by: 0