
Computer vision - ECCV ... : ... European Conference on Computer Vision : proceedings. European Conference on Computer Vision: latest publications

Architecture-Agnostic Untrained Network Priors for Image Reconstruction with Frequency Regularization.
Yilin Liu, Yunkui Pang, Jiang Li, Yong Chen, Pew-Thian Yap

Untrained networks inspired by deep image priors have shown promising capabilities in recovering high-quality images from noisy or partial measurements without requiring training sets. Their success is widely attributed to implicit regularization due to the spectral bias of suitable network architectures. However, the application of such network-based priors often entails superfluous architectural decisions, risks of overfitting, and lengthy optimization processes, all of which hinder their practicality. To address these challenges, we propose efficient architecture-agnostic techniques to directly modulate the spectral bias of network priors: 1) bandwidth-constrained input, 2) bandwidth-controllable upsamplers, and 3) Lipschitz-regularized convolutional layers. We show that, with just a few lines of code, we can reduce overfitting in underperforming architectures and close performance gaps with high-performing counterparts, minimizing the need for extensive architecture tuning. This makes it possible to employ a more compact model to achieve performance similar or superior to that of larger models while reducing runtime. Demonstrated on an inpainting-like MRI reconstruction task, our results show for the first time that architectural biases, overfitting, and runtime issues of untrained network priors can be simultaneously addressed without architectural modifications. Our code is publicly available.
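The third technique above, Lipschitz-regularized convolutional layers, can be illustrated with a standard spectral-norm constraint on the kernel. A minimal NumPy sketch (power iteration on the flattened kernel; this is not the authors' implementation, and a convolution's true Lipschitz constant also depends on padding and stride, which are ignored here):

```python
import numpy as np

def spectral_norm(weight, n_iters=50):
    """Estimate the largest singular value of a conv kernel flattened to
    an (out_channels, in_channels*k*k) matrix, via power iteration."""
    w = weight.reshape(weight.shape[0], -1)
    u = np.random.default_rng(0).standard_normal(w.shape[0])
    for _ in range(n_iters):
        v = w.T @ u
        v /= np.linalg.norm(v) + 1e-12
        u = w @ v
        u /= np.linalg.norm(u) + 1e-12
    return float(u @ w @ v)

def lipschitz_constrain(weight, max_norm=1.0):
    """Rescale the kernel so the spectral norm of its matrix form stays
    below max_norm, bounding the Lipschitz constant of the layer's
    linear map (a sketch, ignoring padding/stride effects)."""
    sigma = spectral_norm(weight)
    return weight * (max_norm / sigma) if sigma > max_norm else weight

kernel = np.random.default_rng(1).standard_normal((16, 8, 3, 3))
constrained = lipschitz_constrain(kernel, max_norm=1.0)
```

Applied after each optimizer step, such a rescaling caps how fast the prior can fit high-frequency noise, which is the overfitting mechanism the abstract targets.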

Computer Vision - ECCV: European Conference on Computer Vision, vol. 15072, pp. 341-358 (2025). DOI: 10.1007/978-3-031-72630-9_20. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11670387/pdf/
Citations: 0
DEPICT: Diffusion-Enabled Permutation Importance for Image Classification Tasks.
Sarah Jabbour, Gregory Kondas, Ella Kazerooni, Michael Sjoding, David Fouhey, Jenna Wiens

We propose a permutation-based explanation method for image classifiers. Current image-model explanations like activation maps are limited to instance-based explanations in the pixel space, making it difficult to understand global model behavior. In contrast, permutation-based explanations for tabular data classifiers measure feature importance by comparing model performance on data before and after permuting a feature. We propose an explanation method for image-based models that permutes interpretable concepts across dataset images. Given a dataset of images labeled with specific concepts like captions, we permute a concept across examples in the text space and then generate images via a text-conditioned diffusion model. Feature importance is then reflected by the change in model performance relative to unpermuted data. When applied to a set of concepts, the method generates a ranking of feature importance. We show this approach recovers underlying model feature importance on synthetic and real-world image classification tasks.
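The scoring rule underlying the method is the classic permutation-importance comparison. A minimal tabular stand-in (DEPICT itself permutes concepts in text space and re-renders images with a diffusion model; this sketch only shows the before/after accuracy comparison):

```python
import numpy as np

def permutation_importance(predict, X, y, concept_idx, seed=0):
    """Importance of one concept column = drop in accuracy after
    permuting that column across the dataset."""
    rng = np.random.default_rng(seed)
    base_acc = np.mean(predict(X) == y)
    Xp = X.copy()
    Xp[:, concept_idx] = rng.permutation(Xp[:, concept_idx])
    perm_acc = np.mean(predict(Xp) == y)
    return base_acc - perm_acc

# Toy classifier that only uses column 0: permuting column 0 destroys
# accuracy, while permuting column 1 leaves it unchanged.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 2))
y = (X[:, 0] > 0).astype(int)
predict = lambda Z: (Z[:, 0] > 0).astype(int)

imp0 = permutation_importance(predict, X, y, 0)
imp1 = permutation_importance(predict, X, y, 1)
```

Ranking concepts by this importance score gives the global explanation described in the abstract.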

Computer Vision - ECCV: European Conference on Computer Vision, vol. 15122, pp. 35-51 (2025). DOI: 10.1007/978-3-031-73039-9_3. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12199212/pdf/
Citations: 0
Bridging the Pathology Domain Gap: Efficiently Adapting CLIP for Pathology Image Analysis with Limited Labeled Data.
Zhengfeng Lai, Joohi Chauhan, Brittany N Dugger, Chen-Nee Chuah

Contrastive Language-Image Pre-training (CLIP) has shown its proficiency in acquiring distinctive visual representations and exhibiting strong generalization across diverse vision tasks. However, its effectiveness in pathology image analysis, particularly with limited labeled data, remains an ongoing area of investigation due to challenges associated with significant domain shifts and catastrophic forgetting. Thus, it is imperative to devise efficient adaptation strategies in this domain to enable scalable analysis. In this study, we introduce Path-CLIP, a framework tailored for swift adaptation of CLIP to various pathology tasks. Firstly, we propose Residual Feature Refinement (RFR) with a dynamically adjustable ratio to effectively integrate and balance source and task-specific knowledge. Secondly, we introduce Hidden Representation Perturbation (HRP) and Dual-view Vision Contrastive (DVC) techniques to mitigate overfitting issues. Finally, we present the Doublet Multimodal Contrastive Loss (DMCL) for fine-tuning CLIP for pathology tasks. We demonstrate that Path-CLIP adeptly adapts pre-trained CLIP to downstream pathology tasks, yielding competitive results. Specifically, Path-CLIP achieves over +19% improvement in accuracy when using a mere 0.1% of the labeled data in PCam with only 10 minutes of fine-tuning on a single GPU.
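The Residual Feature Refinement idea, balancing frozen source knowledge against task-specific knowledge with an adjustable ratio, can be sketched as a residual mix. This is an assumption about the general form, not the paper's exact parameterization, and the tiny ReLU adapter is hypothetical:

```python
import numpy as np

def residual_feature_refinement(feat, adapter, ratio):
    """Blend frozen source (CLIP) features with task-specific adapter
    output via a tunable ratio: ratio=0 keeps pure source knowledge,
    larger ratios lean on the task-specific branch."""
    return (1.0 - ratio) * feat + ratio * adapter(feat)

rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512)) * 0.01
adapter = lambda f: np.maximum(f @ W, 0.0)   # hypothetical ReLU adapter

feat = rng.standard_normal(512)              # stand-in for a CLIP image feature
source_only = residual_feature_refinement(feat, adapter, ratio=0.0)
mixed = residual_feature_refinement(feat, adapter, ratio=0.3)
```

Keeping the frozen features in the residual path is what guards against the catastrophic forgetting the abstract mentions.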

Computer Vision - ECCV: European Conference on Computer Vision, vol. 15122, pp. 256-273 (2025). DOI: 10.1007/978-3-031-73039-9_15. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11949240/pdf/
Citations: 0
Zero-Shot Adaptation for Approximate Posterior Sampling of Diffusion Models in Inverse Problems.
Yaşar Utku Alçalar, Mehmet Akçakaya

Diffusion models have emerged as powerful generative techniques for solving inverse problems. Despite their success in a variety of inverse problems in imaging, these models require many steps to converge, leading to slow inference time. Recently, there has been a trend in diffusion models toward employing sophisticated noise schedules that involve more frequent iterations of timesteps at lower noise levels, thereby improving image generation and convergence speed. However, applying these ideas to solving inverse problems with diffusion models remains challenging, as these noise schedules do not perform well when using empirical tuning for the forward model log-likelihood term weights. To tackle these challenges, we propose zero-shot approximate posterior sampling (ZAPS), which leverages connections to zero-shot physics-driven deep learning. ZAPS fixes the number of sampling steps, and uses zero-shot training with a physics-guided loss function to learn log-likelihood weights at each irregular timestep. We apply ZAPS to the recently proposed diffusion posterior sampling method as the baseline, though ZAPS can also be used with other posterior sampling diffusion models. We further approximate the Hessian of the logarithm of the prior using a diagonalization approach with learnable diagonal entries for computational efficiency. These parameters are optimized over a fixed number of epochs with a given computational budget. Our results for various noisy inverse problems, including Gaussian and motion deblurring, inpainting, and super-resolution, show that ZAPS reduces inference time, provides robustness to irregular noise schedules, and improves reconstruction quality. Code is available at https://github.com/ualcalar17/ZAPS.
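The learned log-likelihood weights enter as step sizes on the data-consistency gradient at each timestep. A minimal sketch of that single update on a toy linear inverse problem (the diffusion-prior update between steps, the noise schedule, and the physics-guided training of `log_w` are all omitted; the well-conditioned forward operator is an illustrative assumption):

```python
import numpy as np

def weighted_dc_step(x, A, y, log_w):
    """One data-consistency update weighted by exp(log_w), the
    per-timestep quantity ZAPS learns with a physics-guided loss."""
    grad = A.T @ (A @ x - y)          # gradient of 0.5 * ||Ax - y||^2
    return x - np.exp(log_w) * grad

rng = np.random.default_rng(0)
A, _ = np.linalg.qr(rng.standard_normal((20, 10)))  # toy operator with A^T A = I
x_true = rng.standard_normal(10)
y = A @ x_true                        # noiseless measurements

x = np.zeros(10)
for _ in range(100):                  # fixed number of sampling steps
    x = weighted_dc_step(x, A, y, log_w=np.log(0.5))
```

Learning a separate `log_w` per timestep is what lets the method cope with irregular noise schedules instead of relying on a single hand-tuned weight.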

Computer Vision - ECCV: European Conference on Computer Vision, vol. 15141, pp. 444-460 (2025). DOI: 10.1007/978-3-031-73010-8_26. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11736016/pdf/
Citations: 0
DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification.
Wenhui Zhu, Xiwen Chen, Peijie Qiu, Aristeidis Sotiras, Abolfazl Razi, Yalin Wang

Multiple instance learning (MIL) stands as a powerful approach in weakly supervised learning, regularly employed in histological whole slide image (WSI) classification for detecting tumorous lesions. However, existing mainstream MIL methods focus on modeling correlation between instances while overlooking the inherent diversity among instances. The few MIL methods that have aimed at diversity modeling empirically show inferior performance and a high computational cost. To bridge this gap, we propose a novel MIL aggregation method based on diverse global representation (DGR-MIL), modeling diversity among instances through a set of global vectors that serve as a summary of all instances. First, we turn the instance correlation into the similarity between instance embeddings and the predefined global vectors through a cross-attention mechanism. This stems from the fact that similar instance embeddings typically result in a higher correlation with a certain global vector. Second, we propose two mechanisms to enforce the diversity among the global vectors to be more descriptive of the entire bag: (i) positive instance alignment and (ii) a novel, efficient, and theoretically guaranteed diversification learning paradigm. Specifically, the positive instance alignment module encourages the global vectors to align with the center of positive instances (e.g., instances containing tumors in WSI). To further diversify the global representations, we propose a novel diversification learning paradigm leveraging the determinantal point process. The proposed model outperforms the state-of-the-art MIL aggregation models by a substantial margin on the CAMELYON-16 and the TCGA-lung cancer datasets. The code is available at https://github.com/ChongQingNoSubway/DGR-MIL.
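The cross-attention step, global vectors querying the bag of instance embeddings, can be sketched in a few lines. This is a bare-bones illustration of the aggregation only; the learned projections, multiple heads, and the two diversification mechanisms from the paper are omitted:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_pool(instances, global_vecs):
    """Each global vector queries the bag: the attention weights encode
    instance-to-global similarity, and the weighted sum gives one bag
    summary per global vector."""
    scale = 1.0 / np.sqrt(instances.shape[-1])
    attn = softmax(global_vecs @ instances.T * scale, axis=-1)  # (G, N)
    return attn @ instances                                     # (G, d)

rng = np.random.default_rng(0)
bag = rng.standard_normal((100, 64))         # N=100 instance embeddings in one WSI bag
global_vecs = rng.standard_normal((4, 64))   # G=4 learnable global vectors
summary = cross_attention_pool(bag, global_vecs)
```

Because each row of the attention matrix sums to one, each summary vector is a convex combination of instances; diverse global vectors then attend to different subsets of the bag.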

Computer Vision - ECCV: European Conference on Computer Vision, vol. 15096, pp. 333-351 (2025). DOI: 10.1007/978-3-031-72920-1_19. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12425359/pdf/
Citations: 0
AddBiomechanics Dataset: Capturing the Physics of Human Motion at Scale.
Keenon Werling, Janelle Kaneda, Tian Tan, Rishi Agarwal, Six Skov, Tom Van Wouwe, Scott Uhlrich, Nicholas Bianco, Carmichael Ong, Antoine Falisse, Shardul Sapkota, Aidan Chandra, Joshua Carter, Ezio Preatoni, Benjamin Fregly, Jennifer Hicks, Scott Delp, C Karen Liu

While reconstructing human poses in 3D from inexpensive sensors has advanced significantly in recent years, quantifying the dynamics of human motion, including the muscle-generated joint torques and external forces, remains a challenge. Prior attempts to estimate physics from reconstructed human poses have been hampered by a lack of datasets with high-quality pose and force data for a variety of movements. We present the AddBiomechanics Dataset 1.0, which includes physically accurate human dynamics of 273 human subjects, over 70 hours of motion and force plate data, totaling more than 24 million frames. To construct this dataset, novel analytical methods were required, which are also reported here. We propose a benchmark for estimating human dynamics from motion using this dataset, and present several baseline results. The AddBiomechanics Dataset is publicly available at addbiomechanics.org/download_data.html.

Computer Vision - ECCV: European Conference on Computer Vision, vol. 15146, pp. 490-508 (2025). DOI: 10.1007/978-3-031-73223-2_27. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11948690/pdf/
Citations: 0
Task-Driven Uncertainty Quantification in Inverse Problems via Conformal Prediction.
Jeffrey Wen, Rizwan Ahmad, Philip Schniter

In imaging inverse problems, one seeks to recover an image from missing/corrupted measurements. Because such problems are ill-posed, there is great motivation to quantify the uncertainty induced by the measurement-and-recovery process. Motivated by applications where the recovered image is used for a downstream task, such as soft-output classification, we propose a task-centered approach to uncertainty quantification. In particular, we use conformal prediction to construct an interval that is guaranteed to contain the task output from the true image up to a user-specified probability, and we use the width of that interval to quantify the uncertainty contributed by measurement-and-recovery. For posterior-sampling-based image recovery, we construct locally adaptive prediction intervals. Furthermore, we propose to collect measurements over multiple rounds, stopping as soon as the task uncertainty falls below an acceptable level. We demonstrate our methodology on accelerated magnetic resonance imaging (MRI): https://github.com/jwen307/TaskUQ.
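The interval-construction step follows the standard split conformal recipe: compute nonconformity scores on a calibration set, take a finite-sample-corrected quantile, and widen the test prediction by that amount. A minimal sketch on synthetic scalars (the paper's locally adaptive, posterior-sampling-based intervals and multi-round stopping rule are not shown; the Gaussian-noise calibration data here is purely illustrative):

```python
import numpy as np

def conformal_interval(cal_preds, cal_truth, test_preds, alpha=0.1):
    """Split conformal prediction: under exchangeability, the returned
    interval contains the task output on the true image with marginal
    probability >= 1 - alpha."""
    scores = np.sort(np.abs(cal_preds - cal_truth))     # nonconformity scores
    n = len(scores)
    idx = min(int(np.ceil((n + 1) * (1 - alpha))) - 1, n - 1)
    qhat = scores[idx]                                  # conformal quantile
    return test_preds - qhat, test_preds + qhat

rng = np.random.default_rng(0)
truth = rng.uniform(0, 1, 500)             # task output on true images
preds = truth + rng.normal(0, 0.05, 500)   # task output on recovered images
lo, hi = conformal_interval(preds[:400], truth[:400], preds[400:])
coverage = np.mean((truth[400:] >= lo) & (truth[400:] <= hi))
```

The interval width `2 * qhat` is then the task-centered uncertainty measure: the more the recovery degrades the task output, the larger the calibration residuals and the wider the guaranteed interval.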

Computer Vision - ECCV: European Conference on Computer Vision, vol. 15118, pp. 182-199 (2025). DOI: 10.1007/978-3-031-73027-6_11. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12109201/pdf/
Citations: 0
Semi-supervised Segmentation of Histopathology Images with Noise-Aware Topological Consistency.
Meilong Xu, Xiaoling Hu, Saumya Gupta, Shahira Abousamra, Chao Chen

In digital pathology, segmenting densely distributed objects like glands and nuclei is crucial for downstream analysis. Since detailed pixel-wise annotations are very time-consuming, we need semi-supervised segmentation methods that can learn from unlabeled images. Existing semi-supervised methods are often prone to topological errors, e.g., missing or incorrectly merged/separated glands or nuclei. To address this issue, we propose TopoSemiSeg, the first semi-supervised method that learns the topological representation from unlabeled histopathology images. The major challenge is that, for unlabeled images, we only have predictions carrying noisy topology. To this end, we introduce a noise-aware topological consistency loss to align the representations of a teacher and a student model. By decomposing the topology of the prediction into signal topology and noisy topology, we ensure that the models learn the true topological signals and become robust to noise. Extensive experiments on public histopathology image datasets show the superiority of our method, especially on topology-aware evaluation metrics. Code is available at https://github.com/Melon-Xu/TopoSemiSeg.
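The teacher-student backbone of such methods is the standard mean-teacher setup: an EMA teacher supervises the student on unlabeled images through a consistency loss. A minimal sketch, where a simple confidence mask stands in for the paper's noise-aware decomposition (the real method separates signal from noisy topology with persistent homology, which is not reproduced here):

```python
import numpy as np

def ema_update(teacher, student, momentum=0.99):
    """Mean-teacher update: teacher weights track an exponential moving
    average of the student weights."""
    return {k: momentum * teacher[k] + (1 - momentum) * student[k]
            for k in teacher}

def masked_consistency_loss(student_prob, teacher_prob, conf_thresh=0.8):
    """Pixel-wise consistency on unlabeled images, restricted to
    confident teacher predictions. The confidence mask is only a crude
    stand-in for the paper's signal-vs-noise topology decomposition."""
    conf = np.maximum(teacher_prob, 1.0 - teacher_prob)
    mask = conf >= conf_thresh
    if not mask.any():
        return 0.0
    return float(np.mean((student_prob[mask] - teacher_prob[mask]) ** 2))

rng = np.random.default_rng(0)
teacher_w = {"conv1": rng.standard_normal((3, 3))}
student_w = {"conv1": teacher_w["conv1"] + 0.1}
teacher_w = ema_update(teacher_w, student_w)

student_prob = rng.uniform(0, 1, (64, 64))   # foreground probability maps
teacher_prob = np.clip(student_prob + rng.normal(0, 0.05, (64, 64)), 0, 1)
loss = masked_consistency_loss(student_prob, teacher_prob)
```

The paper's contribution replaces the naive pixel mask with a topological one, so that consistency is enforced only on structures (e.g., whole glands or nuclei) the teacher predicts reliably.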

Computer vision - ECCV ... : ... European Conference on Computer Vision : proceedings. European Conference on Computer Vision, vol. 15136, pp. 271-289, 2024. DOI: 10.1007/978-3-031-73229-4_16
Citations: 0
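The teacher-student consistency training described above can be sketched as follows. This is a stand-in under stated assumptions: the mean-teacher EMA update is the standard formulation, and `signal_mask` is a hypothetical binary array standing in for TopoSemiSeg's decomposition of predicted topology into signal and noise (which the paper derives with topological analysis, not shown here).

```python
import numpy as np

def ema_update(teacher_w, student_w, momentum=0.99):
    # Mean-teacher update: the teacher's weights track an exponential
    # moving average of the student's, yielding stabler pseudo-targets.
    return momentum * np.asarray(teacher_w) + (1.0 - momentum) * np.asarray(student_w)

def noise_aware_consistency(student_prob, teacher_prob, signal_mask):
    # Plain MSE consistency between student and teacher predictions,
    # restricted by `signal_mask` to regions treated as true topological
    # signal; regions flagged as noisy topology contribute nothing.
    sq = (np.asarray(student_prob) - np.asarray(teacher_prob)) ** 2
    mask = np.asarray(signal_mask, dtype=float)
    return float((sq * mask).sum() / max(mask.sum(), 1.0))
```

Masking the consistency term is what makes the loss "noise-aware": the student is pulled toward the teacher only where the teacher's predicted topology is trusted.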
Dual-Stream Knowledge-Preserving Hashing for Unsupervised Video Retrieval
P. Li, Hongtao Xie, Jiannan Ge, Lei Zhang, Shaobo Min, Yongdong Zhang
Computer vision - ECCV ... : ... European Conference on Computer Vision : proceedings. European Conference on Computer Vision, pp. 181-197, 2023. DOI: 10.1007/978-3-031-19781-9_11
Citations: 9
Spatial and Visual Perspective-Taking via View Rotation and Relation Reasoning for Embodied Reference Understanding
Cheng Shi, Sibei Yang
Computer vision - ECCV ... : ... European Conference on Computer Vision : proceedings. European Conference on Computer Vision, pp. 201-218, 2023. DOI: 10.1007/978-3-031-20059-5_12
Citations: 5