Generalized Task-Driven Medical Image Quality Enhancement With Gradient Promotion

IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 18.6) | Pub Date: 2025-01-03 | DOI: 10.1109/TPAMI.2025.3525671

Dong Zhang; Kwang-Ting Cheng

Journal article, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 4, pp. 2785-2798. https://ieeexplore.ieee.org/document/10824866/


Abstract

Thanks to the recent achievements in task-driven image quality enhancement (IQE) models like ESTR (Liu et al. 2023), the image enhancement model and the visual recognition model can mutually enhance each other's quantitation while producing high-quality processed images that are perceivable by the human visual system. However, existing task-driven IQE models tend to overlook an underlying fact: different levels of vision tasks have varying, and sometimes conflicting, requirements for image features. To address this problem, this paper proposes a generalized gradient promotion (GradProm) training strategy for task-driven IQE of medical images. Specifically, we partition a task-driven IQE system into two sub-models: a mainstream model for image enhancement and an auxiliary model for visual recognition. During training, GradProm updates only the parameters of the image enhancement model. It uses the gradients of both the visual recognition model and the image enhancement model, but only when the gradients of the two sub-models are aligned in the same direction, as measured by their cosine similarity. When the gradients of the two sub-models are not aligned, GradProm uses only the gradient of the image enhancement model to update its parameters. Theoretically, we prove that the optimization direction of the image enhancement model is not biased by the auxiliary visual recognition model under GradProm. Empirically, extensive experiments on four public yet challenging medical image datasets demonstrate the superior performance of GradProm over existing state-of-the-art methods.
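The gating rule described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the learning rate, the similarity threshold of 0, and the choice to sum the two gradients when they are aligned are all assumptions made for illustration, since the abstract only states that both gradients are used when their cosine similarity indicates the same direction.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two flat gradient vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: a zero gradient is treated as "not aligned".
        return 0.0
    return dot / (norm_u * norm_v)


def gradprom_step(params, grad_enh, grad_rec, lr=0.5, threshold=0.0):
    """One gated gradient-descent update of the enhancement model's parameters.

    params   : enhancement-model parameters (flat list of floats)
    grad_enh : gradient of the enhancement loss w.r.t. params
    grad_rec : gradient of the recognition loss w.r.t. the same params
    lr, threshold, and the summation below are hypothetical choices.
    """
    if cosine_similarity(grad_enh, grad_rec) > threshold:
        # Aligned: let the auxiliary recognition gradient contribute.
        update = [ge + gr for ge, gr in zip(grad_enh, grad_rec)]
    else:
        # Conflicting: fall back to the enhancement gradient alone.
        update = list(grad_enh)
    return [p - lr * g for p, g in zip(params, update)]


if __name__ == "__main__":
    theta = [1.0, 1.0]
    # Aligned gradients (positive cosine): both gradients are used.
    print(gradprom_step(theta, [1.0, 0.0], [1.0, 1.0]))   # [0.0, 0.5]
    # Conflicting gradients (negative cosine): recognition gradient is ignored.
    print(gradprom_step(theta, [1.0, 0.0], [-1.0, 1.0]))  # [0.5, 1.0]
```

In the conflicting case the result is identical to a plain gradient step on the enhancement loss, which matches the abstract's claim that the auxiliary model does not bias the enhancement model's optimization direction.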


