
Latest publications from IEEE Transactions on Image Processing: a publication of the IEEE Signal Processing Society

TopicFM+: Boosting Accuracy and Efficiency of Topic-Assisted Feature Matching
Khang Truong Giang;Soohwan Song;Sungho Jo
This study tackles image matching in difficult scenarios, such as scenes with significant variations or limited texture, with a strong emphasis on computational efficiency. Previous studies have attempted to address this challenge by encoding global scene contexts using Transformers. However, these approaches have high computational costs and may not capture sufficient high-level contextual information, such as spatial structures or semantic shapes. To overcome these limitations, we propose a novel image-matching method that leverages a topic-modeling strategy to capture high-level contexts in images. Our method represents each image as a multinomial distribution over topics, where each topic represents semantic structures. By incorporating these topics, we can effectively capture comprehensive context information and obtain discriminative and high-quality features. Notably, our coarse-level matching network enhances efficiency by applying attention layers only to fixed-size topics and small-size features. Finally, we design a dynamic feature refinement network for precise results at a finer matching stage. Through extensive experiments, we have demonstrated the superiority of our method in challenging scenarios. Specifically, our method ranks in the top 9% in the Image Matching Challenge 2023 without using ensemble techniques. Additionally, we achieve an approximately 50% reduction in computational costs compared to other Transformer-based methods. Code is available at https://github.com/TruongKhang/TopicFM.
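A minimal sketch of the topic-assisted attention idea described above, assuming PyTorch, toy feature sizes, and K = 16 fixed topics; this illustrates the general mechanism (soft-assign local features to a small topic set, then attend only against that set) and is not the authors' TopicFM+ implementation, which is available in the linked repository.

import torch
import torch.nn.functional as F

def topic_assisted_features(feats, topic_emb):
    # feats: (N, C) coarse local features of one image; topic_emb: (K, C) learned topic embeddings.
    scale = feats.shape[-1] ** 0.5
    # Soft-assign each local feature to the K topics (a distribution over topics per region).
    assign = F.softmax(feats @ topic_emb.t() / scale, dim=-1)       # (N, K)
    topics = assign.t() @ feats                                     # (K, C) topic-aggregated context
    # Attention is computed only against the small, fixed-size topic set, not all N features.
    attn = F.softmax(feats @ topics.t() / scale, dim=-1)            # (N, K)
    return feats + attn @ topics                                    # context-augmented features

feats = torch.randn(4096, 256)       # hypothetical coarse feature map, flattened
topic_emb = torch.randn(16, 256)     # K = 16 topics (an assumption, not the paper's setting)
print(topic_assisted_features(feats, topic_emb).shape)   # torch.Size([4096, 256])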
DOI: 10.1109/TIP.2024.3473301 | Vol. 33, pp. 6016-6028 | Published: 2024-10-17 | Citations: 0
Searching Discriminative Regions for Convolutional Neural Networks in Fundus Image Classification With Genetic Algorithms
Yibiao Rong;Tian Lin;Haoyu Chen;Zhun Fan;Xinjian Chen
Deep convolutional neural networks (CNNs) have been widely used for fundus image classification and have achieved very impressive performance. However, the explainability of CNNs is poor because of their black-box nature, which limits their application in clinical practice. In this paper, we propose a novel method to search for discriminative regions to increase the confidence of CNNs when classifying features of a specific category, thereby helping users understand which regions in an image are important for a CNN to make a particular prediction. In the proposed method, a set of superpixels is selected in an evolutionary process, such that discriminative regions can be found automatically. Many experiments are conducted to verify the effectiveness of the proposed method. The average drop and average increase obtained with the proposed method are 0 and 77.8%, respectively, in fundus image classification, indicating that the proposed method is very effective in identifying discriminative regions. Additionally, several interesting findings are reported: 1) Some superpixels, which contain the evidence used by humans to make a certain decision in practice, can be identified as discriminative regions via the proposed method; 2) The superpixels identified as discriminative regions are distributed in different locations in an image rather than focusing on regions with a specific instance; and 3) The number of discriminative superpixels obtained via the proposed method is relatively small. In other words, a CNN model can employ a small portion of the pixels in an image to increase the confidence for a specific category.
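For intuition, an evolutionary search over superpixel subsets can be written as a small genetic-algorithm loop like the one below (NumPy). The scoring function is a stand-in for "CNN confidence on the image when only the selected superpixels are visible"; the population size, mutation rate, and toy fitness are assumptions, not the paper's configuration.

import numpy as np

rng = np.random.default_rng(0)

def fitness(mask, score_fn):
    # Confidence with only the selected superpixels kept, minus a small sparsity penalty.
    return score_fn(mask) - 0.01 * mask.sum()

def evolve_superpixel_mask(n_superpixels, score_fn, pop=30, gens=50, p_mut=0.05):
    population = rng.random((pop, n_superpixels)) < 0.5              # random binary masks
    for _ in range(gens):
        scores = np.array([fitness(ind, score_fn) for ind in population])
        parents = population[np.argsort(scores)[-pop // 2:]]         # keep the best half
        cuts = rng.integers(1, n_superpixels, size=pop // 2)         # one-point crossover
        kids = np.array([np.concatenate([a[:c], b[c:]])
                         for a, b, c in zip(parents, np.roll(parents, 1, axis=0), cuts)])
        kids ^= rng.random(kids.shape) < p_mut                       # bit-flip mutation
        population = np.vstack([parents, kids])
    return population[np.argmax([fitness(ind, score_fn) for ind in population])]

# Stand-in scorer: pretends the first five superpixels carry the class evidence.
toy_score = lambda m: float(m[:5].sum())
print(evolve_superpixel_mask(64, toy_score)[:10])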
DOI: 10.1109/TIP.2024.3477932 | Vol. 33, pp. 5949-5958 | Published: 2024-10-16 | Citations: 0
Improved MRF Reconstruction via Structure-Preserved Graph Embedding Framework
Peng Li;Yuping Ji;Yue Hu
Highly undersampled schemes in magnetic resonance fingerprinting (MRF) typically lead to aliasing artifacts in reconstructed images, thereby reducing quantitative imaging accuracy. Existing studies mainly focus on improving the reconstruction quality by incorporating temporal or spatial data priors. However, these methods seldom exploit the underlying MRF data structure driven by imaging physics and usually suffer from high computational complexity due to the high-dimensional nature of MRF data. In addition, data priors constructed in a pixel-wise manner struggle to incorporate non-local and non-linear correlations. To address these issues, we introduce a novel MRF reconstruction framework based on the graph embedding framework, exploiting non-linear and non-local redundancies in MRF data. Our work remodels MRF data and parameter maps as graph nodes, redefining the MRF reconstruction problem as a structure-preserved graph embedding problem. Furthermore, we propose a novel scheme for accurately estimating the underlying graph structure, demonstrating that the parameter nodes inherently form a low-dimensional representation of the high-dimensional MRF data nodes. The reconstruction framework is then built by preserving the intrinsic graph structure between MRF data nodes and parameter nodes and extended to exploiting the globality of graph structure. Our approach integrates the MRF data recovery and parameter map estimation into a single optimization problem, facilitating reconstructions geared toward quantitative accuracy. Moreover, by introducing graph representation, our methods substantially reduce the computational complexity, with the computational cost showing a minimal increase as the data acquisition length grows. Experiments show that the proposed method can reconstruct high-quality MRF data and multiple parameter maps within reduced computational time.
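The paper's structure-preserved graph embedding is tied to MRF imaging physics, but the general flavor of graph-regularized recovery can be shown with a toy least-squares problem: recover X from undersampled measurements Y = A X while keeping X smooth over an assumed chain graph. Everything below (sizes, the chain graph, the closed-form solve) is illustrative only and not the paper's model.

import numpy as np

rng = np.random.default_rng(1)
n, m, t = 64, 24, 8                                  # signal length, measurements, channels
A = rng.standard_normal((m, n))                      # toy undersampling operator
X_true = np.repeat(rng.standard_normal((n // 4, t)), 4, axis=0)   # piecewise-constant rows
Y = A @ X_true

W = np.diag(np.ones(n - 1), 1); W = W + W.T          # chain-graph adjacency
L = np.diag(W.sum(axis=1)) - W                       # graph Laplacian
lam = 5.0

# Closed-form minimizer of ||A X - Y||_F^2 + lam * tr(X^T L X)
X_hat = np.linalg.solve(A.T @ A + lam * L, A.T @ Y)
print(np.linalg.norm(X_hat - X_true) / np.linalg.norm(X_true))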
DOI: 10.1109/TIP.2024.3477980 | Vol. 33, pp. 5989-6001 | Published: 2024-10-16 | Citations: 0
DHM-Net: Deep Hypergraph Modeling for Robust Feature Matching
Shunxing Chen;Guobao Xiao;Junwen Guo;Qiangqiang Wu;Jiayi Ma
We present a novel deep hypergraph modeling architecture (called DHM-Net) for feature matching in this paper. Our network focuses on learning reliable correspondences between two sets of initial feature points by establishing a dynamic hypergraph structure that models group-wise relationships and assigns weights to each node. Compared to existing feature matching methods that only consider pair-wise relationships via a simple graph, our dynamic hypergraph is capable of modeling nonlinear higher-order group-wise relationships among correspondences in an interaction capturing and attention representation learning fashion. Specifically, we propose a novel Deep Hypergraph Modeling block, which initializes an overall hypergraph by utilizing neighbor information, and then adopts node-to-hyperedge and hyperedge-to-node strategies to propagate interaction information among correspondences while assigning weights based on hypergraph attention. In addition, we propose a Differentiation Correspondence-Aware Attention mechanism to optimize the hypergraph for promoting representation learning. The proposed mechanism is able to effectively locate the exact position of the object of importance via correspondence-aware encoding and a simple feature gating mechanism to distinguish inlier candidates. In short, we learn a dynamic hypergraph representation that embeds deep group-wise interactions to explicitly infer the categories of correspondences. To demonstrate the effectiveness of DHM-Net, we perform extensive experiments on both real-world outdoor and indoor datasets. Particularly, experimental results show that DHM-Net surpasses the state-of-the-art method by a sizable margin. Our approach obtains an 11.65% improvement under an error threshold of 5° for the relative pose estimation task on the YFCC100M dataset. Code will be released at https://github.com/CSX777/DHM-Net.
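A bare-bones sketch of the node-to-hyperedge and hyperedge-to-node propagation pattern, in PyTorch, using a random incidence matrix and a crude attention-style node weight. DHM-Net's actual blocks (dynamic hypergraph construction, the Differentiation Correspondence-Aware Attention) are more involved; see the linked repository.

import torch
import torch.nn.functional as F

def hypergraph_round(x, H):
    # x: (N, C) correspondence features; H: (N, E) incidence matrix (node n in hyperedge e).
    w = F.softmax(x @ x.mean(0), dim=0).unsqueeze(1)                     # per-node weight (assumed form)
    # node -> hyperedge: each hyperedge aggregates its weighted member nodes
    edge_feat = (H * w).t() @ x / (H.sum(0, keepdim=True).t() + 1e-6)    # (E, C)
    # hyperedge -> node: each node gathers the hyperedges it belongs to
    node_upd = H @ edge_feat / (H.sum(1, keepdim=True) + 1e-6)           # (N, C)
    return x + node_upd

x = torch.randn(500, 128)                      # 500 putative correspondences (toy)
H = (torch.rand(500, 8) < 0.25).float()        # 8 hypothetical hyperedges (groups)
print(hypergraph_round(x, H).shape)            # torch.Size([500, 128])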
DOI: 10.1109/TIP.2024.3477916 | Vol. 33, pp. 6002-6015 | Published: 2024-10-16 | Citations: 0
Hierarchical Graph Interaction Transformer With Dynamic Token Clustering for Camouflaged Object Detection
Siyuan Yao;Hao Sun;Tian-Zhu Xiang;Xiao Wang;Xiaochun Cao
Camouflaged object detection (COD) aims to identify the objects that seamlessly blend into the surrounding backgrounds. Due to the intrinsic similarity between the camouflaged objects and the background region, it is extremely challenging to precisely distinguish the camouflaged objects by existing approaches. In this paper, we propose a hierarchical graph interaction network termed HGINet for camouflaged object detection, which is capable of discovering imperceptible objects via effective graph interaction among the hierarchical tokenized features. Specifically, we first design a region-aware token focusing attention (RTFA) with dynamic token clustering to excavate the potentially distinguishable tokens in the local region. Afterwards, a hierarchical graph interaction transformer (HGIT) is proposed to construct bi-directional aligned communication between hierarchical features in the latent interaction space for visual semantics enhancement. Furthermore, we propose a decoder network with confidence aggregated feature fusion (CAFF) modules, which progressively fuses the hierarchical interacted features to refine the local detail in ambiguous regions. Extensive experiments conducted on the prevalent datasets, i.e. COD10K, CAMO, NC4K and CHAMELEON demonstrate the superior performance of HGINet compared to existing state-of-the-art methods. Our code is available at https://github.com/Garyson1204/HGINet.
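Dynamic token clustering can be approximated, for intuition, by a plain k-means pass over the tokenized features; the PyTorch snippet below does exactly that with assumed sizes and is not the RTFA module itself.

import torch

def cluster_tokens(tokens, k=16, iters=5):
    # tokens: (N, C). Returns k cluster centers and the per-token assignment.
    centers = tokens[torch.randperm(tokens.shape[0])[:k]].clone()
    for _ in range(iters):
        assign = torch.cdist(tokens, centers).argmin(dim=1)   # nearest center per token
        for c in range(k):
            members = tokens[assign == c]
            if len(members) > 0:
                centers[c] = members.mean(0)
    return centers, assign

tokens = torch.randn(1024, 64)            # hypothetical tokenized features of one scale
centers, assign = cluster_tokens(tokens)
print(centers.shape, assign.shape)        # torch.Size([16, 64]) torch.Size([1024])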
DOI: 10.1109/TIP.2024.3475219 | Vol. 33, pp. 5936-5948 | Published: 2024-10-15 | Citations: 0
Explicitly-Decoupled Text Transfer With Minimized Background Reconstruction for Scene Text Editing
Jianqun Zhou;Pengwen Dai;Yang Li;Manjiang Hu;Xiaochun Cao
Scene text editing aims to replace the source text with the target text while preserving the original background. Its practical applications span various domains, such as data generation and privacy protection, highlighting its increasing importance in recent years. In this study, we propose a novel Scene Text Editing network with Explicitly-decoupled text transfer and Minimized background reconstruction, called STEEM. Unlike existing methods that usually fuse text style, text content, and background, our approach focuses on decoupling text style and content from the background and utilizes the minimized background reconstruction to reduce the impact of text replacement on the background. Specifically, the text-background separation module predicts the text mask of the scene text image, separating the source text from the background. Subsequently, the style-guided text transfer decoding module transfers the geometric and stylistic attributes of the source text to the content text, resulting in the target text. Next, the background and target text are combined to determine the minimal reconstruction area. Finally, the context-focused background reconstruction module is applied to the reconstruction area, producing the editing result. Furthermore, to ensure stable joint optimization of the four modules, a task-adaptive training optimization strategy has been devised. Experimental evaluations conducted on two popular datasets demonstrate the effectiveness of our approach. STEEM outperforms state-of-the-art methods, as evidenced by a reduction in the FID index from 29.48 to 24.67 and an increase in text recognition accuracy from 76.8% to 78.8%.
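One way to picture the minimized-reconstruction idea: only pixels covered by the erased source text or the incoming target text need to be re-synthesized, and everything else keeps the original background. The toy NumPy function below computes a tight box around that union from two hypothetical text masks; STEEM's actual reconstruction area and CAFF modules operate on learned features, not a simple box.

import numpy as np

def minimal_reconstruction_area(src_text_mask, tgt_text_mask):
    # Union of the source-text and target-text masks defines the only region to re-synthesize.
    union = src_text_mask | tgt_text_mask
    ys, xs = np.nonzero(union)
    if len(ys) == 0:
        return None
    return (ys.min(), ys.max() + 1, xs.min(), xs.max() + 1)   # (top, bottom, left, right)

src = np.zeros((64, 256), bool); src[20:40, 30:120] = True    # hypothetical source-text mask
tgt = np.zeros((64, 256), bool); tgt[22:42, 30:150] = True    # hypothetical target-text mask
print(minimal_reconstruction_area(src, tgt))                  # (20, 42, 30, 150)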
DOI: 10.1109/TIP.2024.3477355 | Vol. 33, pp. 5921-5935 | Published: 2024-10-15 | Citations: 0
Injecting Text Clues for Improving Anomalous Event Detection From Weakly Labeled Videos
Tianshan Liu;Kin-Man Lam;Bing-Kun Bao
Video anomaly detection (VAD) aims at localizing the snippets containing anomalous events in long unconstrained videos. The weakly supervised (WS) setting, where solely video-level labels are available during training, has attracted considerable attention, owing to its satisfactory trade-off between the detection performance and annotation cost. However, due to the lack of snippet-level dense labels, the existing WS-VAD methods remain prone to detection errors caused by false alarms and incomplete localization. To address this dilemma, in this paper, we propose to inject text clues of anomaly-event categories for improving WS-VAD, via a dedicated dual-branch framework. To suppress the response of confusing normal contexts, we first present a text-guided anomaly discovering (TAG) branch based on a hierarchical matching scheme, which utilizes the label-text queries to search the discriminative anomalous snippets in a global-to-local fashion. To facilitate the completeness of anomaly-instance localization, an anomaly-conditioned text completion (ATC) branch is further designed to perform an auxiliary generative task, which intrinsically forces the model to gather sufficient event semantics from all the relevant anomalous snippets for completely reconstructing the masked description sentence. Furthermore, to encourage cross-branch knowledge sharing, a mutual learning strategy is introduced by imposing a consistency constraint on the anomaly scores of these two branches. Extensive experimental results on two public benchmarks validate that the proposed method achieves superior performance over the competing methods.
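The mutual-learning consistency constraint between the two branches can be written as a symmetric regularizer on their per-snippet anomaly scores. The PyTorch lines below show one plausible form (MSE with a stop-gradient on the opposite branch); the exact loss used in the paper may differ.

import torch
import torch.nn.functional as F

def mutual_consistency_loss(scores_tag, scores_atc):
    # scores_*: (T,) per-snippet anomaly scores from the TAG and ATC branches.
    return 0.5 * (F.mse_loss(scores_tag, scores_atc.detach())
                  + F.mse_loss(scores_atc, scores_tag.detach()))

s_tag = torch.rand(32, requires_grad=True)    # toy scores from the text-guided branch
s_atc = torch.rand(32, requires_grad=True)    # toy scores from the text-completion branch
loss = mutual_consistency_loss(s_tag, s_atc)
loss.backward()
print(float(loss))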
DOI: 10.1109/TIP.2024.3477351 | Vol. 33, pp. 5907-5920 | Published: 2024-10-15 | Citations: 0
Cross-Scope Spatial-Spectral Information Aggregation for Hyperspectral Image Super-Resolution
Shi Chen;Lefei Zhang;Liangpei Zhang
Hyperspectral image super-resolution has attracted widespread attention as a means of enhancing the spatial resolution of hyperspectral images. However, convolution-based methods have encountered challenges in harnessing the global spatial-spectral information. The prevailing transformer-based methods have not adequately captured the long-range dependencies in both spectral and spatial dimensions. To alleviate this issue, we propose a novel cross-scope spatial-spectral Transformer (CST) to efficiently investigate long-range spatial and spectral similarities for single hyperspectral image super-resolution. Specifically, we devise cross-attention mechanisms in spatial and spectral dimensions to comprehensively model the long-range spatial-spectral characteristics. By integrating global information into the rectangle-window self-attention, we first design a cross-scope spatial self-attention to facilitate long-range spatial interactions. Then, by leveraging appropriately characteristic spatial-spectral features, we construct a cross-scope spectral self-attention to effectively capture the intrinsic correlations among global spectral bands. Finally, we elaborate a concise feed-forward neural network to enhance the feature representation capacity in the Transformer structure. Extensive experiments over three hyperspectral datasets demonstrate that the proposed CST is superior to other state-of-the-art methods both quantitatively and visually. The code is available at https://github.com/Tomchenshi/CST.git.
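To make spectral self-attention concrete: treat each spectral band as a token and attend across bands, so the attention map is only C x C regardless of the spatial size. A minimal PyTorch sketch with toy dimensions follows; it is not the CST module, which additionally uses cross-scope aggregation and rectangle-window spatial attention.

import torch
import torch.nn.functional as F

def spectral_self_attention(x):
    # x: (B, H*W, C). Tokens are the C spectral bands; the spatial dimension acts as the channel.
    q = k = v = x.transpose(1, 2)                                            # (B, C, H*W)
    attn = F.softmax(q @ k.transpose(1, 2) / q.shape[-1] ** 0.5, dim=-1)     # (B, C, C)
    return (attn @ v).transpose(1, 2)                                        # back to (B, H*W, C)

x = torch.randn(2, 32 * 32, 31)            # 31 bands on a 32x32 grid (toy sizes)
print(spectral_self_attention(x).shape)    # torch.Size([2, 1024, 31])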
DOI: 10.1109/TIP.2024.3468905 | Vol. 33, pp. 5878-5891 | Published: 2024-10-15 | Citations: 0
GRiD: Guided Refinement for Detector-Free Multimodal Image Matching
Yuyan Liu;Wei He;Hongyan Zhang
Multimodal image matching is essential in image stitching, image fusion, change detection, and land cover mapping. However, the severe nonlinear radiometric distortion (NRD) and geometric distortions in multimodal images severely limit the accuracy of multimodal image matching, posing significant challenges to existing methods. Additionally, detector-based methods are prone to feature point offset issues in regions with substantial modal differences, which also hinder the subsequent fine registration and fusion of images. To address these challenges, we propose a guided refinement for detector-free multimodal image matching (GRiD) method, which weakens feature point offset issues by establishing pixel-level correspondences and utilizes reference points to guide and correct matches affected by NRD and geometric distortions. Specifically, we first introduce a detector-free framework to alleviate the feature point offset problem by directly finding corresponding pixels between images. Subsequently, to tackle NRD and geometric distortion in multimodal images, we design a guided correction module that establishes robust reference points (RPs) to guide the search for corresponding pixels in regions with significant modality differences. Moreover, to enhance RPs reliability, we incorporate a phase congruency module during the RPs confirmation stage to concentrate RPs around image edge structures. Finally, we perform finer localization on highly correlated corresponding pixels to obtain the optimized matches. We conduct extensive experiments on four multimodal image datasets to validate the effectiveness of the proposed approach. Experimental results demonstrate that our method can achieve sufficient and robust matches across various modality images and effectively suppress the feature point offset problem.
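The detector-free starting point, finding corresponding pixels directly from dense descriptors via mutual nearest neighbors, looks roughly like the PyTorch snippet below (random toy descriptors, an arbitrary similarity threshold). GRiD's reference-point guidance and phase-congruency refinement then act on top of such candidate matches.

import torch
import torch.nn.functional as F

def dense_mutual_matches(feat_a, feat_b, thr=0.2):
    # feat_*: (N, C) L2-normalized dense (per-pixel) descriptors of the two images.
    sim = feat_a @ feat_b.t()                             # all-pairs similarity
    ab = sim.argmax(dim=1)                                # best b for every a
    ba = sim.argmax(dim=0)                                # best a for every b
    idx_a = torch.arange(sim.shape[0])
    keep = (ba[ab] == idx_a) & (sim[idx_a, ab] > thr)     # mutual nearest neighbors above threshold
    return torch.stack([idx_a[keep], ab[keep]], dim=1)    # (M, 2) pixel-index pairs

fa = F.normalize(torch.randn(2000, 128), dim=1)
fb = F.normalize(torch.randn(2000, 128), dim=1)
print(dense_mutual_matches(fa, fb).shape)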
DOI: 10.1109/TIP.2024.3472491 | Vol. 33, pp. 5892-5906 | Published: 2024-10-11 | Citations: 0
MLFA: Toward Realistic Test Time Adaptive Object Detection by Multi-Level Feature Alignment
Yabo Liu;Jinghua Wang;Chao Huang;Yiling Wu;Yong Xu;Xiaochun Cao
Object detection methods have achieved remarkable performance when the training and testing data satisfy the i.i.d. assumption. However, the training and testing data may be collected from different domains, and the gap between the domains can significantly degrade the detectors. Test Time Adaptive Object Detection (TTA-OD) is a novel online approach that aims to adapt detectors quickly and make predictions during the testing procedure. TTA-OD is more realistic than the existing unsupervised domain adaptation and source-free unsupervised domain adaptation approaches. For example, self-driving cars need to improve their perception of new environments in the TTA-OD paradigm during driving. To address this, we propose a multi-level feature alignment (MLFA) method for TTA-OD, which is able to adapt the model online based on the streaming target domain data. For a more straightforward adaptation, we select informative foreground and background features from image feature maps and capture their distributions using probabilistic models. Our approach includes: i) global-level feature alignment to align all informative feature distributions, thereby encouraging detectors to extract domain-invariant features, and ii) cluster-level feature alignment to match feature distributions for each category cluster across different domains. Through the multi-level alignment, we can prompt detectors to extract domain-invariant features, as well as align the category-specific components of image features from distinct domains. We conduct extensive experiments to verify the effectiveness of our proposed method. Our code is accessible at https://github.com/yaboliudotug/MLFA.
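Global-level alignment of two feature distributions can be illustrated by fitting a diagonal Gaussian to each and penalizing their KL divergence, as in the toy PyTorch snippet below. The per-dimension Gaussian assumption, the sizes, and the KL direction are choices made here for illustration, not MLFA's exact probabilistic model.

import torch

def gaussian_kl(mu_s, var_s, mu_t, var_t):
    # KL( N(mu_t, var_t) || N(mu_s, var_s) ), summed over feature dimensions.
    return 0.5 * (var_t / var_s + (mu_s - mu_t) ** 2 / var_s - 1 + torch.log(var_s / var_t)).sum()

src_feats = torch.randn(4096, 256) * 1.5 + 0.3     # stand-in for stored source-style features
tgt_feats = torch.randn(4096, 256)                 # stand-in for streaming target-domain features
loss = gaussian_kl(src_feats.mean(0), src_feats.var(0),
                   tgt_feats.mean(0), tgt_feats.var(0))
print(float(loss))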
DOI: 10.1109/TIP.2024.3473532 | Vol. 33, pp. 5837-5848 | Published: 2024-10-09 | Citations: 0