
Image and Vision Computing: Latest Publications

Search and recovery network for camouflaged object detection
IF 4.2 | CAS Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-01 | DOI: 10.1016/j.imavis.2024.105247
Guangrui Liu, Wei Wu

Camouflaged object detection aims to accurately identify objects blending into the background. However, existing methods often struggle, especially with small objects or multiple objects, due to their reliance on singular strategies. To address this, we introduce a novel Search and Recovery Network (SRNet) using a bionic approach and auxiliary features. SRNet comprises three key modules: the Region Search Module (RSM), Boundary Recovery Module (BRM), and Camouflaged Object Predictor (COP). The RSM mimics predator behavior to locate potential object regions, enhancing object localization. The BRM refines texture features and recovers object boundaries. The COP fuses multilevel features to predict the final segmentation maps. Experimental results on three benchmark datasets show SRNet's superiority over state-of-the-art models, particularly with small and multiple objects. Notably, SRNet achieves these performance improvements without significantly increasing the number of model parameters. Moreover, the method exhibits promising performance in downstream tasks such as defect detection, polyp segmentation and military camouflage detection.
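As a rough illustration of the kind of multilevel feature fusion the COP is described as performing, the sketch below upsamples and concatenates features from several backbone levels and predicts a per-pixel object probability map. The module layout, channel sizes, and fusion choices are assumptions for illustration, not the authors' implementation.

```python
# Minimal multilevel-fusion predictor sketch (assumed layout, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionPredictor(nn.Module):
    """Fuses feature maps from several backbone levels into one object-probability map."""
    def __init__(self, in_channels=(256, 512, 1024), mid_channels=64):
        super().__init__()
        # 1x1 convolutions project every level to a common channel width.
        self.reduce = nn.ModuleList([nn.Conv2d(c, mid_channels, 1) for c in in_channels])
        self.predict = nn.Conv2d(mid_channels * len(in_channels), 1, 3, padding=1)

    def forward(self, feats):
        target_size = feats[0].shape[-2:]  # spatial size of the finest level
        fused = [F.interpolate(r(f), size=target_size, mode="bilinear", align_corners=False)
                 for r, f in zip(self.reduce, feats)]
        return torch.sigmoid(self.predict(torch.cat(fused, dim=1)))  # per-pixel probability

# Toy usage with random multilevel features.
feats = [torch.randn(1, 256, 88, 88), torch.randn(1, 512, 44, 44), torch.randn(1, 1024, 22, 22)]
print(FusionPredictor()(feats).shape)  # torch.Size([1, 1, 88, 88])
```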

Image and Vision Computing, Volume 151, Article 105247.
Citations: 0
GSTGM: Graph, spatial–temporal attention and generative based model for pedestrian multi-path prediction
IF 4.2 | CAS Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-31 | DOI: 10.1016/j.imavis.2024.105245
Muhammad Haris Kaka Khel, Paul Greaney, Marion McAfee, Sandra Moffett, Kevin Meehan

Pedestrian trajectory prediction in urban environments has emerged as a critical research area with extensive applications across various domains. Accurate prediction of pedestrian trajectories is essential for the safe navigation of autonomous vehicles and robots in pedestrian-populated environments. Effective prediction models must capture both the spatial interactions among pedestrians and the temporal dependencies governing their movements. Existing research primarily focuses on forecasting a single trajectory per pedestrian, limiting its applicability in real-world scenarios characterised by diverse and unpredictable pedestrian behaviours. To address these challenges, this paper introduces the Graph Convolutional Network, Spatial–Temporal Attention, and Generative Model (GSTGM) for pedestrian trajectory prediction. GSTGM employs a spatiotemporal graph convolutional network to effectively capture complex interactions between pedestrians and their environment. Additionally, it integrates a spatial–temporal attention mechanism to prioritise relevant information during the prediction process. By incorporating a time-dependent prior within the latent space and utilising a computationally efficient generative model, GSTGM facilitates the generation of diverse and realistic future trajectories. The effectiveness of GSTGM is validated through experiments on real-world scenario datasets. Compared to the state-of-the-art models on benchmark datasets such as ETH/UCY, GSTGM demonstrates superior performance in accurately predicting multiple potential trajectories for individual pedestrians. This superiority is measured using metrics such as Final Displacement Error (FDE) and Average Displacement Error (ADE). Moreover, GSTGM achieves these results with significantly faster processing speeds.
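For readers unfamiliar with the evaluation protocol, the following sketch computes the ADE and FDE metrics cited above under the common best-of-K convention for multi-path predictors; the best-of-K choice is an assumption here, not a detail stated in the abstract.

```python
# ADE/FDE metric sketch for a multi-path trajectory predictor (best-of-K assumed).
import numpy as np

def ade_fde(pred, gt):
    """pred: (K, T, 2) candidate trajectories, gt: (T, 2) ground truth.
    Returns the minimum Average and Final Displacement Error over the K samples."""
    dists = np.linalg.norm(pred - gt[None], axis=-1)   # (K, T) per-step Euclidean error
    ade = dists.mean(axis=1).min()                     # best average error over the horizon
    fde = dists[:, -1].min()                           # best error at the final time step
    return ade, fde

pred = np.random.randn(20, 12, 2)   # 20 sampled futures, 12 steps, (x, y)
gt = np.random.randn(12, 2)
print(ade_fde(pred, gt))
```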

Image and Vision Computing, Volume 151, Article 105245 (open access).
Citations: 0
Probability based dynamic soft label assignment for object detection
IF 4.2 | CAS Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-31 | DOI: 10.1016/j.imavis.2024.105240
Yi Li, Sile Ma, Xiangyuan Jiang, Yizhong Luan, Zecui Jiang

By defining effective supervision labels for network training, the performance of object detectors can be improved without incurring additional inference costs. Current label assignment strategies generally require two steps: first, constructing a candidate bag of positive samples, and then designing labels for these samples. However, constructing the candidate bag of positive samples may introduce noisy samples into the label assignment process. We explore a single-step label assignment approach: directly generating a probability map as labels for all samples. We design the label assignment approach from two perspectives. First, it should reduce the impact of noisy samples. Second, each sample should be treated differently, because each one matches the target to a different extent, which helps the network learn more valuable information from high-quality samples. We propose a probability-based dynamic soft label assignment method. Instead of dividing the samples into positive and negative samples, a probability map, computed from prediction quality and prior knowledge, is used to supervise all anchor points of the classification branch. The weight of prior knowledge in the labels decreases as the network improves the quality of its instance predictions, which reduces the noisy samples introduced by prior knowledge. By using continuous probability values as labels to supervise the classification branch, the network is able to focus on high-quality samples. As demonstrated in experiments on the MS COCO benchmark, our label assignment method achieves 40.9% AP with a ResNet-50 backbone under the 1× schedule, improving FCOS performance by approximately 2.0% AP. The code is available at https://github.com/Liyi4578/PDSLA.
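A minimal sketch of the soft-label idea described above, assuming the label blends a location prior with prediction quality and that the prior's weight decays as the detector's average prediction quality improves; the exact blending rule and schedule in the paper may differ.

```python
# Probability-style soft labels blending a prior with prediction quality (assumed rule).
import numpy as np

def soft_labels(prior, pred_quality, mean_quality):
    """prior, pred_quality: (N,) per-anchor scores in [0, 1].
    mean_quality: scalar tracking how good the detector's predictions currently are."""
    alpha = 1.0 - np.clip(mean_quality, 0.0, 1.0)    # prior weight shrinks as predictions improve
    labels = alpha * prior + (1.0 - alpha) * pred_quality
    return np.clip(labels, 0.0, 1.0)                 # continuous targets for the classification branch

prior = np.array([0.9, 0.4, 0.1])          # e.g. centerness-style prior per anchor
pred_quality = np.array([0.7, 0.8, 0.05])  # e.g. predicted IoU per anchor
print(soft_labels(prior, pred_quality, mean_quality=0.3))
```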

Image and Vision Computing, Volume 150, Article 105240.
Citations: 0
CRENet: Crowd region enhancement network for multi-person 3D pose estimation
IF 4.2 | CAS Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-30 | DOI: 10.1016/j.imavis.2024.105243
Zhaokun Li, Qiong Liu

Recovering multi-person 3D poses from a single image is a challenging problem due to inherent depth ambiguities, including root-relative depth and absolute root depth. Current bottom-up methods show promising potential to mitigate absolute root depth ambiguity by explicitly aggregating global contextual cues. However, these methods treat the entire image region equally during root depth regression, ignoring the negative impact of irrelevant regions. Moreover, they learn shared features for both depths, each of which focuses on different information. This sharing mechanism may result in negative transfer, thus diminishing root depth prediction accuracy. To address these challenges, we present a novel bottom-up method, the Crowd Region Enhancement Network (CRENet), incorporating a Feature Decoupling Module (FDM) and a Global Attention Module (GAM). The FDM explicitly learns a discriminative feature for each depth by adaptively recalibrating its channel-wise responses and fusing multi-level features, which makes the model focus on each depth prediction separately and thus avoids the adverse effect of negative transfer. The GAM highlights crowd regions while suppressing irrelevant regions using the attention mechanism and further refines the attention regions based on a confidence measure of the attention, which helps the model learn depth-related cues from informative crowd regions and facilitates root depth estimation. Comprehensive experiments on the MuPoTS-3D and CMU Panoptic benchmarks demonstrate that our method outperforms state-of-the-art bottom-up methods in absolute 3D pose estimation and is applicable to in-the-wild images, which also indicates that learning depth-specific features and suppressing noise signals can significantly benefit multi-person absolute 3D pose estimation.
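The channel-wise recalibration attributed to the FDM can be illustrated with a squeeze-and-excitation-style block, as sketched below; this layout is an assumption rather than the authors' exact design.

```python
# Channel-wise recalibration sketch in the spirit of the FDM description (assumed SE-style layout).
import torch
import torch.nn as nn

class ChannelRecalibration(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                    # x: (B, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))      # global average pool -> per-channel weights
        return x * w[:, :, None, None]       # re-weight channels for this depth-specific branch

print(ChannelRecalibration(64)(torch.randn(2, 64, 32, 32)).shape)  # torch.Size([2, 64, 32, 32])
```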

Image and Vision Computing, Volume 151, Article 105243.
Citations: 0
Dual subspace clustering for spectral-spatial hyperspectral image clustering
IF 4.2 | CAS Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-28 | DOI: 10.1016/j.imavis.2024.105235
Shujun Liu

Subspace clustering supposes that hyperspectral image (HSI) pixels lie in a union of multiple sample subspaces without considering their dual space, i.e., the spectral space. In this article, we propose a promising dual subspace clustering (DualSC) method for improving spectral-spatial HSI clustering by relaxing subspace clustering. To this end, DualSC simultaneously optimizes row and column subspace representations of HSI superpixels to capture the intrinsic connection between spectral and spatial information. From this new perspective, the original subspace clustering can be treated as a special case of DualSC, which has a larger solution space and thus tends to find a better sample representation matrix for applying spectral clustering. Besides, we provide theoretical proofs showing that the proposed method relaxes subspace clustering through the dual subspace and can recover subspace-sparse representations of HSI samples. To the best of our knowledge, this work could be one of the first dual clustering methods leveraging sample and spectral subspaces simultaneously. We conduct clustering experiments on four canonical data sets, showing that our proposed method, with strong interpretability, reaches performance and computing efficiency comparable to other state-of-the-art methods.
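A compact sketch of self-expressive subspace clustering applied to both the sample (column) and spectral (row) views of an HSI matrix is given below; the ridge-regression closed form and the use of only the sample affinity for the final clustering are illustrative simplifications, since DualSC optimizes the two representations jointly.

```python
# Self-expressive subspace clustering sketch on sample and spectral views (simplified).
import numpy as np
from sklearn.cluster import SpectralClustering

def self_representation(X, lam=0.1):
    """Solve min_C ||X - X C||^2 + lam ||C||^2 in closed form; columns of X are samples."""
    n = X.shape[1]
    C = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ X)
    return np.abs(C) + np.abs(C).T           # symmetric non-negative affinity

X = np.random.rand(30, 200)                  # 30 spectral bands x 200 superpixels
W_samples = self_representation(X)           # column (sample) subspace view, 200 x 200
W_bands = self_representation(X.T)           # row (spectral) subspace view, 30 x 30
# DualSC optimizes both views jointly; here only the sample affinity drives the clustering.
labels = SpectralClustering(n_clusters=5, affinity="precomputed").fit_predict(W_samples)
print(labels[:10])
```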

Image and Vision Computing, Volume 150, Article 105235.
Citations: 0
Pro-ReID: Producing reliable pseudo labels for unsupervised person re-identification
IF 4.2 | CAS Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-28 | DOI: 10.1016/j.imavis.2024.105244
Haiming Sun, Shiwei Ma

Mainstream unsupervised person re-identification (ReID) alternates between clustering and fine-tuning to improve task performance, but the clustering process inevitably produces noisy pseudo labels, which seriously constrains further performance gains. To address these concerns, this work proposes the novel Pro-ReID framework, which produces reliable person samples from the pseudo-labeled dataset for learning feature representations. It consists of two modules: Pseudo Labels Correction (PLC) and Pseudo Labels Selection (PLS). Specifically, we further leverage temporal-ensemble prior knowledge to improve task performance. The PLC module assigns corresponding soft pseudo labels to each sample, with control over soft pseudo label participation, to potentially correct noisy pseudo labels generated during clustering; the PLS module associates the predictions of the temporal ensemble model with pseudo label annotations and detects noisy pseudo-label examples as out-of-distribution examples through a Gaussian Mixture Model (GMM) fitted to their loss distribution, supplying reliable pseudo labels for the unsupervised person ReID task. Experimental findings on three person ReID benchmarks (Market-1501, DukeMTMC-reID and MSMT17) and one vehicle ReID benchmark (VeRi-776) establish that the novel Pro-ReID framework achieves competitive performance; in particular, its mAP on the challenging MSMT17 is 4.3% higher than state-of-the-art methods.
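The GMM-based selection of reliable pseudo labels can be sketched as fitting a two-component mixture to per-sample losses and keeping samples assigned to the low-loss component, as below; the probability threshold and one-dimensional loss input are assumptions.

```python
# GMM-based noisy pseudo-label detection sketch: keep samples likely from the low-loss component.
import numpy as np
from sklearn.mixture import GaussianMixture

def select_reliable(losses, keep_prob=0.5):
    losses = losses.reshape(-1, 1)
    gmm = GaussianMixture(n_components=2, random_state=0).fit(losses)
    clean = np.argmin(gmm.means_.ravel())            # component with the smaller mean loss
    p_clean = gmm.predict_proba(losses)[:, clean]    # posterior of being a clean sample
    return p_clean > keep_prob                       # boolean mask of reliable pseudo labels

losses = np.concatenate([np.random.normal(0.2, 0.05, 800),   # mostly clean samples
                         np.random.normal(1.5, 0.3, 200)])   # noisy tail
print(select_reliable(losses).sum(), "samples kept")
```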

Image and Vision Computing, Volume 150, Article 105244.
Citations: 0
Language conditioned multi-scale visual attention networks for visual grounding
IF 4.2 | CAS Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-25 | DOI: 10.1016/j.imavis.2024.105242
Haibo Yao, Lipeng Wang, Chengtao Cai, Wei Wang, Zhi Zhang, Xiaobing Shang

Visual grounding (VG) is a task that requires locating a specific region in an image according to a natural language expression. Existing efforts on the VG task are divided into two-stage, one-stage and Transformer-based methods, which have achieved high performance. However, most previous methods extract visual information at a single spatial scale and ignore visual information at other spatial scales, which prevents these models from fully utilizing the visual information. Moreover, the insufficient utilization of linguistic information, especially the failure to capture global linguistic information, may lead to incomplete understanding of language expressions, thus limiting the performance of these models. To better address the task, we propose a language conditioned multi-scale visual attention network (LMSVA) for visual grounding, which can sufficiently utilize visual and linguistic information to perform multimodal reasoning, thus improving model performance. Specifically, we design a visual feature extractor containing a multi-scale layer to obtain the required multi-scale visual features by expanding the original backbone. Moreover, we pool the output of the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model to extract sentence-level linguistic features, which enables the model to capture global linguistic information. Inspired by the Transformer architecture, we present the Visual Attention Layer guided by Language and Multi-Scale Visual Features (VALMS), which is able to better learn the visual context guided by multi-scale visual and linguistic features and facilitates further multimodal reasoning. Extensive experiments on four large benchmark datasets, including ReferItGame, RefCOCO, RefCOCO+ and RefCOCOg, demonstrate that our proposed model achieves state-of-the-art performance.
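One plausible reading of "pooling the output of the pre-trained BERT model" is masked mean pooling over token-level hidden states, sketched below; the exact pooling operator used in the paper is not specified in the abstract.

```python
# Masked mean pooling of token-level features into a sentence-level vector (assumed pooling choice).
import torch

def sentence_pool(token_feats, attention_mask):
    """token_feats: (B, L, D) BERT hidden states; attention_mask: (B, L) with 1 for real tokens."""
    mask = attention_mask.unsqueeze(-1).float()            # (B, L, 1)
    summed = (token_feats * mask).sum(dim=1)               # ignore padding positions
    return summed / mask.sum(dim=1).clamp(min=1e-6)        # (B, D) sentence-level feature

feats = torch.randn(2, 20, 768)
mask = torch.ones(2, 20); mask[1, 12:] = 0                 # second sentence is shorter
print(sentence_pool(feats, mask).shape)                    # torch.Size([2, 768])
```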

Image and Vision Computing, Volume 150, Article 105242.
Citations: 0
Learning facial structural dependency in 3D aligned space for face alignment
IF 4.2 | CAS Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-23 | DOI: 10.1016/j.imavis.2024.105241
Biying Li, Zhiwei Liu, Jinqiao Wang

Facial structure's statistical characteristics offer pivotal prior information in facial landmark prediction, forming inter-dependencies among different landmarks. Such inter-dependencies ensure that predictions adhere to the shape distribution typical of natural faces. In challenging scenarios like occlusions or extreme facial poses, this structure becomes indispensable, as it helps predict elusive landmarks from more discernible ones. While current deep learning methods do capture these landmark dependencies, it is often an implicit process heavily reliant on vast training datasets. We contend that such implicit modeling approaches fail to manage more challenging situations. In this paper, we propose a new method that harnesses the facial structure and explicitly explores inter-dependencies among facial landmarks in an end-to-end fashion. We propose a Structural Dependency Learning Module (SDLM). It uses 3D face information to map facial features into a canonical UV space, in which the facial structure is explicitly 3D semantically aligned. Besides, to explore the global relationships between facial landmarks, we take advantage of the self-attention mechanism in the image and UV spaces. We name the proposed method Facial Structure-based Face Alignment (FSFA). FSFA reinforces the landmark structure, especially under challenging conditions. Extensive experiments demonstrate that FSFA achieves state-of-the-art performance on the WFLW, 300W, AFLW, and COFW68 datasets.
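Modeling global relationships between landmarks with self-attention can be illustrated by treating each landmark as a token, as in the brief sketch below; the token construction and dimensions are assumptions, not the authors' code.

```python
# Self-attention over per-landmark feature tokens to model global landmark dependencies (assumed setup).
import torch
import torch.nn as nn

landmarks, dim = 68, 256
attn = nn.MultiheadAttention(embed_dim=dim, num_heads=8, batch_first=True)
tokens = torch.randn(4, landmarks, dim)         # one feature vector per landmark
out, weights = attn(tokens, tokens, tokens)     # each landmark attends to all others
print(out.shape, weights.shape)                 # torch.Size([4, 68, 256]) torch.Size([4, 68, 68])
```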

Image and Vision Computing, Volume 150, Article 105241.
Citations: 0
Learning accurate monocular 3D voxel representation via bilateral voxel transformer
IF 4.2 | CAS Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-23 | DOI: 10.1016/j.imavis.2024.105237
Tianheng Cheng, Haoyi Jiang, Shaoyu Chen, Bencheng Liao, Qian Zhang, Wenyu Liu, Xinggang Wang

Vision-based methods for 3D scene perception have been widely explored for autonomous vehicles. However, inferring complete 3D semantic scenes from monocular 2D images is still challenging owing to the 2D-to-3D transformation. Specifically, existing methods that use Inverse Perspective Mapping (IPM) to project image features to dense 3D voxels severely suffer from the ambiguous projection problem. In this research, we present Bilateral Voxel Transformer (BVT), a novel and effective Transformer-based approach for monocular 3D semantic scene completion. BVT exploits a bilateral architecture composed of two branches for preserving the high-resolution 3D voxel representation while aggregating contexts through the proposed Tri-Axial Transformer simultaneously. To alleviate the ill-posed 2D-to-3D transformation, we adopt position-aware voxel queries and dynamically update the voxels with image features through weighted geometry-aware sampling. BVT achieves 11.8 mIoU on the challenging Semantic KITTI dataset, considerably outperforming previous works for semantic scene completion with monocular images. The code and models of BVT will be available on GitHub.
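One way to picture position-aware voxel queries gathering image evidence is a cross-attention step from learnable voxel queries to flattened image features, sketched below; BVT's actual Tri-Axial Transformer and weighted geometry-aware sampling are more elaborate, so this is only a loose analogy with assumed shapes.

```python
# Cross-attention from learnable voxel queries to flattened image features (loose analogy, assumed shapes).
import torch
import torch.nn as nn

B, n_voxels, n_pix, dim = 1, 512, 900, 128
voxel_queries = nn.Parameter(torch.randn(1, n_voxels, dim))       # position-aware voxel queries
image_feats = torch.randn(B, n_pix, dim)                          # flattened 2D image features
cross_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

voxel_feats, _ = cross_attn(query=voxel_queries.expand(B, -1, -1),
                            key=image_feats, value=image_feats)   # lift 2D features into voxels
print(voxel_feats.shape)                                          # torch.Size([1, 512, 128])
```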

Image and Vision Computing, Volume 150, Article 105237.
Citations: 0
Simultaneous image patch attention and pruning for patch selective transformer
IF 4.2 | CAS Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-22 | DOI: 10.1016/j.imavis.2024.105239
Sunpil Kim, Gang-Joon Yoon, Jinjoo Song, Sang Min Yoon

Vision transformer models provide superior performance compared to convolutional neural networks for various computer vision tasks but require increased computational overhead with large datasets. This paper proposes a patch selective vision transformer that effectively selects patches to reduce computational costs while simultaneously extracting global and local self-representative patch information to maintain performance. The inter-patch attention in the transformer encoder emphasizes meaningful features by capturing the inter-patch relationships of features, and dynamic patch pruning is applied to the attentive patches using a learnable soft threshold that measures the maximum multi-head importance scores. The proposed patch attention and pruning method provides constraints to exploit dominant feature maps in conjunction with self-attention, thus avoiding the propagation of noisy or irrelevant information. The proposed patch-selective transformer also helps to address computer vision problems such as scale, background clutter, and partial occlusion, resulting in a lightweight and general-purpose vision transformer suitable for mobile devices.
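A hedged sketch of pruning patches from per-head importance scores with a learnable soft threshold is shown below; the use of class-token attention as the importance score and the sigmoid gating are assumptions rather than the paper's exact formulation.

```python
# Soft patch pruning sketch: gate patches whose max importance across heads exceeds a learnable threshold.
import torch
import torch.nn as nn

class SoftPatchPruning(nn.Module):
    def __init__(self, temperature=0.1):
        super().__init__()
        self.threshold = nn.Parameter(torch.tensor(0.01))   # learnable pruning threshold
        self.temperature = temperature

    def forward(self, attn, tokens):
        """attn: (B, H, N) attention from the class token to N patches per head.
        tokens: (B, N, D) patch embeddings."""
        importance = attn.max(dim=1).values                           # (B, N) max over heads
        keep = torch.sigmoid((importance - self.threshold) / self.temperature)
        return tokens * keep.unsqueeze(-1)                            # softly suppress pruned patches

attn = torch.rand(2, 8, 196); tokens = torch.randn(2, 196, 384)
print(SoftPatchPruning()(attn, tokens).shape)                         # torch.Size([2, 196, 384])
```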

Image and Vision Computing, Volume 150, Article 105239.
Citations: 0