
International Journal of Computer Vision: Latest Publications

MoDA: Modeling Deformable 3D Objects from Casual Videos
IF 19.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-12-12 | DOI: 10.1007/s11263-024-02310-5
Chaoyue Song, Jiacheng Wei, Tianyi Chen, Yiwen Chen, Chuan-Sheng Foo, Fayao Liu, Guosheng Lin

In this paper, we focus on the challenges of modeling deformable 3D objects from casual videos. With the popularity of NeRF, many works extend it to dynamic scenes with a canonical NeRF and a deformation model that achieves 3D point transformation between the observation space and the canonical space. Recent works rely on linear blend skinning (LBS) to achieve the canonical-observation transformation. However, the linearly weighted combination of rigid transformation matrices is not guaranteed to be rigid; in fact, unexpected scale and shear factors often appear. In practice, using LBS as the deformation model can lead to skin-collapsing artifacts for bending or twisting motions. To solve this problem, we propose neural dual quaternion blend skinning (NeuDBS) to achieve 3D point deformation, which can perform rigid transformation without skin-collapsing artifacts. To register 2D pixels across different frames, we establish a correspondence between canonical feature embeddings that encode 3D points within the canonical space and 2D image features by solving an optimal transport problem. In addition, we introduce a texture filtering approach for texture rendering that effectively minimizes the impact of noisy colors outside target deformable objects.
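
To make the contrast with LBS concrete, the sketch below (plain NumPy, not the authors' NeuDBS code) blends per-bone rigid transforms as unit dual quaternions: the weighted sum is renormalized, so the blended result remains a rigid transform and cannot introduce the scale or shear factors that averaging matrices can. All bone transforms, weights and the sample point are made-up toy values.

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of two quaternions stored as [w, x, y, z]."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    ])

def rigid_to_dq(q, t):
    """Pack a rotation quaternion q and translation t into a dual quaternion (real, dual)."""
    real = q / np.linalg.norm(q)
    dual = 0.5 * qmul(np.array([0.0, *t]), real)
    return real, dual

def dq_blend(weights, dqs):
    """Weighted sum of dual quaternions followed by normalization: the result is still rigid."""
    real, dual = np.zeros(4), np.zeros(4)
    pivot = dqs[0][0]
    for w, (r, d) in zip(weights, dqs):
        s = 1.0 if np.dot(pivot, r) >= 0 else -1.0  # keep all quaternions in one hemisphere
        real += w * s * r
        dual += w * s * d
    n = np.linalg.norm(real)
    return real / n, dual / n

def dq_transform(real, dual, p):
    """Apply a unit dual quaternion to a 3D point."""
    w, v = real[0], real[1:]
    p_rot = p + 2.0 * np.cross(v, np.cross(v, p) + w * p)       # rotation part
    conj = np.array([real[0], -real[1], -real[2], -real[3]])
    t = 2.0 * qmul(dual, conj)[1:]                              # translation part
    return p_rot + t

# toy blend of two bone transforms (identity, and a 90-degree yaw plus a shift) for one point
dq_a = rigid_to_dq(np.array([1.0, 0.0, 0.0, 0.0]), np.array([0.0, 0.0, 0.0]))
dq_b = rigid_to_dq(np.array([np.cos(np.pi / 4), 0.0, np.sin(np.pi / 4), 0.0]), np.array([0.5, 0.0, 0.0]))
print(dq_transform(*dq_blend([0.5, 0.5], [dq_a, dq_b]), np.array([1.0, 0.0, 0.0])))
```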

Citations: 0
Structured Generative Models for Scene Understanding
IF 19.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-12-12 | DOI: 10.1007/s11263-024-02316-z
Christopher K. I. Williams

This position paper argues for the use of structured generative models (SGMs) for the understanding of static scenes. This requires the reconstruction of a 3D scene from an input image (or a set of multi-view images), whereby the contents of the image(s) are causally explained in terms of models of instantiated objects, each with their own type, shape, appearance and pose, along with global variables like scene lighting and camera parameters. This approach also requires scene models which account for the co-occurrences and inter-relationships of objects in a scene. The SGM approach has the merits that it is compositional and generative, which lead to interpretability and editability. To pursue the SGM agenda, we need models for objects and scenes, and approaches to carry out inference. We first review models for objects, which include “things” (object categories that have a well-defined shape) and “stuff” (categories which have amorphous spatial extent). We then move on to review scene models which describe the inter-relationships of objects. Perhaps the most challenging problem for SGMs is inference of the objects, lighting and camera parameters, and scene inter-relationships from input consisting of a single image or multiple images. We conclude with a discussion of issues that need addressing to advance the SGM agenda.
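
As a rough illustration of the representation the paper argues for, the sketch below (hypothetical field names, not the author's formalism) packages a scene as global variables plus a list of instantiated objects, each carrying its own type, shape, appearance and pose; a concrete SGM would attach a renderer (the generative direction) and an inference procedure (the inverse direction) to this structure.

```python
from dataclasses import dataclass, field
from typing import List
import numpy as np

@dataclass
class ObjectInstance:
    # hypothetical per-object variables: type, shape, appearance and pose
    category: str                # a "thing" class (e.g. chair) or "stuff" class (e.g. grass)
    shape_code: np.ndarray       # latent shape parameters
    appearance_code: np.ndarray  # latent appearance/texture parameters
    pose: np.ndarray             # 4x4 object-to-world transform

@dataclass
class SceneModel:
    camera: np.ndarray           # global variable: camera intrinsics/extrinsics
    lighting: np.ndarray         # global variable: e.g. spherical-harmonic coefficients
    objects: List[ObjectInstance] = field(default_factory=list)
    # A full SGM would render image(s) from these variables and explain an input image
    # causally by recovering them via inference.
```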

Citations: 0
InfoPro: Locally Supervised Deep Learning by Maximizing Information Propagation
IF 19.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-12-11 | DOI: 10.1007/s11263-024-02296-0
Yulin Wang, Zanlin Ni, Yifan Pu, Cai Zhou, Jixuan Ying, Shiji Song, Gao Huang

End-to-end (E2E) training has become the de-facto standard for training modern deep networks, e.g., ConvNets and vision Transformers (ViTs). Typically, a global error signal is generated at the end of a model and back-propagated layer-by-layer to update the parameters. This paper shows that the reliance on back-propagating global errors may not be necessary for deep learning. More precisely, deep networks with competitive or even better performance can be obtained by purely leveraging locally supervised learning, i.e., splitting a network into gradient-isolated modules and training them with local supervision signals. However, such an extension is non-trivial. Our experimental and theoretical analysis demonstrates that simply training local modules with an E2E objective tends to be short-sighted, collapsing task-relevant information at early layers and hurting the performance of the full model. To avoid this issue, we propose an information propagation (InfoPro) loss, which encourages local modules to preserve as much useful information as possible while progressively discarding task-irrelevant information. As the InfoPro loss is difficult to compute in its original form, we derive a feasible upper bound as a surrogate optimization objective, yielding a simple but effective algorithm. We evaluate InfoPro extensively with ConvNets and ViTs on twelve computer vision benchmarks organized into five tasks (i.e., image/video recognition, semantic/instance segmentation, and object detection). InfoPro exhibits superior efficiency over E2E training in terms of GPU memory footprint, convergence speed, and training data scale. Moreover, InfoPro enables the effective training of more parameter- and computation-efficient models (e.g., much deeper networks), which suffer from inferior performance when trained end-to-end. Code: https://github.com/blackfeather-wang/InfoPro-Pytorch.
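
The gradient-isolation mechanism itself is easy to demonstrate. The PyTorch sketch below splits a small network into two modules whose outputs are detached, so each block receives only its own local supervision signal. For brevity the local loss is plain cross-entropy, not the InfoPro surrogate from the paper, and all layer sizes are made up.

```python
import torch
import torch.nn as nn

class LocalModule(nn.Module):
    """One gradient-isolated stage: a feature block plus a local auxiliary head.
    Gradients from later stages never reach this block because its output is detached."""
    def __init__(self, block, aux_head):
        super().__init__()
        self.block = block
        self.aux_head = aux_head  # stands in for a local supervision head

    def forward(self, x, y, criterion):
        h = self.block(x)
        local_loss = criterion(self.aux_head(h), y)  # local supervision signal only
        return h.detach(), local_loss                # detach => gradient isolation

# hypothetical two-stage split of a tiny ConvNet for CIFAR-sized inputs
stage1 = LocalModule(
    nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(8)),
    nn.Sequential(nn.Flatten(), nn.Linear(64 * 8 * 8, 10)),
)
stage2 = LocalModule(
    nn.Sequential(nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1)),
    nn.Sequential(nn.Flatten(), nn.Linear(128, 10)),
)
criterion = nn.CrossEntropyLoss()
opt = torch.optim.SGD(list(stage1.parameters()) + list(stage2.parameters()), lr=0.1)

x, y = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))
h1, loss1 = stage1(x, y, criterion)
h2, loss2 = stage2(h1, y, criterion)
opt.zero_grad()
(loss1 + loss2).backward()  # each loss only updates its own module's parameters
opt.step()
```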

Citations: 0
CMAE-3D: Contrastive Masked AutoEncoders for Self-Supervised 3D Object Detection
IF 19.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-12-11 | DOI: 10.1007/s11263-024-02313-2
Yanan Zhang, Jiaxin Chen, Di Huang

LiDAR-based 3D object detection is a crucial task for autonomous driving, owing to its accurate object recognition and localization capabilities in 3D real-world space. However, existing methods rely heavily on large-scale labeled LiDAR data that is time-consuming and laborious to annotate, posing a bottleneck for both performance improvement and practical applications. In this paper, we propose Contrastive Masked AutoEncoders for self-supervised 3D object detection, dubbed CMAE-3D, a promising solution for effectively alleviating label dependency in 3D perception. Specifically, we integrate Contrastive Learning (CL) and Masked AutoEncoders (MAE) into one unified framework to fully utilize the complementary characteristics of global semantic representation and local spatial perception. Furthermore, from the MAE perspective, we develop Geometric-Semantic Hybrid Masking (GSHM) to selectively mask representative regions in point clouds with imbalanced foreground-background and uneven density distribution, and design Multi-scale Latent Feature Reconstruction (MLFR) to capture high-level semantic features while mitigating the redundant reconstruction of low-level details. From the CL perspective, we present Hierarchical Relational Contrastive Learning (HRCL) to mine rich semantic similarity information while alleviating negative-sample mismatch at both the voxel and frame levels. Extensive experiments demonstrate the effectiveness of our pre-training method when applied to multiple mainstream 3D object detectors (SECOND, CenterPoint and PV-RCNN) on three popular datasets (KITTI, Waymo and nuScenes).
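
The framework-level idea of pairing masked reconstruction with a contrastive term can be sketched as a single joint objective. The toy loss below (PyTorch) is only a generic stand-in: it combines a smooth-L1 reconstruction of masked features with an InfoNCE term, and does not implement the paper's GSHM masking, MLFR decoder or HRCL losses; all tensor shapes are invented.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.07):
    """Standard InfoNCE between two batches of matched embeddings (positives on the diagonal)."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature
    targets = torch.arange(z1.size(0))
    return F.cross_entropy(logits, targets)

def mae_plus_cl_loss(pred_feats, target_feats, z_student, z_teacher, w_contrast=1.0):
    """Joint objective: masked feature reconstruction (MAE side) + contrastive alignment (CL side)."""
    recon = F.smooth_l1_loss(pred_feats, target_feats)  # reconstruction of masked regions
    contrast = info_nce(z_student, z_teacher)           # contrastive term over pooled embeddings
    return recon + w_contrast * contrast

# toy tensors standing in for decoder outputs and pooled embeddings
B, N, C, D = 2, 128, 64, 256
pred = torch.randn(B, N, C, requires_grad=True)
z_student = torch.randn(B, D, requires_grad=True)
loss = mae_plus_cl_loss(pred, torch.randn(B, N, C), z_student, torch.randn(B, D))
loss.backward()  # in a real pipeline this would update the encoder/decoder parameters
```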

Citations: 0
Language-Guided Hierarchical Fine-Grained Image Forgery Detection and Localization
IF 19.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-12-10 | DOI: 10.1007/s11263-024-02255-9
Xiao Guo, Xiaohong Liu, Iacopo Masi, Xiaoming Liu

Differences in forgery attributes between images generated in CNN-synthesized and image-editing domains are large, and such differences make unified image forgery detection and localization (IFDL) challenging. To this end, we present a hierarchical fine-grained formulation for IFDL representation learning. Specifically, we first represent forgery attributes of a manipulated image with multiple labels at different levels. Then, we perform fine-grained classification at these levels using the hierarchical dependency between them. As a result, the algorithm is encouraged to learn both comprehensive features and the inherent hierarchical nature of different forgery attributes, thereby improving the IFDL representation. In this work, we propose a Language-guided Hierarchical Fine-grained IFDL, denoted HiFi-Net++. Specifically, HiFi-Net++ contains four components: a multi-branch feature extractor, a language-guided forgery localization enhancer, and classification and localization modules. Each branch of the multi-branch feature extractor learns to classify forgery attributes at one level, while the localization and classification modules segment the pixel-level forgery region and detect image-level forgery, respectively. In addition, the language-guided forgery localization enhancer (LFLE), containing image and text encoders learned by contrastive language-image pre-training (CLIP), is used to further enrich the IFDL representation. LFLE takes specifically designed texts and the given image as multi-modal inputs and then generates the visual embedding and manipulation score maps, which are used to further improve HiFi-Net++ manipulation localization performance. Lastly, we construct a hierarchical fine-grained dataset to facilitate our study. We demonstrate the effectiveness of our method on 8 different benchmarks for both IFDL and forgery attribute classification. Our source code and dataset can be found at github.com/CHELSEA234/HiFi-IFDL.
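
A minimal way to picture hierarchical fine-grained classification is a stack of per-level heads in which each finer level is conditioned on the coarser prediction. The PyTorch sketch below is a simplified stand-in with invented level sizes (e.g. real/fake, then forgery domain, then specific method) and is not the HiFi-Net++ architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalHead(nn.Module):
    """Per-level classifiers over a shared feature; each finer level is conditioned on the
    coarser level's prediction (a simple stand-in for the hierarchical dependency idea)."""
    def __init__(self, feat_dim, level_sizes=(2, 4, 14)):  # e.g. real/fake -> domain -> method
        super().__init__()
        self.heads = nn.ModuleList()
        prev = 0
        for n in level_sizes:
            self.heads.append(nn.Linear(feat_dim + prev, n))
            prev = n

    def forward(self, feat):
        logits_per_level = []
        prev_probs = feat.new_zeros(feat.size(0), 0)
        for head in self.heads:
            logits = head(torch.cat([feat, prev_probs], dim=1))
            logits_per_level.append(logits)
            prev_probs = logits.softmax(dim=1)  # coarse prediction feeds the next level
        return logits_per_level

head = HierarchicalHead(feat_dim=256)
feats = torch.randn(8, 256)
labels = [torch.randint(0, n, (8,)) for n in (2, 4, 14)]
loss = sum(F.cross_entropy(l, y) for l, y in zip(head(feats), labels))
loss.backward()
```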

Citations: 0
On Mitigating Stability-Plasticity Dilemma in CLIP-guided Image Morphing via Geodesic Distillation Loss
IF 19.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-12-10 | DOI: 10.1007/s11263-024-02308-z
Yeongtak Oh, Saehyung Lee, Uiwon Hwang, Sungroh Yoon

Large-scale language-vision pre-training models, such as CLIP, have achieved remarkable results in text-guided image morphing by leveraging several unconditional generative models. However, existing CLIP-guided methods face challenges in achieving photorealistic morphing when adapting the generator from the source to the target domain. Specifically, current guidance methods fail to provide detailed explanations of the morphing regions within the image, leading to misguidance and catastrophic forgetting of the original image’s fidelity. In this paper, we propose a novel approach with suitable regularization losses to overcome these difficulties by addressing the stability-plasticity (SP) dilemma in CLIP guidance. Our approach consists of two key components: (1) a geodesic cosine similarity loss that minimizes inter-modality discrepancies (i.e., between image and text features) in a projected subspace of CLIP space, and (2) a latent regularization loss that minimizes intra-modality discrepancies (i.e., between image features) on the image manifold. Used as a drop-in replacement for the naive directional CLIP loss, our method achieves superior morphing results for both images and videos across various benchmarks, including CLIP-inversion.
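
The flavor of the two losses can be sketched on toy CLIP-like embeddings: a directional term that compares the image-edit and text-edit directions by their angle on the unit sphere, plus a regularizer that keeps the morphed image embedding close to the source. This is a rough approximation written from the abstract, not the paper's exact formulation, and the embeddings below are random placeholders for frozen CLIP features.

```python
import torch
import torch.nn.functional as F

def directional_geodesic_loss(img_src, img_tgt, txt_src, txt_tgt, eps=1e-6):
    """Compare the image-edit direction with the text-edit direction by the angle between them
    on the unit sphere (arccos of cosine), instead of the raw cosine used in directional CLIP loss."""
    d_img = F.normalize(img_tgt - img_src, dim=-1)
    d_txt = F.normalize(txt_tgt - txt_src, dim=-1)
    cos = (d_img * d_txt).sum(-1).clamp(-1 + eps, 1 - eps)
    return torch.arccos(cos).mean()  # angular (geodesic-style) distance to minimize

def latent_regularization(img_src, img_tgt):
    """Keep the morphed image embedding close to the source on the image manifold."""
    return (1 - F.cosine_similarity(img_src, img_tgt, dim=-1)).mean()

# toy CLIP-like embeddings (in practice these come from a frozen CLIP image/text encoder)
B, D = 4, 512
def emb():  # placeholder for a CLIP embedding
    return torch.randn(B, D, requires_grad=True)
img_s, img_t, txt_s, txt_t = emb(), emb(), emb(), emb()
loss = directional_geodesic_loss(img_s, img_t, txt_s, txt_t) + 0.1 * latent_regularization(img_s, img_t)
loss.backward()
```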

Citations: 0
Image-Based Virtual Try-On: A Survey
IF 19.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-12-10 | DOI: 10.1007/s11263-024-02305-2
Dan Song, Xuanpu Zhang, Juan Zhou, Weizhi Nie, Ruofeng Tong, Mohan Kankanhalli, An-An Liu

Image-based virtual try-on aims to synthesize a naturally dressed person image from a person image and a clothing image; it has the potential to revolutionize online shopping and inspires related topics within image generation, showing both research significance and commercial potential. However, there is a gap between current research progress and commercial applications, and the field lacks a comprehensive overview to accelerate its development. In this survey, we provide a comprehensive analysis of state-of-the-art techniques and methodologies in terms of pipeline architecture, person representation, and key modules such as try-on indication, clothing warping, and the try-on stage. We additionally apply CLIP to assess the semantic alignment of try-on results, and evaluate representative methods with uniformly implemented evaluation metrics on the same dataset. In addition to the quantitative and qualitative evaluation of current open-source methods, unresolved issues are highlighted and future research directions are outlined to identify key trends and inspire further exploration. The uniformly implemented evaluation metrics, dataset, and collected methods will be made publicly available at https://github.com/little-misfit/Survey-Of-Virtual-Try-On.
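
As an example of the kind of CLIP-based semantic alignment check the survey applies, the snippet below scores a try-on result against a text description of the target garment using the open-source OpenAI CLIP package; the file name and caption are hypothetical, and the survey's actual evaluation protocol may differ.

```python
import torch
import clip  # OpenAI CLIP package: pip install git+https://github.com/openai/CLIP.git
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def semantic_alignment_score(image_path, garment_caption):
    """Cosine similarity between a try-on result and a text description of the target garment:
    a rough stand-in for a CLIP-based semantic alignment metric."""
    image = preprocess(Image.open(image_path)).unsqueeze(0).to(device)
    text = clip.tokenize([garment_caption]).to(device)
    with torch.no_grad():
        img_feat = model.encode_image(image)
        txt_feat = model.encode_text(text)
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
    return (img_feat @ txt_feat.t()).item()

# hypothetical usage: higher scores indicate better garment-description agreement
# print(semantic_alignment_score("tryon_result.png", "a person wearing a red floral dress"))
```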

Citations: 0
An Evaluation of Zero-Cost Proxies - from Neural Architecture Performance Prediction to Model Robustness
IF 19.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-12-09 | DOI: 10.1007/s11263-024-02265-7
Jovita Lukasik, Michael Moeller, Margret Keuper

Zero-cost proxies are now frequently studied and used to search for neural architectures. They show an impressive ability to predict the performance of architectures by making use of their untrained weights. These techniques allow for immense search speed-ups. So far, the joint search for well-performing and robust architectures has received much less attention in the field of NAS. The main focus of zero-cost proxies has therefore been the clean accuracy of architectures, whereas model robustness should play an equally important part. In this paper, we analyze the ability of common zero-cost proxies to serve as performance predictors for robustness in the popular NAS-Bench-201 search space. We are interested in the single prediction task for robustness and the joint multi-objective of clean and robust accuracy. We further analyze the feature importance of the proxies and show that predicting robustness makes the prediction task from existing zero-cost proxies more challenging. As a result, the joint consideration of several proxies becomes necessary to predict a model’s robustness, while the clean accuracy can be regressed from a single such feature. Our code is available at https://github.com/jovitalukasik/zcp_eval.
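
For readers unfamiliar with zero-cost proxies, the sketch below computes one representative proxy: a grad-norm score from a single forward/backward pass on untrained weights. The candidate network and batch are invented, and this is not the paper's evaluation code, which compares many such proxies on NAS-Bench-201.

```python
import torch
import torch.nn as nn

def grad_norm_proxy(model, inputs, targets, criterion=None):
    """Grad-norm zero-cost proxy: one forward/backward pass at initialization,
    score = sum of parameter gradient norms (higher is heuristically ranked better)."""
    criterion = criterion or nn.CrossEntropyLoss()
    model.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    return sum(p.grad.norm().item() for p in model.parameters() if p.grad is not None)

# hypothetical candidate architecture from a NAS search space, scored without any training
net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.ReLU(), nn.Linear(256, 10))
score = grad_norm_proxy(net, torch.randn(16, 3, 32, 32), torch.randint(0, 10, (16,)))
print(f"zero-cost score at initialization: {score:.3f}")
```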

Citations: 0
Occlusion-Preserved Surveillance Video Synopsis with Flexible Object Graph
IF 19.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-12-09 | DOI: 10.1007/s11263-024-02302-5
Yongwei Nie, Wei Ge, Siming Zeng, Qing Zhang, Guiqing Li, Ping Li, Hongmin Cai

Video synopsis is a technique that condenses a long surveillance video into a short summary. It is challenging to handle objects that occlude each other in the source video. Previous approaches either treat occluding objects as a single object, which reduces the compression ratio, or separate the occluding objects individually, which destroys the interactions between them and yields visual artifacts. This paper presents a novel data structure called the Flexible Object Graph (FOG) to handle original occlusions. Our FOG-based video synopsis approach can manipulate each object flexibly while preserving the original occlusions between them, achieving a high synopsis ratio while maintaining the interactions of objects. A challenging issue that comes with the introduction of FOG is that it may contain circulations that yield conflicts. We solve this problem by proposing a circulation conflict resolving algorithm. Furthermore, video synopsis methods usually minimize a multi-objective energy function. Previous approaches optimize the multiple objectives simultaneously, which requires striking a balance between them. Instead, we propose a stepwise optimization strategy that consumes less running time while producing higher quality. Experiments demonstrate the effectiveness of our method.
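
A toy version of the graph bookkeeping illustrates why circulations are a problem: if occlusion constraints between object tubes form a directed cycle, they cannot all be preserved and must be resolved. The sketch below uses an invented graph structure and object names and is not the paper's FOG implementation.

```python
from collections import defaultdict

class ObjectGraph:
    """Toy stand-in for an object graph: nodes are object tubes, directed edges record
    'occludes' relations whose relative ordering should be preserved in the synopsis."""
    def __init__(self):
        self.edges = defaultdict(set)

    def add_occlusion(self, occluder, occluded):
        self.edges[occluder].add(occluded)

    def has_circulation(self):
        """A cycle ('circulation') means the occlusion constraints conflict and must be resolved."""
        WHITE, GRAY, BLACK = 0, 1, 2
        color = defaultdict(int)

        def dfs(u):
            color[u] = GRAY
            for v in self.edges[u]:
                if color[v] == GRAY or (color[v] == WHITE and dfs(v)):
                    return True
            color[u] = BLACK
            return False

        return any(color[u] == WHITE and dfs(u) for u in list(self.edges))

g = ObjectGraph()
g.add_occlusion("car_1", "person_2")
g.add_occlusion("person_2", "bike_3")
g.add_occlusion("bike_3", "car_1")  # closes a cycle
print(g.has_circulation())          # True -> a conflict-resolving step would break the cycle
```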

Citations: 0
Object Pose Estimation Based on Multi-precision Vectors and Seg-Driven PnP
IF 19.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-12-07 | DOI: 10.1007/s11263-024-02317-y
Yulin Wang, Hongli Li, Chen Luo

Object pose estimation based on a single RGB image has wide application potential but is difficult to achieve. Existing pose estimation methods involve various inference pipelines. One popular pipeline is to first use Convolutional Neural Networks (CNN) to predict 2D projections of 3D keypoints in a single RGB image and then calculate the 6D pose via a Perspective-n-Point (PnP) solver. Due to the gap between synthetic and real data, a model trained on synthetic data has difficulty predicting the 6D pose accurately when applied to real data. To address this problem, we propose a two-stage pipeline for object pose estimation based upon multi-precision vectors and segmentation-driven (Seg-Driven) PnP. In the keypoint localization stage, we first develop a CNN-based three-branch network to predict multi-precision 2D vectors pointing to 2D keypoints. Then we introduce an accurate and fast Keypoint Voting scheme based on Multi-precision vectors (KVM), which computes low-precision 2D keypoints using low-precision vectors and refines the 2D keypoints with mid- and high-precision vectors. In the pose calculation stage, we propose Seg-Driven PnP to refine the 3D translation of poses and obtain the optimal pose by minimizing the non-overlapping area between segmented and rendered masks. Seg-Driven PnP leverages 2D segmentation trained on real images to improve the accuracy of pose estimation trained on synthetic data, thereby reducing the synthetic-to-real gap. Extensive experiments show that our approach materially outperforms state-of-the-art methods on the LM and HB datasets. Importantly, our proposed method works reasonably well for weakly textured and occluded objects in diverse scenes.
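
The final pose-calculation step builds on a standard PnP solve from 2D-3D keypoint correspondences, which the snippet below illustrates with OpenCV; the intrinsics, model points and ground-truth pose are synthetic, and the paper's Seg-Driven refinement of the 3D translation (mask alignment) is not implemented here.

```python
import numpy as np
import cv2

def pose_from_keypoints(model_pts_3d, voted_pts_2d, K):
    """Recover a 6D pose from 2D-3D keypoint correspondences with a RANSAC PnP solver.
    This covers only the generic PnP step, not the Seg-Driven translation refinement."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        model_pts_3d.astype(np.float64),
        voted_pts_2d.astype(np.float64),
        K, distCoeffs=None, flags=cv2.SOLVEPNP_EPNP)
    assert ok, "PnP failed"
    R, _ = cv2.Rodrigues(rvec)  # rotation matrix from axis-angle vector
    return R, tvec

# synthetic example: project 8 model keypoints with a known pose to get consistent 2D points
K = np.array([[572.4, 0.0, 325.3], [0.0, 573.6, 242.0], [0.0, 0.0, 1.0]])
model_pts = np.random.rand(8, 3) * 0.1
rvec_gt, tvec_gt = np.array([0.1, -0.2, 0.3]), np.array([0.0, 0.0, 0.6])
image_pts, _ = cv2.projectPoints(model_pts, rvec_gt, tvec_gt, K, None)
R, t = pose_from_keypoints(model_pts, image_pts.reshape(-1, 2), K)
print(R, t.ravel())
```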

Citations: 0