
Latest articles from the Journal of Visual Communication and Image Representation

Enhancing 3D point cloud generation via Mamba-based time-varying denoising diffusion
IF 3.1 | CAS Q4 (Computer Science) | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2025-11-27 | DOI: 10.1016/j.jvcir.2025.104657
DaoPeng Zhang, Li Yu
3D point cloud generation plays a pivotal role in a wide range of applications, including robotics, medical imaging, autonomous driving, and virtual/augmented reality (VR/AR). However, generating high-quality point clouds remains highly challenging due to the irregularity and unordered nature of point cloud data. Existing Transformer-based generative models suffer from quadratic computational complexity, which limits their ability to capture global contextual dependencies and often leads to the loss of critical geometric information. To address these limitations, we propose a novel diffusion-based framework for point cloud generation that integrates the Mamba state-space model — known for its linear complexity and strong long-sequence modeling capability — with convolutional layers. Specifically, Mamba is employed to capture global structural dependencies across time steps, while the convolutional layers refine local geometric details. To effectively leverage the strengths of both components, we introduce a learnable masking mechanism that dynamically fuses global and local features at optimal time steps, thereby exploiting their complementary advantages. Extensive experiments demonstrate that our model outperforms previous point cloud generative approaches such as TIGER and PVD in terms of both quality and diversity. On the airplane category, our model achieves a 9.28% improvement in 1-NNA accuracy based on EMD compared to PVD, and a 1.72% improvement based on CD compared to TIGER. Compared with recent baseline models, our method consistently achieves significant gains across multiple evaluation metrics.
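The key mechanism here is the learnable mask that decides, per diffusion time step, how much weight to give the global (Mamba) branch versus the local convolutional branch. A minimal sketch of that idea, assuming a simple MLP stand-in for the Mamba block; module names, shapes, and the gating form are illustrative assumptions, not the authors' code:

```python
import torch
import torch.nn as nn

class GatedGlobalLocalFusion(nn.Module):
    """Fuse a global branch (stand-in for Mamba) with a local conv branch
    through a time-conditioned learnable mask."""
    def __init__(self, dim: int, num_timesteps: int = 1000):
        super().__init__()
        # Placeholder for a Mamba-style sequence model over the point tokens (global branch).
        self.global_branch = nn.Sequential(nn.Linear(dim, dim), nn.SiLU(), nn.Linear(dim, dim))
        # 1D convolution over the point sequence stands in for the local refinement branch.
        self.local_branch = nn.Conv1d(dim, dim, kernel_size=3, padding=1)
        # A per-timestep embedding drives a channel-wise mask in [0, 1].
        self.t_embed = nn.Embedding(num_timesteps, dim)
        self.gate = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # x: (B, N, C) noisy point features, t: (B,) diffusion timestep indices.
        g = self.global_branch(x)                                   # global structural features
        loc = self.local_branch(x.transpose(1, 2)).transpose(1, 2)  # local geometric refinement
        m = self.gate(self.t_embed(t)).unsqueeze(1)                 # (B, 1, C) time-varying mask
        return m * g + (1.0 - m) * loc                              # convex fusion of both branches

x = torch.randn(4, 2048, 128)             # four clouds of 2048 points with 128-d features
t = torch.randint(0, 1000, (4,))
out = GatedGlobalLocalFusion(128)(x, t)   # (4, 2048, 128)
```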
Citations: 0
BIAN: Bidirectional interwoven attention network for retinal OCT image classification
IF 3.1 | CAS Q4 (Computer Science) | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2025-11-25 | DOI: 10.1016/j.jvcir.2025.104654
Ahmed Alasri , Zhixiang Chen , Yalong Xiao , Chengzhang Zhu , Abdulrahman Noman , Raeed Alsabri , Harrison Xiao Bai
Retinal diseases are a significant global health concern, requiring advanced diagnostic tools for early detection and treatment. Automated diagnosis of retinal diseases using deep learning can significantly enhance early detection and intervention efforts. However, conventional deep learning models that concentrate on localized perspectives often develop feature representations that lack sufficient semantic discriminative capability. Conversely, models that prioritize global semantic-level information may fail to capture essential, subtle local pathological features. To address this issue, we propose BIAN, a novel Bidirectional Interwoven Attention Network designed for the classification of retinal Optical Coherence Tomography (OCT) images. BIAN synergistically combines the strengths of Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) by integrating a ResNet backbone with a ViT backbone through a bidirectional interwoven attention block. This design enables the model to effectively capture both local features and global contextual information. Specifically, the bidirectional interwoven attention block allows the ResNet and ViT components to attend to each other’s feature representations, enhancing the network’s overall learning capacity. We evaluated BIAN on both the OCTID and OCTDL datasets for retinal disease classification. The OCTID dataset includes conditions such as Age-related Macular Degeneration (AMD), Macular Hole (MH), and Central Serous Retinopathy (CSR), while OCTDL covers AMD, Diabetic Macular Edema (DME), Epiretinal Membrane (ERM), and Retinal Vein Occlusion (RVO), among others. On OCTID, the proposed model achieved 95.7% accuracy for five-class classification, outperforming existing state-of-the-art models. On OCTDL, BIAN attained 94.7% accuracy, with consistently high F1-scores (95.6% on OCTID, 94.6% on OCTDL) and AUC values (99.3% and 99.0%, respectively). These results highlight the potential of BIAN as a robust network for retinal OCT image classification in medical applications.
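The central component is the bidirectional interwoven attention block, in which the CNN and ViT token streams attend to each other's features. A minimal sketch of such a block, with illustrative dimensions and without the paper's exact layer layout:

```python
import torch
import torch.nn as nn

class BidirectionalCrossAttention(nn.Module):
    """Each stream queries the other and keeps a residual connection."""
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.cnn_to_vit = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.vit_to_cnn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm_cnn = nn.LayerNorm(dim)
        self.norm_vit = nn.LayerNorm(dim)

    def forward(self, cnn_tokens: torch.Tensor, vit_tokens: torch.Tensor):
        vit_ctx, _ = self.cnn_to_vit(cnn_tokens, vit_tokens, vit_tokens)  # CNN queries ViT
        cnn_ctx, _ = self.vit_to_cnn(vit_tokens, cnn_tokens, cnn_tokens)  # ViT queries CNN
        cnn_out = self.norm_cnn(cnn_tokens + vit_ctx)   # CNN tokens enriched with global context
        vit_out = self.norm_vit(vit_tokens + cnn_ctx)   # ViT tokens enriched with local detail
        return cnn_out, vit_out

cnn_tokens = torch.randn(2, 196, 256)   # e.g. a flattened 14x14 CNN feature map
vit_tokens = torch.randn(2, 197, 256)   # ViT patch tokens plus class token
c, v = BidirectionalCrossAttention()(cnn_tokens, vit_tokens)
```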
Citations: 0
HySaM: An improved hybrid SAM and Mask R-CNN for underwater instance segmentation
IF 3.1 | CAS Q4 (Computer Science) | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2025-11-25 | DOI: 10.1016/j.jvcir.2025.104656
Xingfa Wang , Chengjun Chen , Chenggang Dai , Kunhua Liu , Mingxing Lin
Due to inherent absorption and scattering effects, underwater images often exhibit low visibility and significant color deviation. These issues hinder the extraction of discriminative features and adversely impact instance-level segmentation accuracy. To address these challenges, this study proposes a novel Hybrid SAM and Mask R-CNN framework for underwater instance segmentation, integrating the strong generalization capability of SAM with the structural decoding strength of Mask R-CNN. The powerful global modeling ability of SAM effectively mitigates the impact of underwater image degradation, thereby enabling more robust feature representation. Moreover, a novel underwater feature weighted enhancer is introduced in the framework to enhance multi-scale feature fusion and improve the detection of small and scale-varying objects in underwater environments. To provide benchmark data, a large-scale underwater instance segmentation dataset, UW10K, is also constructed, comprising 13,551 images and 22,968 annotated instances across 15 categories. Comprehensive experiments validate the superiority of the proposed model across various instance segmentation tasks. Specifically, it achieves precisions of 74.2%, 40.5%, and 70.6% on the UW10K, USIS10K, and WHU Building datasets, respectively. This study is expected to advance ocean exploration and fisheries, while providing valuable training samples for instance segmentation tasks. Datasets and code are available at https://github.com/xfwang-qut/HySaM.
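The abstract mentions an underwater feature weighted enhancer for multi-scale fusion but does not spell out its design, so the sketch below shows only the generic idea of learnable per-level weights over FPN-style features; all names, shapes, and the fusion rule are assumptions for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedMultiScaleEnhancer(nn.Module):
    """Learn an importance weight per pyramid level and fuse at the finest resolution."""
    def __init__(self, channels: int, num_levels: int):
        super().__init__()
        self.level_weights = nn.Parameter(torch.ones(num_levels))  # learnable per-scale importance
        self.refine = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, feats):
        # feats: list of (B, C, H_i, W_i) maps, e.g. FPN outputs from coarse to fine.
        target = feats[0].shape[-2:]
        w = torch.softmax(self.level_weights, dim=0)
        fused = sum(w[i] * F.interpolate(f, size=target, mode="bilinear", align_corners=False)
                    for i, f in enumerate(feats))
        return self.refine(fused)

feats = [torch.randn(1, 256, 64, 64), torch.randn(1, 256, 32, 32), torch.randn(1, 256, 16, 16)]
out = WeightedMultiScaleEnhancer(256, len(feats))(feats)   # (1, 256, 64, 64)
```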
Citations: 0
Multi-scale interleaved transformer network for image deraining
IF 3.1 | CAS Q4 (Computer Science) | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2025-11-25 | DOI: 10.1016/j.jvcir.2025.104655
Yue Que , Chen Qiu , Hanqing Xiong , Xue Xia , Zhiwei Liu
Convolutional neural networks (CNNs) have demonstrated impressive performance in image deraining tasks. However, CNNs have a limited receptive field, which restricts their ability to adapt to spatial variations in the input. Recently, Transformers have demonstrated promising results in image deraining, surpassing CNNs in several cases. However, most existing methods leverage limited spatial input information through attribution analysis. In this paper, we investigate the construction of multi-scale feature representations within Transformers to fully exploit their potential in image deraining. We propose a multi-scale interleaved Transformer framework, which aims to reconstruct high-quality images by leveraging information across different scales, thereby enabling it to better capture the size and distribution of rain. In addition, we introduce a hybrid cross-attention mechanism to replace traditional feature fusion, facilitating global feature interaction and capturing complementary information across scales simultaneously. Our approach surpasses state-of-the-art methods in terms of image deraining performance on two types of benchmark datasets.
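One way to picture cross-attention replacing concatenation-style fusion is fine-scale tokens querying a coarser scale to pull in complementary context. A rough sketch under that assumption (token counts and dimensions are illustrative, not the paper's architecture):

```python
import torch
import torch.nn as nn

class CrossScaleFusion(nn.Module):
    """Fine-scale tokens attend to coarse-scale tokens instead of being concatenated with them."""
    def __init__(self, dim: int = 192, heads: int = 6):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, fine_tokens: torch.Tensor, coarse_tokens: torch.Tensor) -> torch.Tensor:
        ctx, _ = self.attn(fine_tokens, coarse_tokens, coarse_tokens)  # queries from the fine scale
        x = self.norm(fine_tokens + ctx)                               # residual + normalization
        return x + self.mlp(x)                                         # lightweight feed-forward refinement

fine = torch.randn(1, 4096, 192)     # tokens from a high-resolution stage
coarse = torch.randn(1, 1024, 192)   # tokens from a downsampled stage
fused = CrossScaleFusion()(fine, coarse)
```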
Citations: 0
PatchNeRF: Patch-based Neural Radiance Fields for real time view synthesis in wide-scale scenes
IF 3.1 | CAS Q4 (Computer Science) | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2025-11-20 | DOI: 10.1016/j.jvcir.2025.104602
Ziyu Hu , Xiaoguang Jiang , Qiong Liu , Xin Ding
Recent methods based on Neural Radiance Fields (NeRFs) have excelled in real-time novel view synthesis for small-scale scenes but struggle with fast rendering for large-scale scenes. Achieving a balance in performance between small-scale and large-scale scenes has emerged as a challenging problem. To address this, we propose PatchNeRF, a patch-based NeRF representation for wide-scale scenes. PatchNeRF uses small 2D patches to fit surfaces, learning a 2D neural radiance field for local geometry and texture. To make the most of sampling patches and skip empty space, we propose strategies for initializing and progressively updating the patch structure, along with performing end-to-end training using both large and tiny MLPs. After training, we prebake the implicit 2D neural radiance fields as feature maps to accelerate the rendering process. Experiments demonstrate that our approach outperforms state-of-the-art methods in both small-scale and large-scale scenes, while achieving superior rendering speeds.
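The speed-up comes from prebaking: each patch's small 2D radiance field is evaluated once over a dense UV grid and cached as a feature map, so rendering becomes a bilinear lookup rather than an MLP call. A simplified sketch of that idea, with illustrative resolutions, feature sizes, and a toy MLP (not the authors' pipeline):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

tiny_mlp = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 4))  # (u, v) -> RGB + density

# Bake: one dense forward pass over the patch's 2D parameter domain.
res = 64
u, v = torch.meshgrid(torch.linspace(-1, 1, res), torch.linspace(-1, 1, res), indexing="ij")
uv = torch.stack([u, v], dim=-1).reshape(-1, 2)
with torch.no_grad():
    baked = tiny_mlp(uv).reshape(1, res, res, 4).permute(0, 3, 1, 2)     # (1, 4, res, res) feature map

# Render: bilinear lookups of the baked map at arbitrary UV sample locations replace MLP queries.
samples = torch.rand(1, 1, 1000, 2) * 2 - 1                              # 1000 query points in [-1, 1]^2
feats = F.grid_sample(baked, samples, align_corners=True)                # (1, 4, 1, 1000)
```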
Citations: 0
Multi-modal deep facial expression recognition framework combining knowledge distillation and retrieval-augmented generation
IF 3.1 | CAS Q4 (Computer Science) | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2025-11-15 | DOI: 10.1016/j.jvcir.2025.104645
Beibei Jiang, Yu Zhou
In recent years, significant progress has been made in facial expression recognition (FER) methods based on deep learning. However, existing models still face challenges in terms of computational efficiency and generalization performance when dealing with diverse emotional expressions and complex environmental variations. Recently, large-scale vision-language pre-training models such as CLIP have achieved remarkable success in multi-modal learning. Their rich visual and textual representations offer valuable insights for downstream tasks. Consequently, transferring this knowledge to develop efficient and accurate FER systems has emerged as a key research direction. To this end, this paper proposes a novel model, termed Knowledge Distillation and Retrieval-Augmented Generation (KDRAG), which combines knowledge distillation and Retrieval-Augmented Generation (RAG) techniques to improve the efficiency and accuracy of FER. Through knowledge distillation, the teacher model (ViT-L/14) transfers its rich knowledge to the smaller student model (ViT-B/32). An additional linear projection layer is added to map the teacher model’s output features to the student model’s feature dimensions for feature alignment. Moreover, the RAG mechanism is developed to enhance the student model’s emotional understanding by retrieving text descriptions related to the input image. Additionally, this framework combines soft loss (from the teacher model’s knowledge) and hard loss (from the true labels) to enhance the model’s generalization ability. Extensive experimental results on multiple datasets demonstrate that the KDRAG framework can achieve significant improvements in accuracy and computational efficiency, providing new insights for real-time FER systems.
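The training recipe described follows the standard distillation pattern: a linear projection aligns teacher and student feature widths, and the loss mixes a soft term from the teacher with a hard term from the labels. A minimal sketch with assumed temperatures, weights, and dimensions (not the paper's exact settings):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T: float = 4.0, alpha: float = 0.5):
    # Soft loss: match the teacher's temperature-softened distribution.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
    # Hard loss: standard cross-entropy against the ground-truth expression labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Assumed widths: 768 for the teacher features, 512 for the student features.
proj = nn.Linear(768, 512)
teacher_feat, student_feat = torch.randn(8, 768), torch.randn(8, 512)
feat_align = F.mse_loss(proj(teacher_feat), student_feat)   # optional feature-level alignment term
total = kd_loss(torch.randn(8, 7), torch.randn(8, 7), torch.randint(0, 7, (8,))) + 0.1 * feat_align
```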
Citations: 0
Enhancing aesthetic image generation with reinforcement learning guided prompt optimization in stable diffusion
IF 3.1 | CAS Q4 (Computer Science) | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2025-11-15 | DOI: 10.1016/j.jvcir.2025.104641
Junyong You , Yuan Lin , Bin Hu
Generative models, e.g., stable diffusion, excel at producing compelling images but remain highly dependent on crafted prompts. Refining prompts for specific objectives, especially aesthetic quality, is time-consuming and inconsistent. We propose a novel approach that leverages LLMs to enhance the prompt refinement process for stable diffusion. First, we propose a model to predict aesthetic image quality, examining various aesthetic elements in the spatial, channel, and color domains. Reinforcement learning is employed to refine the prompt, starting from a rudimentary version and iteratively improving it with the LLM’s assistance. This iterative process is guided by a policy network that updates prompts based on interactions with the generated images, with a reward function measuring aesthetic improvement and adherence to the prompt. Our experimental results demonstrate that this method significantly boosts the visual quality of generated images when using these refined prompts. Beyond image synthesis, this approach provides a broader framework for improving prompts across diverse applications with the support of LLMs.
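The reward that guides the policy network combines aesthetic improvement with adherence to the prompt. One way to sketch such a reward, with placeholder scoring functions to be supplied by the user and an assumed weighting (not the paper's formulation):

```python
from typing import Callable

def prompt_reward(refined_image, baseline_image, refined_prompt,
                  aesthetic_score: Callable, adherence_score: Callable,
                  beta: float = 0.3) -> float:
    # Aesthetic gain of the image generated from the refined prompt over the baseline prompt.
    gain = aesthetic_score(refined_image) - aesthetic_score(baseline_image)
    # Adherence of the generated image to the refined prompt, e.g. an image-text similarity.
    adherence = adherence_score(refined_image, refined_prompt)
    return gain + beta * adherence
```

In this reading, the policy network proposes a prompt edit, stable diffusion renders it, and the scalar above is fed back as the reinforcement signal.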
Citations: 0
Vision-language tracking with attention-based optimization
IF 3.1 | CAS Q4 (Computer Science) | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2025-11-15 | DOI: 10.1016/j.jvcir.2025.104644
Shuo Hu , Tongtong Liu , Liyang Han , Run Xing
Most existing visual tracking methods typically employ image patches as target references and endeavor to enhance tracking performance by maximizing the utilization of visual information through various deep networks. However, due to the intrinsic limitations of visual information, the performance of the trackers significantly deteriorates when confronted with drastic target variations or complex background environments. To address these issues, we propose a vision-language multimodal fusion tracker for object tracking. Firstly, we use semantic information from language descriptions to compensate for the instability of visual information, and establish multimodal cross-relations through the fusion of visual and language features. Secondly, we propose an attention-based token screening mechanism that utilizes semantic-guided attention and masking operations to eliminate irrelevant search tokens devoid of target information, thereby enhancing both accuracy and efficiency. Furthermore, we optimize the localization head by introducing channel attention, which effectively improves the accuracy of target positioning. Extensive experiments conducted on the OTB99, LaSOT, and TNL2K datasets demonstrate the effectiveness of our proposed tracking method, achieving success rates of 71.2%, 69.5%, and 58.9%, respectively.
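The token screening step can be read as scoring search-region tokens against the pooled language embedding and keeping only the most relevant ones for later attention. A small sketch of that operation, with an assumed keep ratio and shapes (illustrative only):

```python
import torch

def screen_tokens(search_tokens: torch.Tensor, text_embed: torch.Tensor, keep_ratio: float = 0.5):
    """search_tokens: (B, N, C) search-region tokens; text_embed: (B, C) pooled language feature."""
    scores = (search_tokens @ text_embed.unsqueeze(-1)).squeeze(-1)   # (B, N) semantic relevance
    k = max(1, int(search_tokens.shape[1] * keep_ratio))
    keep_idx = scores.topk(k, dim=1).indices                          # indices of retained tokens
    kept = torch.gather(search_tokens, 1,
                        keep_idx.unsqueeze(-1).expand(-1, -1, search_tokens.shape[-1]))
    return kept, keep_idx                                             # pruned tokens + their positions

tokens, idx = screen_tokens(torch.randn(2, 256, 384), torch.randn(2, 384))
```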
Citations: 0
Turbo principles meet compression: Rethinking nonlinear transformations in learned image compression
IF 3.1 | CAS Q4 (Computer Science) | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2025-11-15 | DOI: 10.1016/j.jvcir.2025.104643
Chao Li , Wen Tan , Fanyang Meng , Runwei Ding , Ye Wang , Wei Liu , Yongsheng Liang
Learned image compression (LIC) has emerged as a powerful approach for achieving high rate–distortion performance. Most existing LIC techniques attempt to address performance limitations associated with downsampling and quantization-induced information loss by employing intricate nonlinear transformations and increasing the feature dimensions in entropy models. In this paper, we introduce a novel perspective by modeling the quantizer as a generalized channel with uniform noise, shifting LIC design toward minimizing the channel’s negative impact on compact feature representations. Drawing inspiration from turbo codes, we propose a turbo-like nonlinear transformation (TLNT). On the encoder side, TLNT-E disperses information loss through parallel component coding units, random interleaving, and puncturing, preserving the integrity of encoded features. At the decoder side, TLNT-D iteratively refines feature representations through interactive processing, enabling accurate reconstruction. Experimental results show that our method outperforms several state-of-the-art nonlinear transformation techniques while maintaining efficiency in parameter count and computational complexity.
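The "quantizer as a uniform-noise channel" view builds on a standard relaxation in learned image compression: rounding is replaced by additive U(-0.5, 0.5) noise during training and applied exactly at inference. A minimal sketch of that generic formulation (the TLNT encoder/decoder itself is not reproduced here):

```python
import torch

def quantize(y: torch.Tensor, training: bool) -> torch.Tensor:
    if training:
        # Differentiable proxy for rounding: the latent passes through a uniform-noise channel.
        noise = torch.empty_like(y).uniform_(-0.5, 0.5)
        return y + noise
    # Hard quantization at test time.
    return torch.round(y)

y = torch.randn(1, 192, 16, 16)            # latent produced by the analysis transform
y_train = quantize(y, training=True)        # noisy latents seen by the entropy model / synthesis transform
y_test = quantize(y, training=False)
```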
Citations: 0
Facial image super-resolution network for confusing arbitrary gender classifiers
IF 3.1 | CAS Q4 (Computer Science) | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2025-11-11 | DOI: 10.1016/j.jvcir.2025.104642
Jiliang Wang , Jia Liu , Siwang Zhou
Existing facial image super-resolution methods have demonstrated the capacity to transform low-resolution facial images into high-resolution ones. However, clearer high-resolution facial images increase the possibility of accurately extracting soft biometric features, such as gender, posing a significant risk of privacy leakage. To address this issue, we propose a gender-protected face super-resolution network, which protects gender-related privacy information by introducing fine image distortion during the super-resolution process. It progressively transforms low-resolution images into high-resolution ones while partially disturbing the face images. This procedure ensures that the generated super-resolution facial images can still be utilized by face matchers for matching purposes, but are less reliable for attribute classifiers that attempt to extract gender features. Furthermore, we introduce leaping adversarial learning to help the super-resolution network generate gender-protected facial images and remain effective against arbitrary gender classifiers. Extensive experiments have been conducted using multiple face matchers and gender classifiers to evaluate the effectiveness of the proposed network. The results also demonstrate that our proposed image super-resolution network is adaptable to arbitrary attribute classifiers for protecting gender privacy, while preserving facial image quality.
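The objective family described here pairs a reconstruction term, which keeps the super-resolved face usable for matching, with a confusion term that pushes a gender classifier's prediction toward uniform. A hedged sketch with an assumed weighting and placeholder classifier logits (not the paper's loss):

```python
import torch
import torch.nn.functional as F

def privacy_sr_loss(sr_img, hr_img, gender_logits, lam: float = 0.1):
    recon = F.l1_loss(sr_img, hr_img)                       # keep visual fidelity for face matching
    probs = F.softmax(gender_logits, dim=-1)
    uniform = torch.full_like(probs, 1.0 / probs.shape[-1])
    # KL(uniform || probs) is minimized when the classifier outputs 50/50, i.e. is maximally confused.
    confusion = F.kl_div(probs.log(), uniform, reduction="batchmean")
    return recon + lam * confusion

loss = privacy_sr_loss(torch.rand(2, 3, 128, 128), torch.rand(2, 3, 128, 128), torch.randn(2, 2))
```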
Citations: 0