
Latest Articles from the Journal of Visual Communication and Image Representation

Human-in-the-loop dual-branch architecture for image super-resolution
IF 3.1 | Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-01-17 | DOI: 10.1016/j.jvcir.2026.104726
Suraj Neelakantan, Martin Längkvist, Amy Loutfi
Single-image super-resolution aims to recover high-frequency detail from a single low-resolution image, but practical applications often require balancing distortion against perceptual quality. Existing methods typically produce a single fixed reconstruction and offer limited test-time control over this trade-off. This paper presents DR-SCAN, a dual-branch deep residual network for single-image super-resolution in which weights can be assigned to either branch at inference time to dynamically steer their respective contributions to the reconstructed output. An interactive interface enables users to re-weight the shallow and deep branches at inference or run a one-click LPIPS search, navigating the distortion–perception trade-off without retraining the model. Ablation experiments confirm that both the second branch and the channel–spatial attention used within the residual blocks are essential for better reconstruction, while the interactive tuning routine demonstrates the practical value of post-hoc branch fusion.
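The paper's own code is not included on this page; as a minimal sketch of the test-time branch re-weighting idea only (the module sizes and names below are hypothetical, not DR-SCAN's actual architecture), a PyTorch example might look like this:

```python
import torch
import torch.nn as nn

class DualBranchSR(nn.Module):
    """Toy dual-branch SR net: a shallow branch and a deeper residual branch
    whose outputs are fused with user-controlled weights at inference."""
    def __init__(self, channels=64, scale=2):
        super().__init__()
        self.head = nn.Conv2d(3, channels, 3, padding=1)
        # Shallow branch: a single conv block.
        self.shallow = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True))
        # Deep branch: a few stacked conv blocks.
        self.deep = nn.Sequential(*[
            nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1),
                          nn.ReLU(inplace=True),
                          nn.Conv2d(channels, channels, 3, padding=1))
            for _ in range(4)])
        self.upsample = nn.Sequential(
            nn.Conv2d(channels, 3 * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale))

    def forward(self, x, w_shallow=0.5, w_deep=0.5):
        feat = self.head(x)
        s = self.shallow(feat)
        d = self.deep(feat) + feat              # residual deep branch
        fused = w_shallow * s + w_deep * d      # test-time steerable fusion
        return self.upsample(fused)

if __name__ == "__main__":
    lr = torch.rand(1, 3, 32, 32)
    model = DualBranchSR().eval()
    with torch.no_grad():
        # Sweep the branch weights to trade distortion against perception.
        for w in (0.2, 0.5, 0.8):
            print(w, model(lr, w_shallow=w, w_deep=1.0 - w).shape)
```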
Citations: 0
Pedestrian trajectory prediction using multi-cue transformer
IF 3.1 | Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-01-15 | DOI: 10.1016/j.jvcir.2026.104723
Yanlong Tian, Rui Zhai, Xiaoting Fan, Qi Xue, Zhong Zhang, Xinshan Zhu
Pedestrian trajectory prediction is a challenging problem because future trajectories are influenced by the surrounding environment and constrained by common-sense rules. Existing trajectory prediction methods typically consider only one kind of cue (social-aware, environment-aware, or goal-conditioned) to model interactions with the trajectory information, which results in insufficient interaction modeling. In this article, we propose an innovative Transformer network named Multi-cue Transformer (McTrans) for pedestrian trajectory prediction, in which we design the Hierarchical Cross-Attention (HCA) module to learn the goal–social–environment interactions between the trajectory information of pedestrians and the three kinds of cues from the perspectives of temporal and spatial dependencies. Furthermore, to reasonably exploit the guidance of the goal information, we propose the Gradual Goal-guided Loss (GGLoss), which gradually increases the weight of the coordinate difference between the predicted goal and the ground-truth goal as the time step increases. We conduct extensive experiments on three public datasets, i.e., SDD, inD, and ETH/UCY. The experimental results demonstrate that the proposed McTrans is superior to other state-of-the-art methods.
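The exact form of GGLoss is not given here; the sketch below is one plausible reading of "weights that grow with the time step" (the linear schedule and tensor shapes are assumptions, not the authors' definition):

```python
import torch

def gradual_goal_guided_loss(pred_goals, gt_goal):
    """Hypothetical sketch of a gradually goal-guided loss.

    pred_goals: (B, T, 2) goal estimates produced at each decoding step.
    gt_goal:    (B, 2) ground-truth final position.
    The per-step L2 error is weighted by a factor that grows linearly with
    the time step, so later predictions are penalised more heavily.
    """
    B, T, _ = pred_goals.shape
    # Linearly increasing weights, normalised to sum to 1.
    weights = torch.arange(1, T + 1, dtype=pred_goals.dtype,
                           device=pred_goals.device)
    weights = weights / weights.sum()
    err = torch.linalg.norm(pred_goals - gt_goal[:, None, :], dim=-1)  # (B, T)
    return (weights * err).mean(dim=0).sum()

if __name__ == "__main__":
    pred = torch.rand(4, 12, 2, requires_grad=True)
    gt = torch.rand(4, 2)
    loss = gradual_goal_guided_loss(pred, gt)
    loss.backward()
    print(float(loss))
```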
Citations: 0
All-in-focus image fusion using graph wavelet transform for multi-modal light field
IF 3.1 | Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-01-15 | DOI: 10.1016/j.jvcir.2026.104722
Jinjin Li, Baiyuan Qing, Kun Zhang, Xinyuan Yang, Xiangui Yin, Yichang Liu
The multi-modal nature of light field imaging produces a refocused image stack, but each image suffers from a limited depth-of-field. All-in-focus (AIF) fusion aims to create a single, sharp image from this stack, a task challenged by irregular depth boundaries and degraded spatial resolution. We propose a novel fusion framework based on the graph wavelet transform (GWT). Unlike traditional methods, our approach adaptively models pixel correlations to better handle irregular boundaries while preserving details. The method decomposes each image using a fast GWT. Low-frequency components are fused via a multi-layer strategy, while high-frequency components are merged using an integrated weighting scheme enhanced by guided filtering. Finally, the AIF image is reconstructed via an inverse GWT. Experimental results on light field datasets demonstrate superior performance over existing methods, achieving average EI, Q_Y, and SSIM scores of 44.939, 0.9941, and 0.8719, respectively, showing its potential for practical applications.
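The graph wavelet transform used in the paper is not a standard library routine; as a rough stand-in that follows the same decompose-fuse-reconstruct pattern, the sketch below uses an ordinary 2-D discrete wavelet transform from PyWavelets, averages the low-frequency bands, and selects high-frequency coefficients by maximum magnitude (the paper's guided-filter weighting is omitted):

```python
import numpy as np
import pywt

def wavelet_all_in_focus(stack, wavelet="db2", level=2):
    """Fuse a focal stack (N, H, W grayscale, float) into one all-in-focus image.

    Simplified stand-in for the paper's graph-wavelet scheme: decompose each
    slice, average the approximation bands, and keep the per-pixel detail
    coefficient with the largest magnitude across the stack.
    """
    coeffs = [pywt.wavedec2(img, wavelet, level=level) for img in stack]

    # Approximation (low-frequency) band: simple mean over the stack.
    fused = [np.mean([c[0] for c in coeffs], axis=0)]

    # Detail (high-frequency) bands: max-absolute selection per coefficient.
    for lvl in range(1, level + 1):
        fused_lvl = []
        for band in range(3):  # horizontal, vertical, diagonal details
            bands = np.stack([c[lvl][band] for c in coeffs], axis=0)
            idx = np.abs(bands).argmax(axis=0)
            fused_lvl.append(np.take_along_axis(bands, idx[None], axis=0)[0])
        fused.append(tuple(fused_lvl))

    return pywt.waverec2(fused, wavelet)

if __name__ == "__main__":
    stack = np.random.rand(5, 128, 128)
    aif = wavelet_all_in_focus(stack)
    print(aif.shape)
```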
Citations: 0
Towards fast and effective low-light image enhancement via adaptive Gamma correction and detail refinement
IF 3.1 | Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-01-15 | DOI: 10.1016/j.jvcir.2026.104724
Shaoping Xu, Qiyu Chen, Liang Peng, Hanyang Hu, Wuyong Tao
Over the past decade, deep neural networks have significantly advanced low-light image enhancement (LLIE), achieving marked improvements in perceptual quality and robustness. However, these gains are increasingly accompanied by architectural complexity and computational inefficiency, widening the gap between enhancement performance and real-time applicability. This trade-off poses a critical challenge for time-sensitive scenarios requiring both high visual quality and efficient execution. To resolve the efficiency–quality trade-off in LLIE, we propose an ultra-lightweight framework comprising two computationally efficient modules: the adaptive Gamma correction module (AGCM) and the nonlinear refinement module (NRM). Specifically, the AGCM employs lightweight convolutions to generate spatially adaptive, pixel-wise Gamma maps that simultaneously mitigate global underexposure and suppress highlight overexposure, thereby preserving scene-specific luminance characteristics and ensuring visually natural global enhancement. Subsequently, the NRM employs two nonlinear transformation layers that logarithmically compress highlights and adaptively stretch shadows, effectively restoring local details without semantic distortion. Moreover, the first nonlinear transformation layer within the NRM incorporates residual connections to facilitate the capture and exploitation of subtle image features. Finally, the AGCM and NRM modules are jointly optimized using a hybrid loss function combining a reference-based fidelity term and no-reference perceptual metrics (i.e., local contrast, colorfulness, and exposure balance). Extensive experiments demonstrate that the proposed LLIE framework delivers performance comparable to state-of-the-art (SOTA) algorithms, while requiring only 8K parameters, achieving an optimal trade-off between enhancement quality and computational efficiency. This performance stems from our two-stage ultra-lightweight design: global illumination correction via pixel-adaptive Gamma adjustment, followed by detail-aware nonlinear refinement, all realized within a minimally parameterized architecture. As a result, the framework is uniquely suited for real-time deployment in resource-constrained environments.
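The AGCM and NRM internals are not reproduced here; the sketch below only illustrates the pixel-adaptive gamma idea with a tiny, untrained conv head (the layer sizes and gamma range are arbitrary choices, not the authors' design):

```python
import torch
import torch.nn as nn

class PixelAdaptiveGamma(nn.Module):
    """Minimal sketch of pixel-wise adaptive gamma correction: a lightweight
    conv head predicts a bounded gamma map that is applied per pixel."""
    def __init__(self, gamma_min=0.3, gamma_max=3.0):
        super().__init__()
        self.gamma_min, self.gamma_max = gamma_min, gamma_max
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid())  # output in (0, 1)

    def forward(self, x):
        # x is a low-light image in [0, 1]; map the sigmoid output into the
        # allowed gamma range and broadcast the single-channel map over RGB.
        g = self.gamma_min + (self.gamma_max - self.gamma_min) * self.net(x)
        return x.clamp(min=1e-6) ** g

if __name__ == "__main__":
    img = torch.rand(1, 3, 64, 64) * 0.3   # simulate underexposure
    enhanced = PixelAdaptiveGamma()(img)   # untrained, output is arbitrary
    print(img.mean().item(), enhanced.mean().item())
```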
Citations: 0
Multi-scale Spatial Frequency Interaction Variance Perception Model for Deepfake Face Detection
IF 3.1 | Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-01-14 | DOI: 10.1016/j.jvcir.2026.104719
Yihang Wang, Shouxin Liu, Xudong Chen, Seok Tae Kim, Xiaowei Li
The negative effects of deepfake technology have attracted increasing attention and become a prominent social issue. Existing detection approaches typically refine conventional network architectures to uncover subtle manipulation traces, yet most focus exclusively on either spatial- or frequency-domain cues, overlooking their interaction. To address the limitations in existing deepfake detection methods, we present an innovative Multi-Scale Spatial-Frequency Variance-sensing (MSFV) model. This model effectively combines spatial and frequency information by utilizing iterative, variance-guided self-attention mechanisms. By integrating these two domains, the MSFV model enhances detection capabilities and improves the identification of subtle manipulations present in deepfake images. A dedicated high-frequency separation module further enhances the extraction of forgery indicators from the high-frequency components of manipulated images. Extensive experiments demonstrate that MSFV achieves classification accuracies of 98.95 % on the DFDC dataset and 97.92 % on the FaceForensics++ dataset, confirming its strong detection capability, generalization, and robustness compared with existing methods.
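The MSFV network itself is not shown on this page; as a small illustration of the high-frequency separation step mentioned in the abstract, the sketch below extracts a high-frequency residual with an FFT low-pass mask (the circular mask and cut-off radius are assumptions, not the paper's dedicated module):

```python
import torch

def high_frequency_residual(img, radius=0.1):
    """Split an image into low/high-frequency parts with a circular FFT mask.

    img: (B, C, H, W) tensor in [0, 1]; radius is the normalised cut-off.
    Returns the high-frequency component (img minus its low-pass content).
    """
    B, C, H, W = img.shape
    spec = torch.fft.fftshift(torch.fft.fft2(img), dim=(-2, -1))
    yy = torch.linspace(-0.5, 0.5, H).view(H, 1)
    xx = torch.linspace(-0.5, 0.5, W).view(1, W)
    lowpass = ((yy ** 2 + xx ** 2).sqrt() <= radius).to(spec.dtype)
    low = torch.fft.ifft2(torch.fft.ifftshift(spec * lowpass, dim=(-2, -1))).real
    return img - low  # high-frequency residual, where forgery traces often live

if __name__ == "__main__":
    x = torch.rand(2, 3, 128, 128)
    hf = high_frequency_residual(x)
    print(hf.shape, hf.abs().mean().item())
```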
Citations: 0
CKCR: Context-aware knowledge construction and retrieval for knowledge-based visual question answering
IF 3.1 | Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-01-13 | DOI: 10.1016/j.jvcir.2026.104711
Fengjuan Wang, Jiayi Liu, Ruonan Zhang, Zhengxue Li, Feng Zhang, Gaoyun An
Knowledge-based Visual Question Answering (KB-VQA) requires models to integrate visual content with external knowledge to answer questions, which is crucial for building intelligent systems capable of real-world understanding. However, effectively incorporating external knowledge into visual reasoning faces three major challenges: the incompleteness of external knowledge bases leaves many specific visual scenarios without relevant knowledge, semantic gaps between retrieved textual knowledge and visual content make alignment difficult, and effective mechanisms for fusing heterogeneous knowledge sources are lacking. Multimodal Large Language Models (MLLMs) have demonstrated strong performance in visual understanding tasks, but they face notable challenges in KB-VQA, particularly in knowledge utilization efficiency and semantic alignment, which seriously limits reasoning depth and robustness. To address these problems, a Context-aware Knowledge Construction and Retrieval (CKCR) method is proposed for knowledge-based VQA, comprising the following three modules. The multi-granularity knowledge retrieval module constructs a joint query vector based on the multi-dimensional embedding representation of images and questions, accurately obtaining explicit knowledge that is highly matched with the context. The vision-to-knowledge generation module supplements fine-grained semantic clues from the perspective of visual content, generating visual knowledge closely related to the image and making up for the expression limitations of general knowledge. To achieve deep alignment of knowledge representations, the knowledge adaptive learning module accurately embeds multi-source knowledge into the semantic space of the MLLM by introducing a learnable knowledge mapping mechanism. Experimental evaluation on the OK-VQA and A-OKVQA datasets shows that CKCR outperforms state-of-the-art methods of the same scale. Ablation experiments and visualization analysis demonstrate the superiority of CKCR in its perception of fine-grained visual information and its ability to align knowledge semantics. Our code will be released on GitHub: https://github.com/fjwang3/CKCR.
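CKCR's retrieval module is not reproduced here; the sketch below only illustrates the general idea of a joint image–question query retrieving knowledge by cosine similarity (the additive query construction and random embeddings are placeholders, not the paper's method):

```python
import torch
import torch.nn.functional as F

def retrieve_knowledge(img_emb, q_emb, kb_embs, kb_texts, k=3):
    """Retrieve top-k knowledge snippets with a joint image-question query.

    img_emb, q_emb: (D,) embeddings from any vision/text encoder.
    kb_embs: (N, D) precomputed embeddings of knowledge snippets.
    """
    query = F.normalize(img_emb + q_emb, dim=0)    # simple joint query vector
    sims = F.normalize(kb_embs, dim=1) @ query     # cosine similarity per snippet
    top = sims.topk(k).indices
    return [(kb_texts[i], float(sims[i])) for i in top]

if __name__ == "__main__":
    D, N = 64, 100
    kb_texts = [f"fact #{i}" for i in range(N)]
    kb_embs = torch.randn(N, D)
    img_emb, q_emb = torch.randn(D), torch.randn(D)
    for text, score in retrieve_knowledge(img_emb, q_emb, kb_embs, kb_texts):
        print(f"{score:.3f}  {text}")
```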
Citations: 0
Semantic Response GAN (SR-GAN) for embroidery pattern generation
IF 3.1 | Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-01-08 | DOI: 10.1016/j.jvcir.2026.104707
Shaofan Chen
High-resolution, detail-rich image generation models are essential for text-driven embroidery pattern synthesis. In this paper, the Semantic Response Generative Adversarial Network (SR-GAN) is used for embroidery image synthesis. It generates higher-quality images and improves text-image alignment. The model integrates word-level text embeddings into the image latent space through a cross-attention mechanism and a confidence-aware fusion scheme. In this way, word-level semantic features are effectively injected into hidden image features. The Semantic Perception Module is also refined by replacing standard convolutions with depthwise separable convolutions, which reduces the number of model parameters. In addition, the Deep Attention Multimodal Similarity Model directly scores word-pixel correspondences to compute fine-grained matching loss. It injects embroidery-domain word embeddings into the text encoder for joint training and further tightens the alignment between generated images and text. Experimental results show that the proposed method achieves an FID of 13.84 and an IS of 5.51.
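The SR-GAN layers are not specified on this page; as a generic sketch of cross-attention that injects word-level text embeddings into image feature maps (the dimensions and residual fusion are illustrative assumptions, not the paper's confidence-aware scheme):

```python
import torch
import torch.nn as nn

class WordToImageCrossAttention(nn.Module):
    """Sketch: image feature tokens attend over word embeddings
    (standard scaled dot-product cross-attention)."""
    def __init__(self, img_dim=128, word_dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim=img_dim, num_heads=heads,
                                          kdim=word_dim, vdim=word_dim,
                                          batch_first=True)

    def forward(self, img_feat, word_emb):
        # img_feat: (B, C, H, W) -> H*W query tokens; word_emb: (B, L, word_dim)
        B, C, H, W = img_feat.shape
        tokens = img_feat.flatten(2).transpose(1, 2)       # (B, H*W, C)
        fused, _ = self.attn(tokens, word_emb, word_emb)   # queries are image tokens
        fused = fused + tokens                             # residual injection
        return fused.transpose(1, 2).reshape(B, C, H, W)

if __name__ == "__main__":
    img = torch.rand(2, 128, 16, 16)
    words = torch.rand(2, 12, 256)
    print(WordToImageCrossAttention()(img, words).shape)
```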
Citations: 0
Image copy-move forgery detection using three-stage matching with constraints
IF 3.1 | Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-01-08 | DOI: 10.1016/j.jvcir.2026.104709
Panpan Niu, Hongxin Wang, Xingqi Wang
Copy-move forgery is one of the most commonly used manipulations for tampering with digital images. In recent years, keypoint-based detection methods have achieved encouraging results, but several shortcomings remain. First, an inability to generate sufficient keypoints in small or smooth regions, causing detection failure. Second, a lack of robust and discriminative descriptors for image keypoints, resulting in false matches. Third, the high computational cost of keypoint matching. To tackle these challenges, we present a new keypoint-based image copy-move forgery detection (CMFD) method using three-stage matching with constraints. In keypoint extraction, we extract sufficient SIFT keypoints by adaptively enlarging the image and enhancing image contrast. In feature description, we adopt the combination of complex and real values of Polar Harmonic Fourier Moments (PHFMs) as the PHFM-based hybrid feature vector of each keypoint, which substantially enhances feature discriminability. In feature matching, we present a fast stratification approach based on SLIC and the locally optimal orientation pattern (LOOP), and utilize the stratification results as matching constraints, which reduces the search space. A high-precision three-stage matching strategy based on amplitude, phase, and distance information is then executed. In post-processing, the location of the tampered regions is finally determined by one-step filtering and one-step clustering. Extensive experimental results show the superiority of the proposed method over existing representative CMFD techniques.
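The three-stage matching pipeline itself is not reproduced here; the sketch below shows only a generic keypoint-based copy-move check with OpenCV SIFT, a ratio test, and a minimum-distance constraint (the thresholds and input file name are hypothetical):

```python
import cv2
import numpy as np

def copy_move_candidates(gray, ratio=0.6, min_dist=40):
    """Generic keypoint-based copy-move check (not the paper's full pipeline):
    match SIFT descriptors within the same image, keep pairs that pass a
    Lowe-style ratio test and are far enough apart to rule out self-matches.
    """
    sift = cv2.SIFT_create()
    kps, desc = sift.detectAndCompute(gray, None)
    if desc is None or len(kps) < 3:
        return []
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(desc, desc, k=3)  # first hit is typically the point itself
    pairs = []
    for m in matches:
        if len(m) < 3:
            continue
        _, best, second = m          # skip the trivial self-match
        if best.distance < ratio * second.distance:
            p1 = np.array(kps[best.queryIdx].pt)
            p2 = np.array(kps[best.trainIdx].pt)
            if np.linalg.norm(p1 - p2) > min_dist:
                pairs.append((tuple(p1), tuple(p2)))
    return pairs

if __name__ == "__main__":
    img = cv2.imread("suspect.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file
    if img is not None:
        print(len(copy_move_candidates(img)), "candidate matched pairs")
```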
Citations: 0
LaDeL: Lane detection via multimodal large language model with visual instruction tuning
IF 3.1 | Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-01-06 | DOI: 10.1016/j.jvcir.2025.104704
Yun Zhang, Xin Cheng, Zhou Zhou, Jingmei Zhou, Tong Yang
Lane detection plays a fundamental role in autonomous driving by providing geometric and semantic guidance for robust localization and planning. Empirical studies have shown that reliable lane perception can reduce vehicle localization error by up to 15% and improve trajectory stability by more than 10%, underscoring its critical importance in safety-critical navigation systems. Visual degradations such as occlusions, worn paint, and illumination shifts result in missing or ambiguous lane boundaries, reducing the reliability of appearance-only methods and motivating scene-aware reasoning. Inspired by the human ability to jointly interpret scene context and road structure, this work presents LaDeL (Lane Detection with Large Language Models), which, to our knowledge, is the first framework to leverage multimodal large language models for lane detection through visual-instruction reasoning. LaDeL reformulates lane perception as a multimodal question-answering task that performs lane localization, lane counting, and scene captioning in a unified manner. We introduce lane-specific tokens to enable precise numerical coordinate prediction and construct a diverse instruction-tuning corpus combining lane queries, lane-count prompts, and scene descriptions. Experiments demonstrate that LaDeL achieves state-of-the-art performance, including an F1-score of 82.35% on CULane and 98.23% on TuSimple, outperforming previous methods. Although LaDeL requires greater computational resources than conventional lane detection networks, it provides new insight into integrating geometric perception with high-level reasoning. Beyond lane detection, this formulation opens opportunities for language-guided perception and reasoning in autonomous driving, including road-scene analysis, interactive driving assistants, and language-aware perception.
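LaDeL's lane-specific token format is not public on this page; the sketch below is only a toy illustration of casting lane detection as visual question answering, with an invented <lane> tag format for the prompt and answer parsing:

```python
import re

def build_lane_prompt(num_lanes_hint=None):
    """Format a hypothetical lane-detection instruction for a multimodal LLM."""
    q = ("How many lane markings are visible, and where are they? "
         "Answer each lane as <lane>x1,y1; x2,y2; ...</lane> in pixel coordinates.")
    if num_lanes_hint is not None:
        q += f" (Expect about {num_lanes_hint} lanes.)"
    return q

def parse_lane_answer(text):
    """Extract per-lane coordinate lists from <lane>...</lane> spans."""
    lanes = []
    for span in re.findall(r"<lane>(.*?)</lane>", text, flags=re.S):
        pts = []
        for pair in span.split(";"):
            nums = re.findall(r"-?\d+(?:\.\d+)?", pair)
            if len(nums) == 2:
                pts.append((float(nums[0]), float(nums[1])))
        if pts:
            lanes.append(pts)
    return lanes

if __name__ == "__main__":
    answer = ("Two lanes. <lane>120,710; 160,600; 210,480</lane>"
              "<lane>520,715; 500,605; 470,490</lane>")
    print(build_lane_prompt(2))
    print(parse_lane_answer(answer))
```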
Citations: 0
Infrared small UAV target detection via depthwise separable residual dense attention network
IF 3.1 | Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-01-06 | DOI: 10.1016/j.jvcir.2025.104703
Keyang Cheng, Nan Chen, Chang Liu, Yue Yu, Hao Zhou, Zhe Wang, Changsheng Peng
Unmanned aerial vehicles (UAVs) are extensively utilized in both military and civilian sectors, offering benefits and posing challenges. Traditional infrared small target detection techniques often suffer from high false alarm rates and low accuracy. To overcome these issues, we propose the Depthwise Separable Residual Dense Attention Network (DSRDANet), which redefines the detection task as a residual image prediction problem. This approach features an Adaptive Adjustment Segmentation Module (AASM) that uses depthwise separable residual dense blocks to extract detailed hierarchical features during encoding. Additionally, multi-scale feature fusion blocks are included to thoroughly aggregate multi-scale features and enhance residual image reconstruction during decoding. Furthermore, the Channel Attention Modulation Module (CAMM) is designed to model channel interdependencies and spatial encoding, optimizing the outputs from AASM by adjusting feature importance distribution across channels, ensuring comprehensive target attention. Experimental results on datasets for infrared small UAV target detection and tracking in various backgrounds validate our approach. Compared to state-of-the-art methods, our technique significantly enhances performance, improving the average F1 score by nearly 0.1, the IOU by 0.12, and the CG by 0.66.
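DSRDANet's blocks are not reproduced here; the sketch below shows only a generic depthwise-separable residual block of the kind the abstract names (the channel count and normalization are illustrative choices):

```python
import torch
import torch.nn as nn

class DepthwiseSeparableResidualBlock(nn.Module):
    """Depthwise 3x3 conv + pointwise 1x1 conv, wrapped in a residual connection."""
    def __init__(self, channels=32):
        super().__init__()
        self.depthwise = nn.Conv2d(channels, channels, 3, padding=1,
                                   groups=channels, bias=False)
        self.pointwise = nn.Conv2d(channels, channels, 1, bias=False)
        self.bn = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.act(self.bn(self.pointwise(self.depthwise(x))))
        return out + x  # residual connection preserves the input signal

if __name__ == "__main__":
    x = torch.rand(1, 32, 64, 64)
    block = DepthwiseSeparableResidualBlock()
    print(block(x).shape, sum(p.numel() for p in block.parameters()))
```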
Citations: 0