
Image and Vision Computing: Latest Publications

OIDSty: One-shot identity-preserving face stylization
IF 4.2, CAS Tier 3 (Computer Science), Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-01-07. DOI: 10.1016/j.imavis.2026.105899
Kairui Wang, Xinying Liu, Di Zhao, Xuelei Geng, Tian Xian, Yonghao Chang
In recent years, image generation techniques based on diffusion models have made significant progress in the field of facial stylization. However, existing methods still face challenges in achieving high identity fidelity while maintaining strong stylistic expressiveness, particularly in balancing the geometric deformations introduced by stylization with the preservation of fine-grained facial details (such as facial features and poses). To address this issue, this paper proposes a novel single-sample facial stylization system, OIDSty. Its core innovation lies in decoupling the identity preservation and style injection tasks across distinct attention layers, achieved primarily through two key designs: (1) a High-Fidelity Identity Module, which innovatively combines strong semantic conditions and weak spatial conditions to guide the cross-attention layers, enabling precise retention of core identity and facial layout features while permitting stylized geometric deformations; (2) a DINO-Style Texture Guidance Module, which introduces a style guidance loss into the self-attention layers to compute the feature difference between the ideal stylized output and the current output. This loss is integrated into the denoising sampling process, dynamically calibrating latent features through gradients to ensure efficient and accurate transfer of stylized textures onto the target image. Extensive experimental results demonstrate that OIDSty generates high-fidelity, stylistically distinct images across multiple styles. Compared to existing state-of-the-art methods, our method exhibits significant advantages across all objective and subjective evaluation metrics without requiring complex parameter tuning.
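To make the sampling-time guidance above concrete, the following is a minimal, hypothetical sketch of gradient-based latent calibration with a style guidance loss inside one denoising step; the `unet`, `scheduler_step`, and `feat_extractor` callables and the guidance scale are placeholders, not OIDSty's actual interfaces.

```python
import torch
import torch.nn.functional as F

def guided_denoise_step(latent, t, unet, scheduler_step, feat_extractor,
                        style_feats, guidance_scale=0.1):
    """One reverse-diffusion step nudged toward a style target.

    unet(latent, t) -> predicted noise; scheduler_step(noise, t, latent) -> previous
    latent; feat_extractor -> frozen encoder (e.g. DINO) producing style features.
    All three are placeholder callables standing in for the real components.
    """
    latent = latent.detach().requires_grad_(True)

    noise_pred = unet(latent, t)                        # standard denoising prediction
    prev_latent = scheduler_step(noise_pred, t, latent)

    cur_feats = feat_extractor(prev_latent)             # features of the current output
    style_loss = F.mse_loss(cur_feats, style_feats)     # gap to the ideal stylized output

    grad = torch.autograd.grad(style_loss, latent)[0]   # calibrate the latent via gradients
    return (prev_latent - guidance_scale * grad).detach()
```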
Citations: 0
LoRA-empowered efficient diffusion for accurate fine-grained detail rendering in real-image cartoonization
IF 4.2, CAS Tier 3 (Computer Science), Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-01-06. DOI: 10.1016/j.imavis.2026.105898
Mingjin Liu, Yien Li
Recent advances in generative models have enabled diverse applications, from text-to-image synthesis to artistic content creation. However, generating high-quality, domain-specific content — particularly for culturally unique styles like Chinese opera — remains challenging due to limited generalization on long-tail data and the high cost of fine-tuning with specialized datasets. To address these limitations, we propose DreamOpera, a novel framework for transforming real-world Chinese opera character photographs into stylized cartoon representations. Our approach leverages a two-step process: (1) feature extraction using a pre-trained encoder to capture key visual attributes (e.g., clothing, facial features), and (2) domain transformation via a LoRA-fine-tuned diffusion model trained on a small, unpaired dataset of cartoon-style opera images. This strategy bypasses the need for costly paired data while preserving fine-grained details. Experiments demonstrate that DreamOpera outperforms existing methods in generating high-fidelity, culturally nuanced artwork, offering practical value for cultural dissemination and digital art.
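As a rough illustration of the LoRA fine-tuning idea mentioned above (not the DreamOpera code), the sketch below wraps a frozen linear projection with a trainable low-rank adapter of the kind typically injected into diffusion attention projections; the rank, scaling factor, and layer sizes are arbitrary assumptions.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():        # freeze the pre-trained weights
            p.requires_grad = False
        self.down = nn.Linear(base.in_features, rank, bias=False)
        self.up = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.up.weight)           # start as an identity update
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.up(self.down(x))

# Usage: wrap an attention projection and train only the adapter parameters.
proj = LoRALinear(nn.Linear(768, 768), rank=8)
out = proj(torch.randn(1, 77, 768))
```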
Citations: 0
Long-FAS: Cross-domain face anti-spoofing with long text guidance
IF 4.2, CAS Tier 3 (Computer Science), Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-01-06. DOI: 10.1016/j.imavis.2026.105901
Jianwen Zhang, Jianfeng Zhang, Dedong Yang, Rongtao Li, Ziyang Li
Recent studies have demonstrated that utilizing natural language as a supervisory signal can enhance face anti-spoofing (FAS) performance; however, these methods still fall short in fully addressing long-text inputs and fine-grained information. To mitigate these limitations, we leverage MiniGPT-4 to generate detailed long-form textual descriptions of facial features for input images, and propose a novel framework, Long-FAS, which extracts textual and visual information through a dual-branch architecture. Specifically, we incorporate positional encoding for knowledge retention to enable the learning of effective feature representations from long texts, and employ principal component analysis (PCA) matching to capture essential attribute information while prioritizing critical attributes. Furthermore, matching visual and textual features at both coarse and fine granularities enhances the model’s ability to effectively handle both long and short texts, thereby empowering it to learn robust discriminative cues from facial images. Extensive experiments demonstrate that our approach significantly outperforms state-of-the-art counterparts.
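The sketch below illustrates one plausible form of PCA matching on long-text embeddings, assuming feature tensors of shape (N, D); the function name, dimensionality, and the use of cosine scoring are assumptions for illustration, not Long-FAS's actual module.

```python
import torch
import torch.nn.functional as F

def pca_match(text_feats: torch.Tensor, visual_feats: torch.Tensor, k: int = 16):
    """Project both modalities onto the top-k principal directions of the
    text embeddings and score their agreement with cosine similarity."""
    mean = text_feats.mean(dim=0, keepdim=True)
    _, _, v = torch.pca_lowrank(text_feats - mean, q=k)   # v: (D, k) principal directions
    text_proj = (text_feats - mean) @ v                   # (N, k) essential attributes
    vis_proj = (visual_feats - mean) @ v
    return F.cosine_similarity(text_proj, vis_proj, dim=-1)

scores = pca_match(torch.randn(32, 512), torch.randn(32, 512))
```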
Citations: 0
Distributed quantum model learning for traffic density estimation
IF 4.2, CAS Tier 3 (Computer Science), Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-01-06. DOI: 10.1016/j.imavis.2026.105900
Kewen Wang, Bin Wang, Wenzhe Zhai, Jing-an Cheng
In Intelligent Autonomous Transport Systems (IATS), the integration of lightweight machine learning techniques enables the deployment of real-time and efficient AI models on edge devices. A fundamental aspect is to estimate traffic density, which is crucial for efficient intelligent traffic control. The rapid progress in deep neural networks (DNNs) has led to a notable improvement in the accuracy of traffic density estimation. However, two main issues remain unsolved. Firstly, current DNN models involve numerous parameters and consume large computing resources, and thus their performance degrades when detecting multi-scale vehicle targets. Secondly, growing privacy concerns have made individuals increasingly unwilling to share their data for model training, which leads to data isolation challenges. To address the problems above, we introduce the Distributed Quantum Model Learning (DQML) model for traffic density estimation. It combines an Efficient Quantum-driven Adaptive (EQA) module to capture multi-scale information using quantum states. In addition, we propose a distributed learning strategy that trains multiple client models with local data and aggregates them via a global parameter server. This strategy ensures privacy protection while offering a significant improvement in estimation performance compared to models trained on limited and isolated data. We evaluated the proposed model on six key benchmarks for vehicle and crowd density analysis, and comprehensive experiments demonstrated that it surpasses other state-of-the-art models in both accuracy and efficiency.
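A minimal sketch of the distributed learning strategy described above, in the style of federated averaging: clients train on local data and a parameter server averages their weights. The model, loss function, and optimizer here are generic placeholders rather than the DQML architecture.

```python
import copy
import torch
import torch.nn as nn

def local_update(model, loader, epochs=1, lr=1e-3):
    """Train a private copy of the global model on one client's local data."""
    model = copy.deepcopy(model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model.state_dict()

def server_aggregate(global_model, client_states):
    """Parameter server: average client weights into the global model."""
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        avg[key] = torch.stack([s[key].float() for s in client_states]).mean(0)
    global_model.load_state_dict(avg)
    return global_model
```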
Citations: 0
Integrating spatial features and dynamically learned temporal features via contrastive learning for video temporal grounding in LLM
IF 4.2, CAS Tier 3 (Computer Science), Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-01-05. DOI: 10.1016/j.imavis.2026.105895
Peifu Wang, Yixiong Liang, Yigang Cen, Lihui Cen, Zhe Qu, Jingling Liu, Shichao Kan
Video temporal grounding (VTG) is crucial for fine-grained temporal understanding in vision-language tasks. While large vision-language models (LVLMs) have shown promising results through image–text alignment and video-instruction tuning, they represent videos as static sequences of sampled frames processed by image-based vision encoders, inherently limiting their capacity to capture dynamic and sequential information effectively and leading to suboptimal performance. To address this, we propose integrating spatial features with dynamically learned temporal features using contrastive learning. Temporal features are dynamically extracted by learning a set of temporal query tokens, which prompt temporal feature extraction via contrastive alignment between video sequences and their corresponding descriptions. On the other hand, VTG methods based on large language models are typically supervised solely through the language modeling loss, which is insufficient for effectively guiding such tasks. Thus, the VTG model in our method is trained with a temporal localization loss that combines mean squared error (MSE), intersection-over-union (IoU) of the temporal range, and cosine similarity of temporal embeddings, and is designed to be applicable to large language models. Our experiments on benchmark datasets demonstrate the effectiveness of the proposed method.
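The temporal localization loss is the most concrete element of the description above; the sketch below combines MSE, temporal IoU, and cosine similarity of embeddings as stated, while the loss weights and tensor shapes are assumptions.

```python
import torch
import torch.nn.functional as F

def temporal_localization_loss(pred_span, gt_span, pred_emb, gt_emb, w=(1.0, 1.0, 0.5)):
    """pred_span, gt_span: (B, 2) tensors of (start, end) times;
    pred_emb, gt_emb: (B, D) temporal embeddings."""
    mse = F.mse_loss(pred_span, gt_span)

    # Temporal IoU between predicted and ground-truth spans.
    inter = (torch.min(pred_span[:, 1], gt_span[:, 1])
             - torch.max(pred_span[:, 0], gt_span[:, 0])).clamp(min=0)
    union = (torch.max(pred_span[:, 1], gt_span[:, 1])
             - torch.min(pred_span[:, 0], gt_span[:, 0])).clamp(min=1e-6)
    iou_loss = 1.0 - (inter / union).mean()

    cos_loss = 1.0 - F.cosine_similarity(pred_emb, gt_emb, dim=-1).mean()
    return w[0] * mse + w[1] * iou_loss + w[2] * cos_loss
```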
Citations: 0
DRM-YOLO: A YOLOv11-based structural optimization method for small object detection in UAV aerial imagery
IF 4.2, CAS Tier 3 (Computer Science), Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2025-12-30. DOI: 10.1016/j.imavis.2025.105894
Hongbo Bi, Rui Dai, Fengyang Han, Cong Zhang
With the falling cost of UAVs and advances in automation, drones are increasingly applied in agriculture, inspection, and smart cities. However, small object detection remains difficult due to tiny targets, sparse features, and complex backgrounds. To tackle these challenges, this paper presents an improved small object detection framework for UAV imagery, optimized from the YOLOv11n architecture. First, the proposed MetaDWBlock integrates multi-branch depthwise separable convolutions with a lightweight MLP, and its hierarchical MetaDWStage enhances contextual and fine-grained feature modeling. Second, the Cross-scale Feature Fusion Module (CFFM) employs the CARAFE upsampling operator for precise fusion of shallow spatial and deep semantic features, improving multi-scale perception. Finally, a scale-, spatial-, and task-aware Dynamic Head with an added P2 branch forms a four-branch detection head, markedly boosting detection accuracy for tiny objects. Experimental results on the VisDrone2019 dataset demonstrate that the proposed DRM-YOLO model significantly outperforms the baseline YOLOv11n in small object detection tasks, achieving a 21.4% improvement in mAP@0.5 and a 13.1% improvement in mAP@0.5:0.95. These results fully validate the effectiveness and practical value of the proposed method in enhancing the accuracy and robustness of small object detection in UAV aerial imagery. The code and results of our method are available at https://github.com/DRdairuiDR/DRM--YOLO.
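The following is a rough sketch of a multi-branch depthwise-separable block with a lightweight MLP, in the spirit of the MetaDWBlock described above; the kernel sizes, normalization choices, and expansion ratio are assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class MetaDWBlockSketch(nn.Module):
    def __init__(self, channels: int, kernel_sizes=(3, 5, 7), mlp_ratio: int = 2):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, k, padding=k // 2, groups=channels)
            for k in kernel_sizes                          # depthwise branches
        ])
        self.pointwise = nn.Conv2d(channels, channels, 1)  # separable 1x1 channel mixing
        hidden = channels * mlp_ratio
        self.mlp = nn.Sequential(                          # lightweight channel MLP
            nn.Conv2d(channels, hidden, 1), nn.GELU(),
            nn.Conv2d(hidden, channels, 1),
        )
        self.norm1 = nn.BatchNorm2d(channels)
        self.norm2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        y = sum(branch(x) for branch in self.branches)     # multi-scale depthwise context
        x = x + self.pointwise(self.norm1(y))              # residual mixing
        return x + self.mlp(self.norm2(x))

feat = MetaDWBlockSketch(64)(torch.randn(1, 64, 80, 80))
```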
Citations: 0
Dual-stage network combining transformer and hybrid convolutions for stereo image super-resolution
IF 4.2, CAS Tier 3 (Computer Science), Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2025-12-29. DOI: 10.1016/j.imavis.2025.105892
Jintao Zeng, Aiwen Jiang, Feiqiang Liu
Stereo image super-resolution aims to recover high-resolution images from given low-resolution left- and right-view images. Its challenges lie in fully extracting features from each view and skillfully integrating information across views. Among current methods, almost all super-resolution models employ a single-stage strategy based on either a transformer or a convolutional neural network (CNN). For highly nonlinear problems, a single-stage network may not achieve ideal performance at acceptable complexity. In this paper, we propose a dual-stage stereo image super-resolution network (DSSRNet) which integrates the complementary advantages of transformers and convolutions. Specifically, we design a cross-stage attention module (CASM) to bridge informative feature transmission between successive stages. Moreover, we utilize Fourier convolutions to efficiently model global and local features, which benefits the restoration of image details and texture. We have compared the proposed DSSRNet with several state-of-the-art methods on public benchmark datasets. The comprehensive experiments demonstrate that DSSRNet can restore clear structural features and richer texture details, achieving leading performance on PSNR, SSIM, and LPIPS metrics with an acceptable computational burden in the stereo image super-resolution field. Related source codes and models will be released on https://github.com/Zjtao-lab/DSSRNet.
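As an illustration of the Fourier convolution idea mentioned above (not DSSRNet's implementation), the sketch below applies a pointwise convolution in the frequency domain to capture global context and adds it back as a residual branch; the channel sizes are chosen arbitrarily.

```python
import torch
import torch.nn as nn

class FourierConv(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.spectral = nn.Sequential(            # operates on stacked real/imag channels
            nn.Conv2d(channels * 2, channels * 2, 1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        b, c, h, w = x.shape
        spec = torch.fft.rfft2(x, norm="ortho")               # (B, C, H, W//2+1), complex
        spec = torch.cat([spec.real, spec.imag], dim=1)       # (B, 2C, H, W//2+1), real
        spec = self.spectral(spec)
        real, imag = spec.chunk(2, dim=1)
        out = torch.fft.irfft2(torch.complex(real, imag), s=(h, w), norm="ortho")
        return x + out                                        # residual global branch

out = FourierConv(32)(torch.randn(1, 32, 64, 64))
```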
Citations: 0
Statistic temporal checking and spatial consistency based 3D size reconstruction of multiple objects from indoor monocular videos
IF 4.2, CAS Tier 3 (Computer Science), Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2025-12-27. DOI: 10.1016/j.imavis.2025.105890
Ziyue Wang, Xina Cheng, Takeshi Ikenaga
Reconstructing accurate 3D sizes of multiple objects from indoor monocular videos has gradually become a significant topic for robotics, smart homes, and wireless signal analysis. However, existing monocular reconstruction pipelines often focus on the surface or 3D bounding box reconstruction of objects, yielding unreliable size estimates due to occlusion, missing depth, and incomplete visibility. To accurately reconstruct the real size of objects of different shapes under complex indoor conditions, this work proposes a statistic temporal checking module with depth layering and spatial consistency checking for accurate object size reconstruction. First, statistic temporal checking examines the frequency of feature points derived from semantic information and removes outliers around the object region by evaluating the probabilities of the foreground and background regions. Second, depth layering provides a depth prior, which helps to sharpen object boundaries and increases 3D reconstruction accuracy. Then, a semantic-guided spatial consistency checking module infers the hidden or occluded parts of objects by exploiting category-specific priors and spatial consistency. The inferred complete object boundaries are enclosed using surface fitting and volumetric filling, resulting in final volumetric occupancy estimates for each individual object. Extensive experiments demonstrate that the proposed method achieves a 0.3137 error rate, approximately 0.5641 lower than the average.
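A simplified sketch of the statistic temporal checking step, under the assumption that tracked feature points and per-frame semantic foreground masks are available; the probability threshold and array shapes are illustrative, not the paper's settings.

```python
import numpy as np

def temporal_check(tracks: np.ndarray, masks: np.ndarray, min_prob: float = 0.6):
    """tracks: (T, N, 2) integer (x, y) positions of N tracked points over T frames.
    masks: (T, H, W) boolean foreground masks from semantic segmentation.
    Returns a boolean array marking points with a high foreground probability."""
    T = tracks.shape[0]
    hits = np.zeros(tracks.shape[1])
    for t in range(T):
        xs, ys = tracks[t, :, 0], tracks[t, :, 1]
        hits += masks[t, ys, xs].astype(float)   # inside-foreground count per point
    return (hits / T) >= min_prob                # discard likely background outliers

keep = temporal_check(np.zeros((10, 50, 2), dtype=int),
                      np.ones((10, 120, 160), dtype=bool))
```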
Citations: 0
OCC-MLLM-CoT: Self-correction enhanced occlusion recognition with large language models via 3D-aware supervision, chain-of-thoughts guidance
IF 4.2, CAS Tier 3 (Computer Science), Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2025-12-24. DOI: 10.1016/j.imavis.2025.105881
Chaoyi Wang, Fangzhou Meng, Jun Pei, Lijie Xia, Jianpo Liu, Xiaobing Yuan, Xinhan Di
Comprehending occluded objects remains an underexplored challenge for existing large-scale visual–language multi-modal models. Current state-of-the-art multi-modal large models struggle to provide satisfactory performance in comprehending occluded objects despite using universal visual encoders and supervised learning strategies. To address this limitation, we propose OCC-MLLM-CoT, a multi-modal large vision–language framework that integrates 3D-aware supervision with Chain-of-Thoughts reasoning. Our approach consists of three key components: (1) a comprehensive framework combining a large multi-modal vision–language model with a specialized 3D reconstruction expert model; (2) a multi-modal Chain-of-Thoughts mechanism trained through both supervised and reinforcement learning strategies, enabling the model to develop advanced reasoning and self-reflection capabilities; and (3) a novel large-scale dataset containing 110,000 samples of occluded objects held in hand, specifically designed for multi-modal chain-of-thoughts reasoning. Experimental evaluations demonstrate that our proposed method achieves an 11.14% improvement in decision score, increasing from 0.6412 to 0.7526 compared to state-of-the-art multi-modal large language models.
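Since the paper's interfaces are not given, the following is only a schematic sketch of how a self-correction loop guided by a 3D expert model could be wired; the callables `mllm` and `expert_3d`, the prompt wording, and the stopping criterion are all hypothetical placeholders rather than the authors' design.

```python
def answer_with_self_correction(image, question, mllm, expert_3d, max_rounds=2):
    """Schematic control flow: draft a chain-of-thought answer, check it against
    3D cues from an expert reconstruction model, and revise when they disagree."""
    geometry = expert_3d(image)                       # e.g. reconstructed shape cues for the occluded object
    draft = mllm(image, question, hint=None)          # first chain-of-thought draft
    for _ in range(max_rounds):
        critique = mllm(image,
                        f"Check this answer against the 3D cues {geometry}: {draft}",
                        hint=geometry)
        if "consistent" in critique.lower():          # naive stopping criterion
            break
        draft = mllm(image, question, hint=geometry)  # revise with a 3D-aware hint
    return draft
```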
Citations: 0
HDD-Unet: A Unet-based architecture for low-light image enhancement
IF 4.2, CAS Tier 3 (Computer Science), Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2025-12-24. DOI: 10.1016/j.imavis.2025.105889
Elissavet Batziou, Konstantinos Ioannidis, Ioannis Patras, Stefanos Vrochidis, Ioannis Kompatsiaris
Low-light imaging has become a popular topic in image processing, with the quality enhancement of low-light images posing a significant challenge due to the difficulty of retaining colors, patterns, texture, and style when generating a normal-light image. Our objectives are, first, to better preserve texture regions during enhancement; second, to preserve colors via color histogram blocks; and finally, to improve image quality through dense denoising blocks. Our proposed novel framework, namely HDD-Unet, is a double Unet based on photorealistic style transfer for low-light image enhancement. The proposed method combines color histogram-based fusion, Haar wavelet pooling, dense denoising blocks, and a U-net backbone architecture to enhance contrast, reduce noise, and improve the visibility of low-light images. Experimental results demonstrate that our proposed method outperforms existing methods in terms of the PSNR and SSIM quantitative evaluation metrics, reaching or outperforming state-of-the-art accuracy with fewer resources. We also conduct an ablation study to investigate the impact of our approach on overexposed images, and a systematic analysis of the late-fusion weighting parameters. Multiple experiments were conducted with inserted artificial noise to support more efficient comparison. The results show that the proposed framework accurately enhances images under various gamma corrections. The proposed method represents a significant advance in the field of low-light image enhancement and has the potential to address several challenges associated with low-light imaging.
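To illustrate the Haar wavelet pooling component named above, here is a generic sketch that implements the fixed 2x2 Haar transform as a strided grouped convolution producing LL/LH/HL/HH sub-bands per channel; this is a standard construction, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HaarPool(nn.Module):
    def __init__(self):
        super().__init__()
        ll = torch.tensor([[0.5, 0.5], [0.5, 0.5]])     # low-frequency approximation
        lh = torch.tensor([[-0.5, -0.5], [0.5, 0.5]])   # vertical detail
        hl = torch.tensor([[-0.5, 0.5], [-0.5, 0.5]])   # horizontal detail
        hh = torch.tensor([[0.5, -0.5], [-0.5, 0.5]])   # diagonal detail
        kernels = torch.stack([ll, lh, hl, hh]).unsqueeze(1)   # (4, 1, 2, 2)
        self.register_buffer("kernels", kernels)

    def forward(self, x):
        b, c, h, w = x.shape
        weight = self.kernels.repeat(c, 1, 1, 1)        # one Haar filter set per channel
        return F.conv2d(x, weight, stride=2, groups=c)  # (B, 4C, H/2, W/2)

bands = HaarPool()(torch.randn(1, 3, 64, 64))   # low- and high-frequency sub-bands
```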
Citations: 0