
Latest publications from IEEE Transactions on Circuits and Systems for Video Technology

Multimodal Local Global Interaction Networks for Automatic Depression Severity Estimation
IF 11.1 CAS Tier 1 (Engineering & Technology) Q1 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-09-22 DOI: 10.1109/TCSVT.2025.3612697
Mingyue Niu;Zhuhong Shao;Yongjun He;Jianhua Tao;Björn W. Schuller
Physiological studies have shown that differences between depressed and healthy individuals are manifested in the audio and video modalities. Hence, some researchers have combined local and global information from the audio or video modality to obtain a unimodal representation. Attention mechanisms or Multi-Layer Perceptrons (MLPs) are then used to fuse the different representations. However, attention mechanisms and MLPs are essentially linear aggregation schemes and lack the ability to explore element-wise interactions between local and global representations within and across modalities, which limits the accuracy of depression severity estimation. To this end, we propose a Representation Interaction (RI) module, which uses mutual linear adjustment to achieve element-wise interaction between representations. Thus, the RI module can be seen as a mutual observation of two representations, which helps to achieve complementary advantages and improves the model’s ability to characterize depression cues. Furthermore, since the interaction process generates multiple representations, we propose a Multi-representation Prediction (MP) module. This module performs multi-representation vectorization in a hierarchical manner, from summarizing a single representation to aggregating multiple representations, and adopts an attention mechanism to obtain an estimate of an individual's depression severity. In this way, we use the RI and MP modules to construct the Multimodal Local Global Interaction (MLGI) network. Experimental results on the AVEC 2013 and AVEC 2014 depression datasets demonstrate the effectiveness of our method.
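The abstract does not give the exact formulation of the RI module's "mutual linear adjustment". As a rough, hypothetical illustration of the general idea, the following PyTorch sketch lets each of two representations be rescaled and shifted element-wise by a linear projection of the other; the module name, layer choices, and dimensions are assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a Representation Interaction (RI)-style module:
# each representation is element-wise modulated by a linear projection of
# the other one (mutual linear adjustment). Not the authors' code.
import torch
import torch.nn as nn

class RepresentationInteraction(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # linear "observers": each branch predicts a scale and shift for the other
        self.obs_a = nn.Linear(dim, 2 * dim)
        self.obs_b = nn.Linear(dim, 2 * dim)

    def forward(self, a: torch.Tensor, b: torch.Tensor):
        # a, b: (batch, dim) local/global or audio/video representations
        scale_b, shift_b = self.obs_a(a).chunk(2, dim=-1)  # adjustment for b, predicted from a
        scale_a, shift_a = self.obs_b(b).chunk(2, dim=-1)  # adjustment for a, predicted from b
        a_adj = a * torch.sigmoid(scale_a) + shift_a       # element-wise interaction
        b_adj = b * torch.sigmoid(scale_b) + shift_b
        return a_adj, b_adj

if __name__ == "__main__":
    ri = RepresentationInteraction(dim=128)
    a, b = torch.randn(4, 128), torch.randn(4, 128)
    a_adj, b_adj = ri(a, b)
    print(a_adj.shape, b_adj.shape)  # torch.Size([4, 128]) torch.Size([4, 128])
```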
{"title":"Multimodal Local Global Interaction Networks for Automatic Depression Severity Estimation","authors":"Mingyue Niu;Zhuhong Shao;Yongjun He;Jianhua Tao;Björn W. Schuller","doi":"10.1109/TCSVT.2025.3612697","DOIUrl":"https://doi.org/10.1109/TCSVT.2025.3612697","url":null,"abstract":"Physiological studies have shown that differences between depressed and healthy individuals are manifested in the audio and video modalities. Hence, some researchers have combined local and global information from audio or video modality to obtain the unimodal representation. Attention mechanisms or Multi-Layer Perceptrons (MLPs) are then used to complete the fusion of different representations. However, attention mechanisms or MLPs is essentially a linear aggregation manner, and lacks the ability to explore the element-wise interaction between local and global representations within and across modalities, which affects the accuracy of estimating the depression severity. To this end, we propose a Representation Interaction (RI) module, which uses the mutual linear adjustment to achieve element-wise interaction between representations. Thus, the RI module can be seen as an mutual observation of two representations, which helps to achieve complementary advantages and improve the model’s ability to characterize depression cues. Furthermore, since the interaction process generates multiple representations, we propose a Multi-representation Prediction (MP) module. This module implements multi-representation vectorization in a hierarchical manner from summarizing a single representation to aggregating multiple representations, and adopts the attention mechanism to obtain the estimation of an individual depression severity. In this way, we use the RI and MP modules to construct the Multimodal Local Global Interaction (MLGI) network. The experimental performance on AVEC 2013 and AVEC 2014 depression datasets demonstrates the effectiveness of our method.","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"36 2","pages":"2649-2664"},"PeriodicalIF":11.1,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146154461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
High Accuracy Rate Control for Neural Video Coding Based on Rate-Distortion Modeling
IF 11.1 CAS Tier 1 (Engineering & Technology) Q1 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-09-17 DOI: 10.1109/TCSVT.2025.3610946
Longtao Feng;Qian Yin;Jiaqi Zhang;Yuwen He;Siwei Ma
In recent years, rate control (RC) for neural video coding (NVC) has become an active research area. However, existing RC methods in NVC neglect the actual rate-distortion (R-D) characteristics and lack dedicated optimization strategies for intra and inter modes, leading to significant bit rate errors. To address these issues, we propose a high-accuracy RC method for NVC based on R-D modeling, which integrates intra frame RC, inter frame RC, and bit allocation. Specifically, the rate-quantization parameter (R-Q) model and the R-D model are established for both intra frames and inter frames in NVC. To derive the model parameters, intra frame parameters are estimated using high-dimensional features, while inter frame parameters are derived using gradient-descent-based model update methods. Based on the proposed R-Q model, intra frame and inter frame RC methods are proposed to determine the quantization parameters (QP). Meanwhile, a bit allocation method is developed based on the derived R-D models to allocate bits between intra frames and inter frames. Extensive experiments demonstrate that, benefiting from the accurate R-Q models derived by the proposed approach, highly accurate RC is achieved with only 0.56% average bit rate error. Compared with other methods, the proposed method reduces the average bit rate error by more than 4.18% and achieves over 8.94% Bjøntegaard Delta Rate savings.
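The abstract does not spell out the functional form of the R-Q model or the update rule. The sketch below is a minimal illustration under the assumption of a conventional power-law relation R = alpha * Q^beta, with its parameters refined by gradient descent on observed (QP, rate) pairs; the model form, learning rate, and numbers are assumptions, not the paper's model.

```python
# Minimal sketch of a rate-quantization (R-Q) model with gradient-descent
# parameter updates, assuming a power-law form R = alpha * Q**beta.
# Illustration of the general idea only, not the paper's model.
import numpy as np

class RQModel:
    def __init__(self, alpha: float = 5000.0, beta: float = -1.2, lr: float = 0.05):
        self.alpha, self.beta, self.lr = alpha, beta, lr

    def rate(self, q: float) -> float:
        return self.alpha * q ** self.beta

    def qp_for_target(self, target_rate: float) -> float:
        # invert R = alpha * Q**beta to pick the quantization parameter
        return (target_rate / self.alpha) ** (1.0 / self.beta)

    def update(self, q: float, actual_rate: float) -> None:
        # gradient descent on the squared error in the log-rate domain,
        # which keeps the update well conditioned across bit rate scales
        err = np.log(self.rate(q)) - np.log(actual_rate)
        self.alpha -= self.lr * err / self.alpha   # d(log R)/d(alpha) = 1/alpha
        self.beta -= self.lr * err * np.log(q)     # d(log R)/d(beta)  = log Q

model = RQModel()
for q, r in [(30, 120.0), (34, 95.0), (38, 70.0)]:  # hypothetical observed (QP, kbit) pairs
    model.update(q, r)
print(round(model.qp_for_target(100.0), 2))          # QP suggested for a 100-kbit target
```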
{"title":"High Accuracy Rate Control for Neural Video Coding Based on Rate-Distortion Modeling","authors":"Longtao Feng;Qian Yin;Jiaqi Zhang;Yuwen He;Siwei Ma","doi":"10.1109/TCSVT.2025.3610946","DOIUrl":"https://doi.org/10.1109/TCSVT.2025.3610946","url":null,"abstract":"In recent years, rate control (RC) for neural video coding (NVC) has become an active research area. However, existing RC methods in NVC neglect the actual rate-distortion (<italic>R-D</i>) characteristics and lack dedicated optimization strategies for intra and inter modes, leading to significant bit rate errors. To address these issues, we propose a high accuracy RC method for NVC based on <italic>R-D</i> modeling, which integrates intra frame RC, inter frame RC and bit allocation. Specifically, the rate-quantization parameter (<italic>R-Q</i>) model and <italic>R-D</i> model are established for both intra frame and inter frame in NVC. To derive the model parameters, intra frame parameters are estimated using high dimensional features, while inter frame parameters are derived using gradient descent based model update methods. Based on the proposed <italic>R-Q</i> model, intra frame and inter frame RC methods are proposed to determine the quantization parameters (QP). Meanwhile, a bit allocation method is developed based on the derived <italic>R-D</i> models to allocate bits for the intra frame and inter frame. Extensive experiments demonstrate that, benefiting from the accurate <italic>R-Q</i> models derived by the proposed approach, highly accurate RC is achieved with only 0.56% average bit rate error. Compared with other methods, the proposed method reduces the average bit rate error by more than 4.18%, and achieves over 8.94% Bjøntegaard Delta Rate savings.","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"36 2","pages":"2551-2567"},"PeriodicalIF":11.1,"publicationDate":"2025-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146154451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
HOH-Net: High-Order Hierarchical Middle-Feature Learning Network for Visible-Infrared Person Re-Identification
IF 11.1 CAS Tier 1 (Engineering & Technology) Q1 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-09-16 DOI: 10.1109/TCSVT.2025.3609840
Liuxiang Qiu;Si Chen;Jing-Hao Xue;Da-Han Wang;Shunzhi Zhu;Yan Yan
Visible-infrared person re-identification (VI-ReID) is a cross-modality retrieval task that aims to match images of the same person across visible (VIS) and infrared (IR) modalities. Existing VI-ReID methods ignore the high-order structure information of features and struggle to learn a reliable common feature space due to the modality discrepancy between VIS and IR images. To alleviate these issues, we propose a novel high-order hierarchical middle-feature learning network (HOH-Net) for VI-ReID. We introduce a high-order structure learning (HSL) module to explore the high-order relationships of short- and long-range feature nodes, significantly mitigating model collapse and effectively obtaining discriminative features. We further develop a fine-coarse graph attention alignment (FCGA) module, which efficiently aligns multi-modality feature nodes from node-level and region-level perspectives, ensuring reliable middle-feature representations. Moreover, we exploit a hierarchical middle-feature agent learning (HMAL) loss that uses agents of the middle features to hierarchically reduce the modality discrepancy at each stage of the network. The proposed HMAL loss also exchanges detailed and semantic information between low- and high-stage networks. Finally, we introduce a modality-range identity-center contrastive (MRIC) loss to minimize the distances between VIS, IR, and middle features. Extensive experiments demonstrate that the proposed HOH-Net yields state-of-the-art performance on image-based and video-based VI-ReID datasets. The code is available at: https://github.com/Jaulaucoeng/HOS-Net
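The exact MRIC formulation is not given in the abstract. As a hedged illustration of an identity-center contrastive objective over VIS, IR, and middle features, the sketch below pulls all three feature types of the same identity toward a shared center and pushes different identity centers apart; the distance measures, margin, and function names are assumptions rather than the published loss.

```python
# Rough sketch of an identity-center contrastive loss over VIS, IR, and middle
# features: same-identity features are pulled toward their shared center, and
# centers of different identities are pushed apart. Illustrative only.
import torch
import torch.nn.functional as F

def center_contrastive_loss(vis, ir, mid, labels, margin: float = 0.3):
    # vis, ir, mid: (batch, dim) features from the three branches; labels: (batch,)
    feats = torch.cat([vis, ir, mid], dim=0)          # (3*batch, dim)
    lbls = labels.repeat(3)
    uniq = lbls.unique()                              # sorted unique identities
    centers = torch.stack([feats[lbls == c].mean(dim=0) for c in uniq])
    # intra-identity pull: distance of every feature to its identity center
    center_of = centers[torch.searchsorted(uniq, lbls)]
    pull = F.mse_loss(feats, center_of)
    # inter-identity push: hinge on pairwise center distances
    d = torch.cdist(centers, centers)
    off_diag = d[~torch.eye(len(uniq), dtype=torch.bool)]
    push = F.relu(margin - off_diag).mean()
    return pull + push

vis, ir, mid = torch.randn(8, 256), torch.randn(8, 256), torch.randn(8, 256)
labels = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])
print(center_contrastive_loss(vis, ir, mid, labels).item())
```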
{"title":"HOH-Net: High-Order Hierarchical Middle-Feature Learning Network for Visible-Infrared Person Re-Identification","authors":"Liuxiang Qiu;Si Chen;Jing-Hao Xue;Da-Han Wang;Shunzhi Zhu;Yan Yan","doi":"10.1109/TCSVT.2025.3609840","DOIUrl":"https://doi.org/10.1109/TCSVT.2025.3609840","url":null,"abstract":"Visible-infrared person re-identification (VI-ReID) is a cross-modality retrieval task that aims to match images of the same person across visible (VIS) and infrared (IR) modalities. Existing VI-ReID methods ignore high-order structure information of features and struggle to learn a reliable common feature space due to the modality discrepancy between VIS and IR images. To alleviate the above issues, we propose a novel high-order hierarchical middle-feature learning network (HOH-Net) for VI-ReID. We introduce a high-order structure learning (HSL) module to explore the high-order relationships of short- and long-range feature nodes, for significantly mitigating model collapse and effectively obtaining discriminative features. We further develop a fine-coarse graph attention alignment (FCGA) module, which efficiently aligns multi-modality feature nodes from node-level and region-level perspectives, ensuring reliable middle-feature representations. Moreover, we exploit a hierarchical middle-feature agent learning (HMAL) loss to hierarchically reduce the modality discrepancy at each stage of the network by using the agents of middle features. The proposed HMAL loss also exchanges detailed and semantic information between low- and high-stage networks. Finally, we introduce a modality-range identity-center contrastive (MRIC) loss to minimize the distances between VIS, IR, and middle features. Extensive experiments demonstrate that the proposed HOH-Net yields state-of-the-art performance on the image-based and video-based VI-ReID datasets. The code is available at: <uri>https://github.com/Jaulaucoeng/HOS-Net</uri>","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"36 2","pages":"2607-2622"},"PeriodicalIF":11.1,"publicationDate":"2025-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146154462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
FedDAAM: Federated Domain Adversarial Learning With Attention Mechanism for Privacy Preserving Multimodal Depression Assessment
IF 11.1 CAS Tier 1 (Engineering & Technology) Q1 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-09-16 DOI: 10.1109/TCSVT.2025.3609776
Lang He;Weizhao Yang;Junnan Zhao;Haifeng Chen;Dongmei Jiang
Major depressive disorder (MDD) is projected to become one of the leading mental disorders by 2030. Audiovisual cues have garnered significant attention in depression recognition research owing to their non-invasive acquisition and rich emotional expressiveness. However, conventional centralized training paradigms raise substantial privacy concerns for individuals with depression and are further hindered by data heterogeneity and label inconsistency across datasets. To overcome these challenges, a hybrid architecture, termed Federated Domain Adversarial with Attention Mechanism (FedDAAM), is proposed for privacy-preserving multimodal depression assessment. FedDAAM introduces a mechanism that differentiates discriminative features into depression-public and depression-private features. Specifically, to extract visual depression-private features from the AVEC2013 and AVEC2014 datasets, a local attention-aware (LAA) architecture is developed. For the depression-public features, action units (AUs), landmarks, head poses, and eye gaze features are adopted. In addition, to account for the transferability and performance of individual clients, a dynamic parameter aggregation mechanism, termed FedDyA, is proposed. Extensive validations are performed on the AVEC2013, AVEC2014, and AVEC2017 databases, resulting in root mean square error (RMSE) / mean absolute error (MAE) of 8.61/6.78, 8.59/6.77, and 4.71/3.68, respectively. More importantly, to the best of our knowledge, this is the first study to employ federated learning (FL) for multimodal depression assessment. The proposed framework offers a novel solution for privacy-aware, distributed clinical diagnosis of depression. Code will be available at: https://github.com/helang818/FedDAAM/
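The abstract does not describe how FedDyA weights each client. The following sketch shows one plausible form of dynamic parameter aggregation, where client state dictionaries are averaged with softmax weights derived from per-client scores (e.g., negative validation RMSE); the function name, scoring choice, and temperature are assumptions, not the published FedDyA rule.

```python
# Hypothetical sketch of dynamic federated aggregation: client model states are
# averaged with weights derived from per-client scores instead of the plain
# FedAvg sample-count weighting. Illustrative only, not the FedDyA rule.
import torch
import torch.nn as nn

def dynamic_aggregate(client_states, client_scores, temperature: float = 1.0):
    # client_states: list of model state_dicts; client_scores: one score per client
    # (e.g. negative validation RMSE). Higher score -> larger aggregation weight.
    weights = torch.softmax(torch.tensor(client_scores, dtype=torch.float32) / temperature, dim=0)
    global_state = {}
    for key in client_states[0]:
        stacked = torch.stack([s[key].float() for s in client_states], dim=0)
        w = weights.view(-1, *([1] * (stacked.dim() - 1)))  # broadcast over parameter dims
        global_state[key] = (w * stacked).sum(dim=0)
    return global_state

clients = [nn.Linear(16, 1) for _ in range(3)]              # stand-ins for client models
states = [c.state_dict() for c in clients]
new_state = dynamic_aggregate(states, client_scores=[-8.6, -9.1, -8.9])
print({k: v.shape for k, v in new_state.items()})
```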
{"title":"FedDAAM: Federated Domain Adversarial Learning With Attention Mechanism for Privacy Preserving Multimodal Depression Assessment","authors":"Lang He;Weizhao Yang;Junnan Zhao;Haifeng Chen;Dongmei Jiang","doi":"10.1109/TCSVT.2025.3609776","DOIUrl":"https://doi.org/10.1109/TCSVT.2025.3609776","url":null,"abstract":"Major depressive disorder (MDD) is projected to become one of the leading mental disorders by 2030. While audiovisual cues have garnered significant attention in depression recognition research owing to their non-invasive acquisition and rich emotional expressiveness. However, conventional centralized training paradigms raise substantial privacy concerns for individuals with depression and are further hindered by data heterogeneity and label inconsistency across datasets. To overcome these challenges, a hybrid architecture, termed <bold>Fed</b>erated <bold>D</b>omain <bold>A</b>dversarial with <bold>A</b>ttention <bold>M</b>echanism (FedDAAM), for privacy preserving multimodal depression assessment, is proposed. FedDAAM introduces a mechanism by differentiating discriminative features into depression-public and depression-private features. Specifically, to extract visual depression-private features from the AVEC2013 and AVEC2014 datasets, a local attention-aware (LAA) architecture is developed. For the depression-public features, action units (AUs), landmarks, head poses, and eye gazes features are adopted. In addition, to consider the transferability and performance of individual client, a dynamic parameter aggregation mechanism, termed FedDyA, is proposed. Extensive validations are performed on the AVEC2013, AVEC2014 and AVEC2017 databases, resulting in root mean square error (RMSE) and mean absolute error (MAE) of 8.61/6.78, 8.59/6.77, and 4.71/3.68, respectively. More importantly, to the best of our knowledge, this is the first study to borrow federated learning (FL) for multimodal depression assessment. The proposed framework offers a novel solution for privacy-aware, distributed clinical diagnosis of depression. Code will be available at: <uri>https://github.com/helang818/FedDAAM/</uri>","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"36 2","pages":"2635-2648"},"PeriodicalIF":11.1,"publicationDate":"2025-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146154442","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Exploring Pruning-Based Efficient Object Tracking via Hybrid Knowledge Distillation
IF 11.1 CAS Tier 1 (Engineering & Technology) Q1 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-09-12 DOI: 10.1109/TCSVT.2025.3609410
Yidong Song;Shilei Wang;Zhaochuan Zeng;Jikai Zheng;Zhenhua Wang;Jifeng Ning
Transformer-based trackers have demonstrated remarkable advancements in real-time tracking tasks on edge devices. Since lightweight backbone networks are typically designed for general-purpose tasks, our analysis reveals that, when applied to target tracking, they often contain structurally redundant layers, which limits the model’s efficiency. To address this issue, we propose a novel tracking framework that integrates backbone pruning with Hybrid Knowledge Distillation (HKD), effectively reducing model parameters and FLOPs while preserving high tracking accuracy. Inspired by the success of MiniLM and Focal and Global Distillation (FGD), we design an HKD framework tailored for tracking tasks. Our HKD introduces a multi-level and complementary distillation scheme, consisting of Token Distillation, Local Distillation, and Global Distillation. In Token Distillation, unlike MiniLM, which distills attention via QK dot-products and V, we disentangle and separately distill the Q, K, and V representations to enhance structural attention alignment for tracking. For Local Distillation, we adopt the FGD concept by incorporating spatial foreground-background masks to capture region-specific discriminative cues more effectively. In Global Distillation, we use the Vision Mamba module to model long-range dependencies and enhance semantic-level feature alignment. Our tracker HKDT achieves state-of-the-art (SOTA) performance across multiple datasets. On the GOT-10k benchmark, it achieves 67.6% Average Overlap (AO), outperforming the current SOTA real-time tracker HiT-Base by 3.6% in accuracy while reducing computational costs by 64% and achieving 115% faster tracking speed on CPU platforms. The code and model will be available soon.
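As a hedged illustration of distilling Q, K, and V separately in the relation-transfer style associated with MiniLM, the sketch below matches the token-to-token self-relation matrix of each of Q, K, and V between teacher and student via KL divergence; the specific divergence, normalization, and names are assumptions, not the paper's Token Distillation loss.

```python
# Minimal sketch of separate Q/K/V relation distillation: for each of Q, K, V,
# the student's token-to-token self-relation matrix is matched to the teacher's
# with a KL divergence, in the spirit of MiniLM-style relation transfer.
import torch
import torch.nn.functional as F

def relation(x: torch.Tensor) -> torch.Tensor:
    # x: (batch, tokens, dim) -> row-normalized token-to-token relation matrix (log-probs)
    scores = x @ x.transpose(-1, -2) / x.shape[-1] ** 0.5
    return F.log_softmax(scores, dim=-1)

def qkv_distill_loss(student_qkv, teacher_qkv):
    # student_qkv / teacher_qkv: dicts with keys "q", "k", "v", each (batch, tokens, dim)
    loss = 0.0
    for name in ("q", "k", "v"):
        s = relation(student_qkv[name])
        t = relation(teacher_qkv[name]).exp()        # teacher relations as target distribution
        loss = loss + F.kl_div(s, t, reduction="batchmean")
    return loss

# relation matrices are tokens x tokens, so teacher and student widths may differ
student = {n: torch.randn(2, 64, 128) for n in ("q", "k", "v")}
teacher = {n: torch.randn(2, 64, 256) for n in ("q", "k", "v")}
print(qkv_distill_loss(student, teacher).item())
```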
{"title":"Exploring Pruning-Based Efficient Object Tracking via Hybrid Knowledge Distillation","authors":"Yidong Song;Shilei Wang;Zhaochuan Zeng;Jikai Zheng;Zhenhua Wang;Jifeng Ning","doi":"10.1109/TCSVT.2025.3609410","DOIUrl":"https://doi.org/10.1109/TCSVT.2025.3609410","url":null,"abstract":"Transformer-based trackers have demonstrated remarkable advancements in real-time tracking tasks on edge devices. Since lightweight backbone networks are typically designed for general-purpose tasks, our analysis reveals that, when applied to target tracking, they often contain structurally redundant layers, which limits the model’s efficiency. To address this issue, we propose a novel tracking framework that integrates backbone pruning with Hybrid Knowledge Distillation (HKD), effectively reducing model parameters and FLOPs while preserving high tracking accuracy. Inspired by the success of MiniLM and Focal and Global Distillation (FGD), we design a HKD framework tailored for tracking tasks. Our HKD introduces a multi-level and complementary distillation scheme, consisting of Token Distillation, Local Distillation, and Global Distillation. In Token Distillation, unlike MiniLM, which distills attention via QK dot-products and V, we disentangle and separately distill Q, K, and V representations to enhance structural attention alignment for tracking. For Local Distillation, we use the FGD concept by incorporating spatial foreground-background masks to capture region-specific discriminative cues more effectively. In Global Distillation, we use Vision Mamba module to model long-range dependencies and enhance semantic-level feature alignment. Our tracker HKDT achieves state-of-the-art (SOTA) performance across multiple datasets. On the GOT-10k benchmark, it demonstrates a groundbreaking 67.6% Average Overlap (AO), outperforming the current SOTA real-time tracker HiT-Base by 3.6% in accuracy while reducing computational costs by 64% and achieving 115% faster tracking speed on CPU platforms. The code and model will be available soon.","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"36 2","pages":"2433-2448"},"PeriodicalIF":11.1,"publicationDate":"2025-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146154443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Weakly Supervised Object Detection for Aerial Images With Instance-Aware Label Assignment
IF 11.1 CAS Tier 1 (Engineering & Technology) Q1 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-09-12 DOI: 10.1109/TCSVT.2025.3609322
Xuan Xie;Xiang Yuan;Gong Cheng
Weakly supervised object detection has emerged as a cost-effective and promising solution in remote sensing, as it requires only image-level labels and alleviates the burden of labor-intensive instance-level annotations. Existing approaches tend to assign top-scoring proposals and their highly overlapping counterparts as positive samples, thereby overlooking the inherent gap between high classification confidence and precise localization, which in turn introduces the risk of part domination and missed instances. To address these concerns, this paper introduces an Instance-aware Label Assignment scheme for weakly supervised object detection in remote sensing images, termed ILA. Specifically, we propose a context-aware learning network that prioritizes regions fully covering the object over top-scoring yet incomplete candidates. This is empowered by the proposed context classification loss, which dynamically responds to the degree of object visibility, thereby driving the model toward representative proposals and mitigating the optimization dilemma caused by partial coverage. Additionally, an instance excavation module is implemented to reduce the risk of misclassifying object instances as negatives. At its core lies the proposed pseudo ground truth mining (PGM) algorithm, which constructs reliable pseudo boxes from the outputs of the basic multiple instance learning network to excavate potential object instances. Comprehensive evaluations on the challenging NWPU VHR-10.v2 and DIOR datasets underscore the efficacy of our approach, achieving mean average precision (mAP) scores of 76.56% and 31.73%, respectively.
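The abstract does not detail the PGM procedure. The sketch below shows only the general flavor of mining pseudo ground-truth boxes from scored proposals: confident proposals are kept per class and near-duplicates are suppressed by IoU so the survivors can serve as pseudo boxes. The thresholds and function name are assumptions, not the published algorithm.

```python
# Illustrative sketch of pseudo ground-truth mining: keep confident proposals
# per class and suppress heavy overlaps by IoU, so the surviving boxes can be
# used as pseudo boxes for the next training stage. Not the exact PGM algorithm.
import torch
from torchvision.ops import nms

def mine_pseudo_boxes(boxes, scores, score_thr: float = 0.5, iou_thr: float = 0.5):
    # boxes: (N, 4) proposals in (x1, y1, x2, y2); scores: (N, C) class probabilities
    pseudo_boxes, pseudo_labels = [], []
    for c in range(scores.shape[1]):
        cls_scores = scores[:, c]
        keep = cls_scores > score_thr                       # confident proposals only
        if keep.sum() == 0:
            continue
        kept = nms(boxes[keep], cls_scores[keep], iou_thr)  # merge heavy overlaps
        pseudo_boxes.append(boxes[keep][kept])
        pseudo_labels.extend([c] * len(kept))
    if not pseudo_boxes:
        return torch.empty(0, 4), torch.empty(0, dtype=torch.long)
    return torch.cat(pseudo_boxes), torch.tensor(pseudo_labels)

boxes = torch.tensor([[10., 10., 60., 60.], [12., 12., 58., 62.], [100., 100., 150., 160.]])
scores = torch.tensor([[0.9, 0.1], [0.8, 0.2], [0.1, 0.7]])
pb, pl = mine_pseudo_boxes(boxes, scores)
print(pb, pl)   # two pseudo boxes: one for class 0 (duplicates merged), one for class 1
```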
{"title":"Weakly Supervised Object Detection for Aerial Images With Instance-Aware Label Assignment","authors":"Xuan Xie;Xiang Yuan;Gong Cheng","doi":"10.1109/TCSVT.2025.3609322","DOIUrl":"https://doi.org/10.1109/TCSVT.2025.3609322","url":null,"abstract":"Weakly supervised object detection has emerged as a cost-effective and promising solution in remote sensing, as it requires only image-level labels and alleviates the burden of labor-intensive instance-level annotations. Existing approaches tend to assign top-scoring proposals and their highly overlapping counterparts as positive samples, thereby overlooking the inherent gap between high classification confidence and precise localization, which in turn introduces the risk of part domination and instance missing. In order to address these concerns, this paper introduces an <underline>I</u>nstance-aware <underline>L</u>abel <underline>A</u>ssignment scheme for weakly supervised object detection in remote sensing images, termed ILA. Specifically, we propose a context-aware learning network that aims to prioritize regions fully covering the object over top-scoring yet incomplete candidates. This is empowered by the proposed context classification loss, which dynamically responds to the degree of object visibility, thereby driving the model toward representative proposals and mitigating the optimization dilemma caused by partial coverage. Additionally, an instance excavation module is implemented to reduce the risk of misclassifying object instances as negatives. At its core lies the proposed pseudo ground truth mining (PGM) algorithm, which constructs reliable pseudo boxes from the outputs of the basic multiple instance learning network to excavate potential object instances. Comprehensive evaluations on the challenging NWPU VHR-10.v2 and DIOR datasets underscore the efficacy of our approach, with achieved mean average precision (mAP) scores of 76.56% and 31.73%, respectively.","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"36 2","pages":"2492-2504"},"PeriodicalIF":11.1,"publicationDate":"2025-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146154467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
IEEE Circuits and Systems Society Information
IF 11.1 CAS Tier 1 (Engineering & Technology) Q1 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-09-09 DOI: 10.1109/TCSVT.2025.3600974
{"title":"IEEE Circuits and Systems Society Information","authors":"","doi":"10.1109/TCSVT.2025.3600974","DOIUrl":"https://doi.org/10.1109/TCSVT.2025.3600974","url":null,"abstract":"","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"35 9","pages":"C3-C3"},"PeriodicalIF":11.1,"publicationDate":"2025-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11154653","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145021441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
IEEE Transactions on Circuits and Systems for Video Technology Publication Information
IF 11.1 CAS Tier 1 (Engineering & Technology) Q1 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-09-09 DOI: 10.1109/TCSVT.2025.3600972
{"title":"IEEE Transactions on Circuits and Systems for Video Technology Publication Information","authors":"","doi":"10.1109/TCSVT.2025.3600972","DOIUrl":"https://doi.org/10.1109/TCSVT.2025.3600972","url":null,"abstract":"","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"35 9","pages":"C2-C2"},"PeriodicalIF":11.1,"publicationDate":"2025-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11154656","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145021215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Beyond One and Two Tower: Cross-Modal Consensus Learning for Image-Text Retrieval
IF 11.1 CAS Tier 1 (Engineering & Technology) Q1 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-09-04 DOI: 10.1109/TCSVT.2025.3605958
Zhangxiang Shi;Yunlai Ding;Junyu Dong;Tianzhu Zhang
Existing image-text retrieval methods mainly rely on region and word features to measure cross-modal similarities. Thus, dense cross-modal semantic alignment that matches regions and words becomes crucial. However, this is non-trivial due to the heterogeneity gap, and the cross-modal attention used to achieve this alignment is inefficient. To solve this problem, we propose a novel framework that goes beyond the previous one-tower and two-tower frameworks to learn cross-modal consensus efficiently. The proposed framework does not align regions and words directly like existing methods, but instead uses semantic prototypes as a bridge to attend, through semantic decoders, to specific content with the same semantics across different modalities, through which cross-modal semantic alignment is naturally achieved. Furthermore, we design a novel plug-and-play self-correction method based on optimal transport to alleviate the drawbacks of incomplete pairwise labels in existing multimodal datasets. On top of various base backbones, we carry out extensive experiments on two benchmark datasets, i.e., Flickr30K and MS-COCO, demonstrating the effectiveness, superiority, and generalization of our method.
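The abstract does not specify how the optimal-transport self-correction is implemented. As a hedged sketch of the general idea, the code below runs Sinkhorn iterations on a batch image-text similarity matrix to obtain a soft transport plan whose large entries indicate plausible positive pairs missed by incomplete annotations; the entropic regularization, iteration count, and use of the plan are assumptions, not the paper's method.

```python
# Sketch of an optimal-transport-style soft matching over a batch similarity
# matrix (Sinkhorn iterations), which could be used to relabel likely positive
# image-text pairs missed by incomplete annotations. Illustrative assumption only.
import torch
import torch.nn.functional as F

def sinkhorn(sim: torch.Tensor, eps: float = 0.05, n_iters: int = 50) -> torch.Tensor:
    # sim: (n_images, n_texts) cosine similarities; returns a transport plan whose
    # rows and columns are alternately normalized (doubly-stochastic-like).
    log_p = sim / eps
    for _ in range(n_iters):
        log_p = log_p - torch.logsumexp(log_p, dim=1, keepdim=True)  # normalize rows
        log_p = log_p - torch.logsumexp(log_p, dim=0, keepdim=True)  # normalize columns
    return log_p.exp()

img = F.normalize(torch.randn(8, 512), dim=-1)   # stand-in image embeddings
txt = F.normalize(torch.randn(8, 512), dim=-1)   # stand-in text embeddings
plan = sinkhorn(img @ txt.T)
soft_targets = plan / plan.sum(dim=1, keepdim=True)  # per-image soft correspondence targets
print(soft_targets.shape)
```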
{"title":"Beyond One and Two Tower: Cross-Modal Consensus Learning for Image-Text Retrieval","authors":"Zhangxiang Shi;Yunlai Ding;Junyu Dong;Tianzhu Zhang","doi":"10.1109/TCSVT.2025.3605958","DOIUrl":"https://doi.org/10.1109/TCSVT.2025.3605958","url":null,"abstract":"Existing image-text retrieval methods mainly rely on region and word features to measure cross-modal similarities. Thus, dense cross-modal semantic alignment which matches regions and words becomes crucial. However, this is non-trivial due to the heterogeneity gap and the cross-modal attention used to achieve this alignment is inefficient. Towards solving this problem, we propose a novel framework that goes beyond the previous one-tower and two-tower frameworks to learn cross-modal consensus efficiently. The proposed framework does not align regions and words directly like existing methods but uses semantic prototypes as a bridge to attend specific contents with the same semantics among different modalities through semantic decoders, through which cross-modal semantic alignment is naturally achieved. Furthermore, we design a novel plug-and-play self-correction method based on optimal transport to alleviate the drawbacks of incomplete pairwise labels in existing multimodal datasets. On top of various base backbones, we carry out extensive experiments on two benchmark datasets, <italic>i.e.</i>, Flickr30K and MS-COCO, demonstrating the effectiveness, superiority and generalization of our method.","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"36 2","pages":"2581-2593"},"PeriodicalIF":11.1,"publicationDate":"2025-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146154402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
FPAD: Fuzzy-Prototype-Guided Adversarial Attack and Defense for Deep Cross-Modal Hashing
IF 11.1 CAS Tier 1 (Engineering & Technology) Q1 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-08-29 DOI: 10.1109/TCSVT.2025.3604033
Zhongqing Yu;Xin Liu;Yiu-Ming Cheung;Lei Zhu;Xing Xu;Nannan Wang
Deep cross-modal hashing models generally inherit the vulnerabilities of deep neural networks, making them susceptible to adversarial attacks and thus posing a serious security risk during real-world deployment. Current adversarial attack or defense strategies often establish only a weak correlation between the hashing codes and the targeted semantic representations, and there is still a lack of work that simultaneously considers attack and defense for deep cross-modal hashing. To alleviate these concerns, we propose a Fuzzy-Prototype-guided Adversarial Attack and Defense (FPAD) framework to enhance the adversarial robustness of deep cross-modal hashing models. First, an adaptive fuzzy-prototype learning network (FpNet) is presented to efficiently extract a set of fuzzy prototypes, aiming to encode the underlying semantic structure of the heterogeneous modalities in both the feature and Hamming spaces. Then, these derived prototypical hash codes are heuristically employed to supervise the generation of high-quality adversarial examples, while a fuzzy-prototype rectification scheme is simultaneously designed to preserve the latent semantic consistency between the adversarial and benign examples. By mixing the adversarial samples with the original training samples as augmented inputs, an efficient fuzzy-prototype-guided adversarial learning framework is proposed to perform collaborative adversarial training and generate robust cross-modal hash codes with strong adversarial defense capabilities, thereby resisting various attacks and benefiting a range of challenging cross-modal hashing tasks. Extensive experiments on benchmark datasets show that the proposed FPAD framework not only produces high-quality adversarial samples that strengthen the adversarial training process, but also exhibits strong adversarial defense capability across various cross-modal hashing tasks. The code is available at: https://github.com/yzq131/FPAD
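The abstract states that prototypical hash codes supervise adversarial example generation but does not give the attack itself. As a rough, hypothetical illustration, the sketch below takes a single FGSM-style step that pushes the continuous hash output of an image toward a target prototype code; the toy hashing model, single-step attack, loss, and epsilon are all assumptions, not the FPAD procedure.

```python
# Illustrative sketch of prototype-guided adversarial example generation for a
# hashing model: a single FGSM-style step pushes the continuous hash output of
# an image toward a target prototypical hash code. Not the FPAD algorithm.
import torch
import torch.nn as nn

# toy stand-in for a deep hashing network producing 64-bit continuous codes
hash_net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64), nn.Tanh())

def prototype_guided_attack(x, prototype_code, epsilon: float = 8 / 255):
    # x: (batch, 3, 32, 32) benign images; prototype_code: (64,) target code in {-1, +1}
    x_adv = x.clone().detach().requires_grad_(True)
    code = hash_net(x_adv)                                  # continuous codes in (-1, 1)
    loss = -(code * prototype_code).sum(dim=1).mean()       # minimizing pulls codes toward the prototype
    loss.backward()
    with torch.no_grad():
        x_adv = x_adv - epsilon * x_adv.grad.sign()         # one FGSM-style descent step
    return x_adv.clamp(0, 1).detach()

x = torch.rand(4, 3, 32, 32)
prototype = torch.sign(torch.randn(64))
x_adv = prototype_guided_attack(x, prototype)
print((x_adv - x).abs().max().item())                       # perturbation bounded by epsilon
```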
{"title":"FPAD: Fuzzy-Prototype-Guided Adversarial Attack and Defense for Deep Cross-Modal Hashing","authors":"Zhongqing Yu;Xin Liu;Yiu-Ming Cheung;Lei Zhu;Xing Xu;Nannan Wang","doi":"10.1109/TCSVT.2025.3604033","DOIUrl":"https://doi.org/10.1109/TCSVT.2025.3604033","url":null,"abstract":"Deep cross-modal hashing models generally inherit the vulnerabilities of deep neural networks, making them susceptible to adversarial attacks and thus posing a serious security risk during real-world deployment. Current adversarial attack or defense strategies often establish a weak correlation between the hashing codes and the targeted semantic representations, and there is still a lack of related works that simultaneously consider the attack and defense for deep cross-modal hashing. To alleviate these concerns, we propose a Fuzzy-Prototype-guided Adversarial Attack and Defense (FPAD) framework to enhance the adversarial robustness of deep cross-modal hashing models. First, an adaptive fuzzy-prototype learning network (FpNet) is efficiently presented to extract a set of fuzzy-prototypes, aiming to encode the underlying semantic structure of the heterogeneous modalities in both feature and Hamming spaces. Then, these derived prototypical hash codes are heuristically employed to supervise the generation of high-quality adversarial examples, while a fuzzy-prototype rectification scheme is simultaneously designed to preserve the latent semantic consistency between the adversarial and benign examples. By mixing the adversarial samples with the original training samples as the augmented inputs, an efficient fuzzy-prototype-guided adversarial learning framework is proposed to execute the collaborative adversarial training and generate robust cross-modal hash codes with high adversarial defense capabilities, therefore resisting various attacks and benefiting various challenging cross-modal hashing tasks. Extensive experiments evaluated on benchmark datasets show that the proposed FPAD framework not only produces high-quality adversarial samples to enhance the adversarial training process, but also shows its high adversarial defense capability to benefit various cross-modal hashing tasks. The code is available at: <uri>https://github.com/yzq131/FPAD</uri>","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"36 2","pages":"2568-2580"},"PeriodicalIF":11.1,"publicationDate":"2025-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146154438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0