
Latest publications in IEEE Transactions on Pattern Analysis and Machine Intelligence

Deeply Learned Robust Matrix Completion for Large-scale Low-rank Data Recovery.
IF 18.6 Pub Date: 2026-01-29 DOI: 10.1109/TPAMI.2026.3659041
HanQin Cai, Chandra Kundu, Jialin Liu, Wotao Yin

Robust matrix completion (RMC) is a widely used machine learning tool that simultaneously tackles two critical issues in low-rank data analysis: missing data entries and extreme outliers. This paper proposes a novel scalable and learnable non-convex approach, coined Learned Robust Matrix Completion (LRMC), for large-scale RMC problems. LRMC enjoys low computational complexity and linear convergence. Motivated by the proposed theorem, the free parameters of LRMC can be effectively learned via deep unfolding to achieve optimal performance. Furthermore, this paper proposes a flexible feedforward-recurrent-mixed neural network framework that extends deep unfolding from a fixed number of iterations to infinitely many. The superior empirical performance of LRMC is verified through extensive experiments against state-of-the-art methods on synthetic datasets and real applications, including video background subtraction, ultrasound imaging, face modeling, and cloud removal from satellite imagery.
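As a rough illustration of the RMC setting (not the paper's learned algorithm, whose thresholds and step sizes are trained via deep unfolding), a minimal numpy sketch can alternate between hard-thresholding the observed residual to estimate sparse outliers and a rank-r truncated SVD to update the low-rank component. The function name, threshold schedule `alpha`, and decay factor here are illustrative assumptions:

```python
import numpy as np

def robust_matrix_complete(M_obs, mask, rank, alpha=0.6, decay=0.95, n_iters=50):
    """Generic non-convex RMC iteration: alternate sparse-outlier estimation
    (hard-thresholding the observed residual) with a rank-r truncated SVD
    update of the low-rank component. Unobserved entries are filled with the
    current low-rank estimate before the SVD step."""
    L = np.zeros_like(M_obs)
    for _ in range(n_iters):
        R = (M_obs - L) * mask                 # residual on observed entries
        thresh = alpha * np.max(np.abs(R))     # decaying threshold (heuristic)
        S = R * (np.abs(R) > thresh)           # sparse outlier estimate
        X = (M_obs - S) * mask + L * (~mask)   # outlier-corrected, L-imputed
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]   # rank-r projection
        alpha *= decay
    return L
```

Each iteration costs one truncated SVD, which is what scalable RMC solvers typically replace with cheaper low-rank updates.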

Citations: 0
SEGA: A Transferable Signed Ensemble Gaussian Black-Box Attack Against No-Reference Image Quality Assessment Models.
IF 18.6 Pub Date: 2026-01-29 DOI: 10.1109/TPAMI.2026.3659164
Yujia Liu, Dingquan Li, Zhixuan Li, Tiejun Huang

No-Reference Image Quality Assessment (NR-IQA) models play an important role in various real-world applications. Recently, adversarial attacks against NR-IQA models have attracted increasing attention, as they provide valuable insights for revealing model vulnerabilities and guiding robust system design. Some effective attacks have been proposed against NR-IQA models in white-box settings, where the attacker has full access to the target model. However, these attacks often suffer from poor transferability to unknown target models in more realistic black-box scenarios, where the target model is inaccessible. This work makes the first attempt to address the challenge of low transferability in attacking NR-IQA models by proposing a transferable Signed Ensemble Gaussian black-box Attack (SEGA). The main idea is to approximate the gradient of the target model by applying Gaussian smoothing to source models and ensembling their smoothed gradients. To ensure the imperceptibility of adversarial perturbations, SEGA further removes inappropriate perturbations using a specially designed perturbation filter mask. Experimental results demonstrate the superior transferability of SEGA, validating its effectiveness in enabling successful transfer-based black-box attacks against NR-IQA models. Code for this paper is available at https://github.com/YogaLYJ/SEGA_IQA.
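The gradient-approximation idea can be sketched as follows, assuming each source model exposes a gradient function. The function name, smoothing scale `sigma`, and sample count are illustrative assumptions, and the paper's perturbation filter mask is omitted:

```python
import numpy as np

def sega_direction(x, source_grads, sigma=0.05, n_samples=8, seed=0):
    """Approximate the (inaccessible) target model's gradient: Gaussian-smooth
    each source model's gradient by averaging it over perturbed inputs, then
    take the sign of the ensemble average as the attack direction."""
    rng = np.random.default_rng(seed)
    acc = np.zeros_like(x)
    for g in source_grads:                     # ensemble over source models
        for _ in range(n_samples):             # Monte Carlo Gaussian smoothing
            acc += g(x + sigma * rng.standard_normal(x.shape))
    return np.sign(acc / (len(source_grads) * n_samples))
```

The signed output keeps the perturbation bounded per pixel, as in FGSM-style attacks, while the smoothing and ensembling are what transfer across unknown target models.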

Citations: 0
Privacy-Preserving Model Transcription With Differentially Private Synthetic Distillation.
IF 18.6 Pub Date: 2026-01-29 DOI: 10.1109/TPAMI.2026.3659110
Bochao Liu, Shiming Ge, Pengju Wang, Shikun Li, Tongliang Liu

While many deep learning models trained on private datasets have been deployed in various practical tasks, they may pose a privacy-leakage risk, as attackers could recover informative data or label knowledge from the models. In this work, we present privacy-preserving model transcription, a data-free model-to-model conversion solution that facilitates model deployment with a privacy guarantee. To this end, we propose a cooperative-competitive learning approach, termed differentially private synthetic distillation, that learns to convert a pretrained model (the teacher) into its privacy-preserving counterpart (the student) via a trainable generator, without access to private data. The learning collaborates among three players in a unified framework and performs alternating optimization: i) the generator learns to generate synthetic data; ii) the teacher and student accept the synthetic data and compute differentially private labels via flexible noisy perturbation of data or labels; and iii) the student is updated with the noisy labels, while the generator is updated by using the student as a discriminator for adversarial training. We theoretically prove that our approach guarantees differential privacy and convergence. The transcribed student achieves good performance with privacy protection, while the resulting generator can produce private synthetic data for downstream tasks. Extensive experiments clearly demonstrate that our approach outperforms 26 state-of-the-art methods.
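The "differentially private labels" step can be illustrated with a minimal Gaussian-mechanism sketch. The function name and `sigma` are assumptions; in practice `sigma` must be calibrated to the query's sensitivity and the privacy budget, and the paper's full three-player framework also trains the generator adversarially:

```python
import numpy as np

def dp_noisy_labels(teacher_logits, sigma=1.0, seed=0):
    """Perturb teacher outputs with Gaussian noise before they reach the
    student, so that each label query is released through the Gaussian
    mechanism rather than exposing the teacher's exact predictions."""
    rng = np.random.default_rng(seed)
    noisy = teacher_logits + sigma * rng.standard_normal(teacher_logits.shape)
    return noisy.argmax(axis=1)   # hard labels handed to the student
```

The student only ever sees noise-perturbed labels on synthetic inputs, which is what decouples its training from the private dataset.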

Citations: 0
Adversarial Imitation Learning with General Function Approximation: Theoretical Analysis and Practical Algorithms.
IF 18.6 Pub Date: 2026-01-26 DOI: 10.1109/TPAMI.2026.3657578
Tian Xu, Zhilong Zhang, Zexuan Chen, Ruishuo Chen, Yihao Sun, Yang Yu

Adversarial imitation learning (AIL), a prominent approach in imitation learning, has achieved significant practical success powered by neural network approximation. However, existing theoretical analyses of AIL are primarily confined to simplified settings, such as tabular and linear function approximation, and involve complex algorithmic designs that impede practical implementation. This creates a substantial gap between theory and practice. This paper bridges this gap by exploring the theoretical underpinnings of online AIL with general function approximation. We introduce a novel framework called optimization-based AIL (OPT-AIL), which performs online optimization for reward learning coupled with optimism-regularized optimization for policy learning. Within this framework, we develop two concrete methods: model-free OPT-AIL and model-based OPT-AIL. Our theoretical analysis demonstrates that both variants achieve polynomial expert sample complexity and interaction complexity for learning near-expert policies. To the best of our knowledge, they are the first provably efficient AIL methods under general function approximation. From a practical standpoint, OPT-AIL requires only the approximate optimization of two objectives, thereby facilitating practical implementation. Empirical studies demonstrate that OPT-AIL outperforms previous state-of-the-art deep AIL methods across several challenging tasks.
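As a toy analogy (explicitly not OPT-AIL itself, which optimizes over MDPs with optimism regularization), the alternation between reward learning and policy learning can be pictured as a bilinear game: a reward parameter is updated online to widen the expert/agent feature gap, while a policy feature vector chases the expert under the current reward, and averaged policy iterates approach the expert's features:

```python
import numpy as np

def alternating_ail_toy(expert_feat, n_steps=600, lr=0.1):
    """Toy bilinear adversarial imitation game on the objective
    w @ (expert_feat - mu): the reward player w ascends the gap, the policy
    player mu follows the current reward, and the running average of the
    policy iterates converges toward the equilibrium mu = expert_feat."""
    w = np.zeros_like(expert_feat)
    mu = np.zeros_like(expert_feat)
    mu_avg = np.zeros_like(expert_feat)
    for t in range(1, n_steps + 1):
        w = w + lr * (expert_feat - mu)   # reward player: online gradient step
        mu = mu + lr * w                  # policy player: move along reward
        mu_avg += (mu - mu_avg) / t       # running average of policy iterates
    return mu_avg
```

The need to average iterates in this toy game mirrors why AIL theory analyzes no-regret (online) updates rather than last-iterate convergence.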

Citations: 0
Parameter-Efficient Fine-Tuning Methods for Pretrained Language Models: A Critical Review and Assessment.
IF 18.6 Pub Date: 2026-01-26 DOI: 10.1109/TPAMI.2026.3657354
Lingling Xu, Haoran Xie, S Joe Qin, Xiaohui Tao, Fu Lee Wang

With the continuous growth in the number of parameters of Transformer-based pretrained language models (PLMs), particularly the emergence of large language models (LLMs) with billions of parameters, many natural language processing (NLP) tasks have demonstrated remarkable success. However, the enormous size and computational demands of these models pose significant challenges for adapting them to specific downstream tasks, especially in environments with limited computational resources. Parameter-Efficient Fine-Tuning (PEFT) offers an effective solution by reducing the number of fine-tuned parameters and the memory usage while achieving performance comparable to full fine-tuning. The demand for fine-tuning PLMs, especially LLMs, has led to a surge in the development of PEFT methods, as depicted in Fig. 1. In this paper, we present a comprehensive and systematic review of PEFT methods for PLMs. We summarize these PEFT methods, discuss their applications, and outline future directions. Furthermore, extensive experiments are conducted with several representative PEFT methods to better understand their parameter efficiency and memory efficiency. By offering insights into the latest advancements and practical applications, this survey serves as an invaluable resource for researchers and practitioners seeking to navigate the challenges and opportunities presented by PEFT in the context of PLMs.
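Among the reparameterization-based PEFT methods such a survey covers, LoRA-style low-rank adapters are the canonical example. A minimal numpy sketch (class name and initialization scale are illustrative, not any particular library's API) freezes the pretrained weight and trains only a low-rank update:

```python
import numpy as np

class LoRALinear:
    """LoRA-style adapter: the pretrained weight W0 stays frozen and a
    low-rank update B @ A carries all trainable parameters, cutting the
    fine-tuned parameter count from d_out*d_in to r*(d_out + d_in)."""
    def __init__(self, W0, r=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        self.W0 = W0                              # frozen pretrained weight
        d_out, d_in = W0.shape
        self.A = 0.01 * rng.standard_normal((r, d_in))  # small random init
        self.B = np.zeros((d_out, r))             # zero init: starts as W0
        self.scale = alpha / r                    # conventional LoRA scaling

    def forward(self, x):
        return x @ (self.W0 + self.scale * self.B @ self.A).T
```

Zero-initializing `B` means fine-tuning starts exactly from the pretrained behavior, which is one reason this family of methods trains stably.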

Citations: 0
TextMonkey: an OCR-Free Large Multimodal Model for Understanding Document.
IF 18.6 Pub Date: 2026-01-26 DOI: 10.1109/TPAMI.2026.3653415
Yuliang Liu, Biao Yang, Qiang Liu, Zhang Li, Zhiyin Ma, Shuo Zhang, Xiang Bai

We present TextMonkey, a large multimodal model (LMM) tailored for text-centric tasks. Our approach introduces enhancements along several dimensions. By adopting a Shifted Window Attention layer, we achieve cross-window connectivity at higher input resolutions and stabilize early training. We hypothesize that images may contain redundant tokens; by using similarity to filter the tokens and retain the significant ones, we not only shorten the token sequence but also enhance the model's performance. Moreover, by expanding the model's capabilities to encompass text spotting and grounding, and incorporating positional information into responses, we improve interpretability. Evaluation on 12 benchmarks shows notable improvements: 5.2% on scene-text-centric tasks (including STVQA, TextVQA, and OCRVQA), 6.9% on document-oriented tasks (such as DocVQA, InfoVQA, ChartVQA, DeepForm, Kleister Charity, and WikiTableQuestions), and 2.8% on key information extraction tasks (comprising FUNSD, SROIE, and POIE). TextMonkey also improves scene text spotting by 10.9% and sets a new standard on OCRBench, a comprehensive benchmark of 29 OCR-related assessments, with a score of 561, surpassing previous open-sourced large multimodal models for document understanding. Code is released at https://github.com/Yuliang-Liu/Monkey.
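The similarity-based token filtering idea can be sketched as a greedy cosine-similarity filter; the exact criterion and threshold TextMonkey uses may differ, and the function name here is an illustrative assumption:

```python
import numpy as np

def filter_redundant_tokens(tokens, threshold=0.9):
    """Greedy token compression: walk the token embeddings in order and drop
    a token if its cosine similarity to any already-kept token exceeds
    `threshold`. Returns the indices of the kept tokens."""
    normed = tokens / np.linalg.norm(tokens, axis=1, keepdims=True)
    kept = []
    for i, t in enumerate(normed):
        if all(t @ normed[j] <= threshold for j in kept):
            kept.append(i)   # sufficiently dissimilar from everything kept
    return kept
```

Shortening the token sequence this way reduces the quadratic attention cost at high input resolutions, which is the motivation the abstract gives for filtering.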

Citations: 0
Goal-oriented Dynamic Weight Optimization for Multi-Object Navigation.
IF 18.6 Pub Date: 2026-01-26 DOI: 10.1109/TPAMI.2026.3657778
Haitao Zeng, Xinhang Song, Shuqiang Jiang

Multi-object navigation (MON) tasks involve sequentially locating multiple targets in an unknown environment, requiring global long-term planning under incomplete information. This necessitates that the agent dynamically balance immediate actions and long-term rewards while considering both local adaptability and global foresight. However, current methods overly focus on local path optimization, which leads to slower convergence in sparse-reward settings and increases the risk of deadlocks or trap states. The core challenge of MON lies in the deformation of the shared decision space, where independent optimization leads to redundant and overlapping paths. Path planning therefore requires dynamic, cross-task optimization rather than simple subtask aggregation. To minimize overall effort, the optimization process should adaptively balance task contributions through weight adjustment. We therefore propose the Goal-oriented Dynamic Weight Optimization (GDWO) algorithm. GDWO integrates target-specific value loss functions into a unified optimization framework and dynamically adjusts weights through gradient-based updates. To prevent over-optimization, weights are normalized during training according to navigation success rates, prioritizing more challenging targets. This adaptive mechanism effectively addresses the challenge of sparse rewards and improves convergence efficiency. By leveraging this mechanism, GDWO unifies multiple objectives within a shared decision space, achieving efficient optimization and balancing short-term gains with long-term goals. Additionally, we introduce two auxiliary modules, prior-knowledge-based navigation and frontier-aware exploration, to further enhance GDWO's performance. Experimental results on the Gibson and Matterport3D datasets demonstrate that GDWO improves key metrics for MON tasks. It optimizes path planning, reduces exploration costs, and enhances navigation efficiency, enabling the agent to perform tasks more effectively in complex environments.
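The success-rate-normalized weighting can be illustrated with a simple softmax heuristic that gives harder targets larger loss weights; the actual GDWO update is gradient-based, and the function name and temperature parameter here are assumptions:

```python
import numpy as np

def gdwo_weights(success_rates, temperature=1.0):
    """Map per-target navigation success rates to normalized loss weights:
    targets with lower success rate get larger weight, and the weights sum
    to 1 so no single target dominates the multi-objective loss."""
    logits = -np.asarray(success_rates, dtype=float) / temperature
    w = np.exp(logits - logits.max())   # stable softmax
    return w / w.sum()
```

Lowering `temperature` sharpens the prioritization of struggling targets; raising it approaches uniform weighting across subtasks.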

Citations: 0
Wasserstein Distances Made Explainable: Insights into Dataset Shifts and Transport Phenomena.
IF 18.6 Pub Date: 2026-01-22 DOI: 10.1109/TPAMI.2026.3656947
Philip Naumann, Jacob Kauffmann, Gregoire Montavon

Wasserstein distances provide a powerful framework for comparing data distributions. They can be used to analyze processes over time or to detect inhomogeneities within data. However, simply calculating the Wasserstein distance or analyzing the corresponding transport plan (or coupling) may not be sufficient for understanding what factors contribute to a high or low Wasserstein distance. In this work, we propose a novel solution based on Explainable AI that allows us to efficiently and accurately attribute Wasserstein distances to various data components, including data subgroups, input features, or interpretable subspaces. Our method achieves high accuracy across diverse datasets and Wasserstein distance specifications, and its practical utility is demonstrated in three use cases.
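For one-dimensional marginals, the Wasserstein-1 distance between two equal-size empirical samples reduces to the mean absolute difference of their sorted values, which already gives a crude per-feature attribution of a dataset shift. This is a marginal heuristic for intuition only, not the paper's method, which attributes the full multivariate distance to data components:

```python
import numpy as np

def per_feature_w1(X, Y):
    """Per-feature Wasserstein-1 distances between two equal-size samples
    (rows = observations, columns = features): for 1-D empirical
    distributions, W1 is the mean absolute difference of sorted values."""
    assert X.shape == Y.shape, "equal sample sizes assumed for this shortcut"
    return np.mean(np.abs(np.sort(X, axis=0) - np.sort(Y, axis=0)), axis=0)
```

Features with large values in this vector are the ones whose marginal distributions drifted most between the two datasets.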

沃瑟斯坦距离为比较数据分布提供了一个强大的框架。它们可用于随着时间的推移分析进程或检测数据中的不同质性。然而,简单地计算Wasserstein距离或分析相应的运输计划(或耦合)可能不足以理解是什么因素导致了高或低的Wasserstein距离。在这项工作中,我们提出了一种基于可解释人工智能的新解决方案,使我们能够有效准确地将沃瑟斯坦距离归因于各种数据组件,包括数据子组、输入特征或可解释的子空间。我们的方法在不同的数据集和Wasserstein距离规范中实现了很高的精度,并在三个用例中证明了它的实用性。
{"title":"Wasserstein Distances Made Explainable: Insights into Dataset Shifts and Transport Phenomena.","authors":"Philip Naumann, Jacob Kauffmann, Gregoire Montavon","doi":"10.1109/TPAMI.2026.3656947","DOIUrl":"https://doi.org/10.1109/TPAMI.2026.3656947","url":null,"abstract":"<p><p>Wasserstein distances provide a powerful framework for comparing data distributions. They can be used to analyze processes over time or to detect inhomogeneities within data. However, simply calculating the Wasserstein distance or analyzing the corresponding transport plan (or coupling) may not be sufficient for understanding what factors contribute to a high or low Wasserstein distance. In this work, we propose a novel solution based on Explainable AI that allows us to efficiently and accurately attribute Wasserstein distances to various data components, including data subgroups, input features, or interpretable subspaces. Our method achieves high accuracy across diverse datasets and Wasserstein distance specifications, and its practical utility is demonstrated in three use cases.</p>","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"PP ","pages":""},"PeriodicalIF":18.6,"publicationDate":"2026-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146032289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Abstracting Concept-Changing Rules for Solving Raven's Progressive Matrix Problems. 求解Raven渐进矩阵问题的抽象概念变换规则。
IF 18.6 Pub Date : 2026-01-21 DOI: 10.1109/TPAMI.2026.3656670
Fan Shi, Bin Li, Xiangyang Xue

The abstract visual reasoning ability in human intelligence benefits discovering underlying rules in the novel environment. Raven's Progressive Matrix (RPM) is a classic test to realize such ability in machine intelligence by selecting from candidates. Recent studies suggest that solving RPM in an answer-generation way boosts a more in-depth understanding of rules. However, existing generative solvers cannot discover the global concept-changing rules without auxiliary supervision (e.g., rule annotations and distractors in candidate sets). To this end, we propose a deep latent variable model for Concept-changing Rule ABstraction (CRAB) by learning interpretable concepts and parsing concept-changing rules in the latent space. With the iterative learning process, CRAB can automatically abstract global rules shared on the dataset on each concept and form the learnable prior knowledge of global rules. CRAB outperforms the baselines trained without auxiliary supervision in the arbitrary-position answer generation task and achieves comparable and even higher accuracy than the compared models trained with auxiliary supervision. Finally, we conduct experiments to illustrate the interpretability of CRAB in concept learning, answer selection, and global rule abstraction.
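The notion of a global concept-changing rule can be made concrete with a toy symbolic check on RPM-style attribute grids. The rule set and grids below are invented for illustration; they stand in for what CRAB abstracts automatically in latent space from raw panels:

```python
# Each panel of a 3x3 RPM grid is reduced to one integer attribute per
# concept (e.g. size or shade); a rule holds if it explains every row.
RULES = {
    "constant":    lambda a, b, c: a == b == c,
    "progression": lambda a, b, c: (b - a) == (c - b) != 0,
    "arithmetic":  lambda a, b, c: a + b == c,
}

def abstract_rule(grid):
    """Return the names of the rules consistent with all rows of the grid."""
    return [name for name, ok in RULES.items()
            if all(ok(*row) for row in grid)]

size_grid = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]    # +1 progression per row
shade_grid = [[1, 2, 3], [2, 2, 4], [3, 1, 4]]   # a + b = c per row
```

A correct abstraction must hold globally: `[1, 2, 3]` alone satisfies both progression and arithmetic, and only checking all rows disambiguates the shared rule — the same global-consistency requirement that distinguishes rule abstraction from per-panel pattern matching.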

人类智能中的抽象视觉推理能力有助于在新环境中发现潜在的规则。Raven's Progressive Matrix (RPM)是一种经典的测试,通过从候选对象中进行选择来实现机器智能中的这种能力。最近的研究表明,以答案生成的方式解决RPM可以促进对规则的更深入理解。然而,现有的生成求解器在没有辅助监督(如候选集中的规则注释和干扰物)的情况下无法发现全局的概念变化规则。为此,我们通过学习可解释的概念和解析潜在空间中的概念变化规则,提出了一种用于概念变化规则抽象(CRAB)的深层潜变量模型。通过迭代学习的过程,CRAB可以自动地将数据集上共享的全局规则抽象到每个概念上,形成全局规则的可学习先验知识。在任意位置答案生成任务中,CRAB优于未经辅助监督训练的基线,并且与经过辅助监督训练的模型相比,准确率相当甚至更高。最后,我们通过实验来说明螃蟹在概念学习、答案选择和全局规则抽象方面的可解释性。
{"title":"Abstracting Concept-Changing Rules for Solving Raven's Progressive Matrix Problems.","authors":"Fan Shi, Bin Li, Xiangyang Xue","doi":"10.1109/TPAMI.2026.3656670","DOIUrl":"https://doi.org/10.1109/TPAMI.2026.3656670","url":null,"abstract":"<p><p>The abstract visual reasoning ability in human intelligence benefits discovering underlying rules in the novel environment. Raven's Progressive Matrix (RPM) is a classic test to realize such ability in machine intelligence by selecting from candidates. Recent studies suggest that solving RPM in an answer-generation way boosts a more in-depth understanding of rules. However, existing generative solvers cannot discover the global concept-changing rules without auxiliary supervision (e.g., rule annotations and distractors in candidate sets). To this end, we propose a deep latent variable model for Concept-changing Rule ABstraction (CRAB) by learning interpretable concepts and parsing concept-changing rules in the latent space. With the iterative learning process, CRAB can automatically abstract global rules shared on the dataset on each concept and form the learnable prior knowledge of global rules. CRAB outperforms the baselines trained without auxiliary supervision in the arbitrary-position answer generation task and achieves comparable and even higher accuracy than the compared models trained with auxiliary supervision. Finally, we conduct experiments to illustrate the interpretability of CRAB in concept learning, answer selection, and global rule abstraction.</p>","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"PP ","pages":""},"PeriodicalIF":18.6,"publicationDate":"2026-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146021178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Active Adversarial Noise Suppression for Image Forgery Localization. 主动对抗噪声抑制在图像伪造定位中的应用。
IF 18.6 Pub Date : 2026-01-21 DOI: 10.1109/TPAMI.2026.3656742
Rongxuan Peng, Shunquan Tan, Xianbo Mo, Alex C Kot, Jiwu Huang

Recent advances in deep learning have significantly propelled the development of image forgery localization. However, existing models remain highly vulnerable to adversarial attacks: imperceptible noise added to forged images can severely mislead these models. In this paper, we address this challenge with an Adversarial Noise Suppression Module (ANSM) that generates a defensive perturbation to suppress the attack effect of adversarial noise. We observe that forgery-relevant features extracted from adversarial and original forged images exhibit distinct distributions. To bridge this gap, we introduce Forgery-relevant Features Alignment (FFA) as a first-stage training strategy, which reduces distributional discrepancies by minimizing the channel-wise Kullback-Leibler divergence between these features. To further refine the defensive perturbation, we design a second-stage training strategy, termed Mask-guided Refinement (MgR), which incorporates a dual-mask constraint. MgR ensures that the defensive perturbation remains effective for both adversarial and original forged images, recovering forgery localization accuracy to their original level. Extensive experiments across various attack algorithms demonstrate that our method significantly restores the forgery localization model's performance on adversarial images. Notably, when ANSM is applied to original forged images, the performance remains nearly unaffected. To our best knowledge, this is the first report of adversarial defense in image forgery localization tasks. We have released the source code and anti-forensics dataset.
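The channel-wise Kullback-Leibler alignment that FFA minimises can be sketched as follows. The softmax normalisation over spatial responses, the function name, and the shapes here are our own assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def channelwise_kl(f_adv, f_orig, eps=1e-8):
    # Normalise each channel's responses into a distribution with a
    # softmax, then average the per-channel KL divergences KL(p || q).
    def softmax(x):
        z = np.exp(x - x.max(axis=1, keepdims=True))
        return z / z.sum(axis=1, keepdims=True)
    p = softmax(np.asarray(f_adv, float))
    q = softmax(np.asarray(f_orig, float))
    return float((p * (np.log(p + eps) - np.log(q + eps))).sum(axis=1).mean())

# Toy forgery-relevant features: 8 channels with 16 responses each;
# the "adversarial" copy is the original plus perturbation-induced noise.
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))
noisy = feats + rng.normal(scale=0.5, size=feats.shape)
```

The loss is zero when the two feature sets coincide and positive once the adversarial noise distorts them — minimising it pushes the defensively-perturbed features back toward the clean distribution.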

深度学习的最新进展极大地推动了图像伪造定位的发展。然而,现有的模型仍然极易受到对抗性攻击:伪造图像中添加的难以察觉的噪声会严重误导这些模型。在本文中,我们使用对抗噪声抑制模块(ANSM)来解决这一挑战,该模块产生防御性扰动以抑制对抗噪声的攻击效果。我们观察到,从对抗和原始伪造图像中提取的伪造相关特征表现出不同的分布。为了弥补这一差距,我们引入了与伪造相关的特征对齐(FFA)作为第一阶段的训练策略,它通过最小化这些特征之间的渠道Kullback-Leibler分歧来减少分布差异。为了进一步改进防御性扰动,我们设计了第二阶段的训练策略,称为掩码引导的改进(MgR),它包含了双掩码约束。MgR确保防御摄动对对抗和原始伪造图像仍然有效,将伪造定位精度恢复到原始水平。各种攻击算法的大量实验表明,我们的方法显着恢复了伪造定位模型在对抗图像上的性能。值得注意的是,当ANSM应用于原始伪造图像时,性能几乎不受影响。据我们所知,这是图像伪造定位任务中对抗性防御的第一份报告。我们已经发布了源代码和反取证数据集。
{"title":"Active Adversarial Noise Suppression for Image Forgery Localization.","authors":"Rongxuan Peng, Shunquan Tan, Xianbo Mo, Alex C Kot, Jiwu Huang","doi":"10.1109/TPAMI.2026.3656742","DOIUrl":"https://doi.org/10.1109/TPAMI.2026.3656742","url":null,"abstract":"<p><p>Recent advances in deep learning have significantly propelled the development of image forgery localization. However, existing models remain highly vulnerable to adversarial attacks: imperceptible noise added to forged images can severely mislead these models. In this paper, we address this challenge with an Adversarial Noise Suppression Module (ANSM) that generates a defensive perturbation to suppress the attack effect of adversarial noise. We observe that forgery-relevant features extracted from adversarial and original forged images exhibit distinct distributions. To bridge this gap, we introduce Forgery-relevant Features Alignment (FFA) as a first-stage training strategy, which reduces distributional discrepancies by minimizing the channel-wise Kullback-Leibler divergence between these features. To further refine the defensive perturbation, we design a second-stage training strategy, termed Mask-guided Refinement (MgR), which incorporates a dual-mask constraint. MgR ensures that the defensive perturbation remains effective for both adversarial and original forged images, recovering forgery localization accuracy to their original level. Extensive experiments across various attack algorithms demonstrate that our method significantly restores the forgery localization model's performance on adversarial images. Notably, when ANSM is applied to original forged images, the performance remains nearly unaffected. To our best knowledge, this is the first report of adversarial defense in image forgery localization tasks. We have released the source code and anti-forensics dataset.</p>","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"PP ","pages":""},"PeriodicalIF":18.6,"publicationDate":"2026-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146021125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0