
International Joint Conference on Artificial Intelligence: Latest Articles

SeRO: Self-Supervised Reinforcement Learning for Recovery from Out-of-Distribution Situations
Pub Date : 2023-08-01 DOI: 10.24963/ijcai.2023/432
Chan Kim, JaeKyung Cho, C. Bobda, Seung-Woo Seo, Seong-Woo Kim
Robotic agents trained using reinforcement learning have the problem of taking unreliable actions in an out-of-distribution (OOD) state. Agents can easily become OOD in real-world environments because it is almost impossible for them to visit and learn the entire state space during training. Unfortunately, unreliable actions do not ensure that agents perform their original tasks successfully. Therefore, agents should be able to recognize whether they are in OOD states and learn how to return to the learned state distribution rather than continue to take unreliable actions. In this study, we propose a novel method for retraining agents to recover from OOD situations in a self-supervised manner when they fall into OOD states. Our in-depth experimental results demonstrate that our method substantially improves the agent’s ability to recover from OOD situations in terms of sample efficiency and restoration of the performance for the original tasks. Moreover, we show that our method can retrain the agent to recover from OOD situations even when in-distribution states are difficult to visit through exploration. Code and supplementary materials are available at https://github.com/SNUChanKim/SeRO.
Citations: 0
Bayesian Optimization with Switching Cost: Regret Analysis and Lookahead Variants
Pub Date : 2023-08-01 DOI: 10.24963/ijcai.2023/446
Peng Liu, Haowei Wang, Wei Qiyu
Bayesian Optimization (BO) has recently received increasing attention due to its efficiency in optimizing expensive-to-evaluate functions. For some practical problems, it is essential to consider the path-dependent switching cost between consecutive sampling locations given a total traveling budget. For example, when using a drone to locate cracks in a building wall or search for lost survivors in the wild, the search path needs to be efficiently planned given the limited battery power of the drone. Tackling such problems requires a careful cost-benefit analysis of candidate locations and balancing exploration and exploitation. In this work, we formulate such a problem as a constrained Markov Decision Process (MDP) and solve it by proposing a new distance-adjusted multi-step look-ahead acquisition function, the distUCB, and using rollout approximation. We also provide a theoretical regret analysis of the distUCB-based Bayesian optimization algorithm. In addition, the empirical performance of the proposed algorithm is tested based on both synthetic and real data experiments, and it shows that our cost-aware non-myopic algorithm performs better than other popular alternatives.
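The idea of a distance-adjusted acquisition function under a travel budget can be illustrated with a minimal one-step sketch. Everything here is an assumption made for illustration: the abstract does not give the form of distUCB, so the linear distance penalty, the names `dist_adjusted_ucb` and `pick_next`, and the parameters `beta` and `cost_weight` are hypothetical, and the paper's multi-step lookahead with rollout is not reproduced.

```python
import math


def dist_adjusted_ucb(mean, std, candidate, current, beta=2.0, cost_weight=0.5):
    """Upper confidence bound minus a travel penalty (illustrative form only)."""
    travel = math.dist(candidate, current)  # Euclidean switching cost (assumed)
    return mean + beta * std - cost_weight * travel


def pick_next(candidates, posterior, current, budget, beta=2.0, cost_weight=0.5):
    """Return the best-scoring candidate reachable within the budget, or None."""
    best, best_score = None, -math.inf
    for x in candidates:
        if math.dist(x, current) > budget:
            continue  # skip locations the remaining travel budget cannot reach
        mean, std = posterior(x)  # hypothetical surrogate: x -> (mean, std)
        score = dist_adjusted_ucb(mean, std, x, current, beta, cost_weight)
        if score > best_score:
            best, best_score = x, score
    return best
```

With a flat posterior, the penalty alone decides the choice, so the nearest reachable candidate wins; a non-myopic variant would instead score whole sequences of locations against the remaining budget.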
Citations: 0
Front-to-End Bidirectional Heuristic Search with Consistent Heuristics: Enumerating and Evaluating Algorithms and Bounds
Pub Date : 2023-08-01 DOI: 10.24963/ijcai.2023/625
Lior Siag, Shahaf S. Shperberg, Ariel Felner, Nathan R Sturtevant
Recent research on bidirectional heuristic search (BiHS) is based on the must-expand pairs theory (MEP theory), which describes which pairs of nodes must be expanded during the search to guarantee the optimality of solutions. A separate line of research in BiHS has proposed algorithms that use lower bounds that are derived from consistent heuristics during search. This paper links these two directions, providing a comprehensive unifying view and showing that both existing and novel algorithms can be derived from the MEP theory. An extended set of bounds is formulated, encompassing both previously discovered bounds and new ones. Finally, the bounds are empirically evaluated by their contribution to the efficiency of the search.
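For readers unfamiliar with must-expand pairs, the sketch below shows the basic pair lower bound as it is commonly stated in the earlier BiHS literature, not one of the new bounds enumerated in this paper. The function names and the argument layout are assumptions chosen for illustration.

```python
def pair_lower_bound(g_f, h_f, g_b, h_b):
    """Lower bound on the cost of any path through forward node u and backward node v.

    g_f, h_f: cost-so-far and heuristic of u in the forward search.
    g_b, h_b: cost-so-far and heuristic of v in the backward search.
    """
    return max(g_f + h_f, g_b + h_b, g_f + g_b)


def is_must_expand_pair(g_f, h_f, g_b, h_b, optimal_cost):
    """A pair must be expanded (one node of it) when its bound is below C*."""
    return pair_lower_bound(g_f, h_f, g_b, h_b) < optimal_cost
```

For example, a pair with forward values (3, 4) and backward values (2, 6) has bound 8, so it is a must-expand pair when the optimal cost is 10 but not when it is 8.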
Citations: 0
Plansformer Tool: Demonstrating Generation of Symbolic Plans Using Transformers
Pub Date : 2023-08-01 DOI: 10.24963/ijcai.2023/839
Vishal Pallagani, Bharath Muppasani, Biplav Srivastava, F. Rossi, L. Horesh, K. Murugesan, Andrea Loreggia, F. Fabiano, Rony Joseph, Yathin Kethepalli
Plansformer is a novel tool that utilizes a fine-tuned language model based on transformer architecture to generate symbolic plans. Transformers are a type of neural network architecture that has been shown to be highly effective in a range of natural language processing tasks. Unlike traditional planning systems that use heuristic-based search strategies, Plansformer is fine-tuned on specific classical planning domains to generate high-quality plans that are both fluent and feasible. Plansformer takes the domain and problem files as input (in PDDL) and outputs a sequence of actions that can be executed to solve the problem. We demonstrate the effectiveness of Plansformer on a variety of benchmark problems and provide both qualitative and quantitative results obtained during our evaluation, including its limitations. Plansformer has the potential to significantly improve the efficiency and effectiveness of planning in various domains, from logistics and scheduling to natural language processing and human-computer interaction. In addition, we provide public access to Plansformer via a website as well as an API endpoint; this enables other researchers to utilize our tool for planning and execution. The demo video is available at https://youtu.be/_1rlctCGsrk
Citations: 1
Annealing Genetic-based Preposition Substitution for Text Rubbish Example Generation
Pub Date : 2023-08-01 DOI: 10.24963/ijcai.2023/569
Chen Li, Xinghao Yang, Baodi Liu, Weifeng Liu, Honglong Chen
Modern Natural Language Processing (NLP) models exhibit under-sensitivity towards text rubbish examples. A text rubbish example is a heavily modified input text that is nonsensical to humans but does not change the model’s prediction. Prior work crafts rubbish examples by iteratively deleting words and determining the deletion order with beam search. However, the produced rubbish examples usually cause a reduction in model confidence and sometimes deliver human-readable text. To address these problems, we propose an Annealing Genetic based Preposition Substitution (AGPS) algorithm for text rubbish sample generation with two major merits. Firstly, AGPS crafts rubbish text examples by substituting input words with meaningless prepositions instead of directly removing them, which brings less degradation to the model’s confidence. Secondly, we design an Annealing Genetic algorithm to optimize the word replacement priority, which allows the Genetic Algorithm (GA) to jump out of local optima with some probability. This is significant in achieving better objectives, i.e., a high word modification rate and a high model confidence. Experimental results on five popular datasets demonstrate the superiority of AGPS over the baseline and expose the fact that NLP models cannot really understand the semantics of sentences, as they give the same prediction with even higher confidence for the nonsensical preposition sequences.
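The two ingredients named in the abstract, preposition substitution by priority and an annealing step that lets a genetic algorithm accept worse candidates with some probability, can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' AGPS code: the Metropolis-style acceptance rule, the function names `anneal_accept` and `substitute`, and the `priority` list are all hypothetical.

```python
import math
import random


def anneal_accept(parent_fitness, child_fitness, temperature, rng=random):
    """Keep improvements always; keep regressions with a temperature-scaled chance."""
    if child_fitness >= parent_fitness:
        return True
    if temperature <= 0:
        return False  # fully cooled: behave like a plain greedy GA
    drop = child_fitness - parent_fitness  # negative for a worse child
    return rng.random() < math.exp(drop / temperature)


def substitute(words, priority, k, prepositions, rng=random):
    """Replace the k highest-priority positions with randomly chosen prepositions."""
    order = sorted(range(len(words)), key=lambda i: priority[i], reverse=True)
    out = list(words)
    for i in order[:k]:
        out[i] = rng.choice(prepositions)
    return out
```

In a full search loop the fitness would combine the word modification rate with the victim model's confidence, and the temperature would be lowered over generations so that late iterations accept fewer regressions.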
Citations: 0
Automatic Recognition of the General-Purpose Communicative Functions Defined by the ISO 24617-2 Standard for Dialog Act Annotation (Extended Abstract)
Pub Date : 2023-08-01 DOI: 10.24963/ijcai.2023/788
Eugénio Ribeiro, Ricardo Ribeiro, David Martins de Matos
From the perspective of a dialog system, the identification of the intention behind the segments in a dialog is important, as it provides cues regarding the information present in the segments and how they should be interpreted. The ISO 24617-2 standard for dialog act annotation defines a hierarchically organized set of general-purpose communicative functions that correspond to different intentions that are relevant in the context of a dialog. In this paper, we explore the automatic recognition of these functions. To do so, we propose to adapt existing approaches to dialog act recognition, so that they can deal with the hierarchical classification problem. More specifically, we propose the use of an end-to-end hierarchical network with cascading outputs and maximum a posteriori path estimation to predict the communicative function at each level of the hierarchy, preserve the dependencies between the functions in the path, and decide at which level to stop. Additionally, we rely on transfer learning processes to address the data scarcity problem. Our experiments on the DialogBank show that this approach outperforms both flat and hierarchical approaches based on multiple classifiers and that each of its components plays an important role in the recognition of general-purpose communicative functions.
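Maximum a posteriori path estimation over a label hierarchy can be illustrated with a small sketch: instead of greedily taking the most probable label at each level, every root-to-stop path is scored by the product of its per-level probabilities and the best one is kept. The data layout (a dict from node to child probabilities with a `"<stop>"` entry), the function name `map_path`, and the toy labels are assumptions for illustration and do not reproduce the paper's network or the ISO 24617-2 label set.

```python
def map_path(probs, root="root", stop="<stop>"):
    """Return (path, probability) of the most probable root-to-stop path.

    probs maps a node to {child_or_stop: probability}. Nodes missing from
    probs are treated as leaves, where the path stops with probability 1.
    """
    best_path, best_p = [], 0.0
    stack = [(root, [], 1.0)]
    while stack:
        node, path, p = stack.pop()
        for label, q in probs.get(node, {stop: 1.0}).items():
            if label == stop:
                if p * q > best_p:  # a complete path: compare joint probability
                    best_path, best_p = path, p * q
            else:
                stack.append((label, path + [label], p * q))
    return best_path, best_p


# Toy hierarchy: greedy decoding would descend into "A" (0.55), but both
# completions of "A" score 0.275, so the MAP path is the shorter ["B"].
toy = {"root": {"A": 0.55, "B": 0.45}, "A": {"A1": 0.5, "A2": 0.5}}
best, p = map_path(toy)
```

The stop entry is what lets the decoder decide at which level of the hierarchy to end the path, which is the behaviour the abstract describes.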
Citations: 0
Fluid Dynamics-Inspired Network for Infrared Small Target Detection
Pub Date : 2023-08-01 DOI: 10.24963/ijcai.2023/66
Tianxiang Chen, Q. Chu, B. Liu, Nenghai Yu
Most infrared small target detection (ISTD) networks focus on building effective neural blocks or feature fusion modules, but none describes the ISTD process from the image evolution perspective. The directional evolution of image pixels influenced by convolution, pooling and surrounding pixels is analogous to the movement of fluid elements constrained by surrounding variables and particles. Inspired by this, we explore a novel research routine by abstracting the movement of pixels in the ISTD process as the flow of fluid in fluid dynamics (FD). Specifically, a new Fluid Dynamics-Inspired Network (FDI-Net) is devised for ISTD. Based on the Taylor Central Difference (TCD) method, the TCD feature extraction block is designed, where convolution and Transformer structures are combined for local and global information. The pixel motion equation during the ISTD process is derived from the Navier–Stokes (N-S) equation, constructing an N-S Refinement Module that refines extracted features with edge details. Thus, the TCD feature extraction block determines the primary movement direction of pixels during detection, while the N-S Refinement Module corrects some skewed directions of the pixel stream to supplement the edge details. Experiments on IRSTD-1k and SIRST demonstrate that our method achieves SOTA performance in terms of evaluation metrics.
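The numerical scheme that the TCD block is named after is the standard central difference, obtained by subtracting the Taylor expansions of f(x + h) and f(x - h). The sketch below shows only that scheme, on a function and on a row of pixel intensities; it is not the FDI-Net feature extraction block, and the helper names are hypothetical.

```python
def central_difference(f, x, h=1e-4):
    """Second-order accurate estimate of f'(x) from two symmetric samples."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def central_difference_row(values, spacing=1.0):
    """Central differences at the interior points of a 1-D row of samples."""
    return [
        (values[i + 1] - values[i - 1]) / (2.0 * spacing)
        for i in range(1, len(values) - 1)
    ]


slope = central_difference(lambda x: x ** 3, 2.0)  # exact derivative is 12
gradients = central_difference_row([1, 4, 9, 16])
```

Because the even-order Taylor terms cancel, the truncation error shrinks with the square of the step size, which is why this form is usually preferred over a one-sided difference.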
Citations: 0
AI Planning for Hybrid Systems
Pub Date : 2023-08-01 DOI: 10.24963/ijcai.2023/805
Enrico Scala
When planning the tasks of physical entities that need to perform actions in the world (e.g., a robot), it is necessary to take into account quite complex models to ensure that the plan is actually executable. Indeed, the state of these systems evolves according to potentially non-linear dynamics where interdependent discrete and continuous changes happen over the entire course of the task. Systems of this kind are typically compactly represented in planning using languages mixing propositional logic and mathematics. However, these languages are still poorly understood and exploited. What are the difficulties for planning in these settings? How can we build systems that can scale up over realistically sized problems? What are the domains which can benefit from these languages? This short paper presents the two main ingredients that are needed to build a heuristic search planner, outlines the main impact that such techniques have on applications, and provides some open challenges. These models and the related planners hold the promise to deliver explainable AI solutions that do not rely on large amounts of data.
Citations: 0
SSML-QNet: Scale-Separative Metric Learning Quadruplet Network for Multi-modal Image Patch Matching
Pub Date : 2023-08-01 DOI: 10.24963/ijcai.2023/511
Xiuwei Zhang, Yi Sun, Yamin Han, Yanping Li, Hanlin Yin, Yinghui Xing, Yanning Zhang
Multi-modal image matching is very challenging due to the significant differences in visual appearance between images of different modalities. Typically, the existing well-performing methods mainly focus on learning invariant and discriminative features for measuring the relation between multi-modal image pairs. However, these methods often take the features as a whole and largely overlook the fact that features at different scales for the same image pair may have different similarity, which may lead to sub-optimal results. In this work, we propose a Scale-Separative Metric Learning Quadruplet network (SSML-QNet) for multi-modal image patch matching. Specifically, SSML-QNet can extract both relevant and irrelevant features of imaging modality with the proposed quadruplet network architecture. Then, the proposed Scale-Separative Metric Learning module separately encodes the similarity of different scale features with the pyramid structure. For each scale, cross-modal consistent features are extracted and measured by coordinate and channel-wise attention sequentially. This makes our network robust to appearance divergence caused by different imaging mechanisms. Experiments on the benchmark datasets (VIS-NIR, VIS-LWIR, Optical-SAR, and Brown) have verified that the proposed SSML-QNet is able to outperform other state-of-the-art methods. Furthermore, the cross-dataset transfer experiments on these four datasets also show that the proposed method transfers well across datasets.
Citations: 0
Beyond Homophily: Robust Graph Anomaly Detection via Neural Sparsification
Pub Date: 2023-08-01 DOI: 10.24963/ijcai.2023/234
Zheng Gong, Guifeng Wang, Ying Sun, Qi Liu, Yuting Ning, H. Xiong, Jingyu Peng
Recently, graph-based anomaly detection (GAD) has attracted rising attention due to its effectiveness in identifying anomalies in relational and structured data. Unfortunately, the performance of most existing GAD methods suffers from the inherent structural noise of graphs, which is induced by hidden anomalies connected to a considerable number of benign nodes. In this work, we propose SparseGAD, a novel GAD framework that sparsifies the structures of target graphs to effectively reduce noise and collaboratively learns node representations. It then robustly detects anomalies by uncovering the underlying dependency among node pairs in terms of homophily and heterophily, two essential connection properties of GAD. Extensive experiments on real-world GAD datasets demonstrate that the proposed framework achieves significantly better detection quality than state-of-the-art methods, even when the graph is heavily attacked. Code will be available at https://github.com/KellyGong/SparseGAD.git.
Citations: 0