International Joint Conference on Artificial Intelligence: Latest Papers

A Bitwise GAC Algorithm for Alldifferent Constraints

International Joint Conference on Artificial Intelligence

Pub Date : 2023-08-01 DOI: 10.24963/ijcai.2023/221

Z. Li, Yao-Ming Wang, Zhanshan Li

The generalized arc consistency (GAC) algorithm is the prevailing approach to solving problems with alldifferent constraints. The core of GAC for alldifferent constraints is finding and enumerating all the strongly connected components (SCCs) of the graph model. This requires a large number of complex data structures to maintain node information, which incurs substantial overhead in both time and memory. More critically, the complexity of these data structures makes it hard to coordinate different optimization schemes for GAC. The key observation of this paper is that the GAC algorithm only cares whether a node of the graph model is in an SCC, not which SCC it belongs to. Based on this observation, we propose AllDiffbit, which employs bitwise data structures and operations to efficiently determine whether a node is in an SCC. This greatly reduces the corresponding overhead and makes it easier for existing optimizations to work together. Our experiments show that AllDiffbit outperforms the state-of-the-art GAC algorithms over 60%.
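The observation that only SCC membership matters can be illustrated with a small bitset sketch. This is not the AllDiffbit algorithm from the paper: it is a generic fixpoint reachability computation, and the function names, the graph encoding (one Python integer per node, used as a bitmask of successors), and the example graph are all assumptions made for illustration. A node lies in a non-trivial SCC exactly when it can reach itself along at least one edge.

```python
def reachability_bitsets(succ):
    """Return reach[v]: bitmask of nodes reachable from v via >= 1 edge.

    succ[v] is an int whose bit u is set when there is an edge v -> u
    (hypothetical encoding chosen for this sketch).
    """
    reach = list(succ)  # start with paths of length 1
    changed = True
    while changed:  # iterate to a fixpoint
        changed = False
        for v in range(len(succ)):
            acc = reach[v]
            rest = acc
            while rest:
                low = rest & -rest          # isolate the lowest set bit
                u = low.bit_length() - 1    # index of that bit
                acc |= reach[u]             # everything u reaches, v reaches
                rest ^= low
            if acc != reach[v]:
                reach[v] = acc
                changed = True
    return reach


def in_nontrivial_scc(succ):
    """True for each node that lies on a directed cycle.

    A self-loop would also count here; bipartite value graphs have none.
    """
    reach = reachability_bitsets(succ)
    return [bool((reach[v] >> v) & 1) for v in range(len(succ))]


# Example graph: 0 -> 1 -> 2 -> 0 forms a cycle, and 2 -> 3 leaves it.
example = [0b0010, 0b0100, 0b1001, 0b0000]
membership = in_nontrivial_scc(example)
```

The sketch never labels components; it only answers the yes/no membership question, which is the kind of information the abstract says GAC needs.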

Citations: 0

Diagram Visual Grounding: Learning to See with Gestalt-Perceptual Attention

International Joint Conference on Artificial Intelligence

Pub Date : 2023-08-01 DOI: 10.24963/ijcai.2023/93

Xin Hu, Lingling Zhang, Jun Liu, Xinyu Zhang, Wenjun Wu, Qianying Wang

Diagram visual grounding aims to capture the correlation between language expressions and local objects in a diagram, and plays an important role in applications such as textbook question answering and cross-modal retrieval. Most diagrams consist of a few colors and simple geometries. This results in sparse low-level visual features, which further widens the gap between the low-level visual and high-level semantic features of diagrams and makes diagram visual grounding challenging. To address these issues, we propose a gestalt-perceptual attention model to align diagram objects with language expressions. For low-level visual features, inspired by gestalt principles, which model the human visual system, we build a gestalt-perception graph network to complement the features learned by the traditional backbone network. For high-level semantic features, we design a multi-modal context attention mechanism to facilitate the interaction between diagrams and language expressions, so as to enhance the semantics of diagrams. Finally, guided by diagram features and linguistic embeddings, the target query is gradually decoded to generate the coordinates of the referred object. Through comprehensive experiments on diagrams and natural images, we demonstrate that the proposed model achieves superior performance over its competitors. Our code will be released at https://github.com/AIProCode/GPA.

Citations: 0

AI and Decision Support for Sustainable Socio-Ecosystems

International Joint Conference on Artificial Intelligence

Pub Date : 2023-08-01 DOI: 10.24963/ijcai.2023/707

Dimitri Justeau‐Allaire

The conservation and restoration of biodiversity, in accordance with human well-being, is a necessary condition for achieving several Sustainable Development Goals. However, there is still a significant gap between biodiversity research and the management of natural areas. This research project aims to reduce this gap by proposing spatial planning methods that robustly and accurately integrate socio-ecological issues. Artificial intelligence, and notably Constraint Programming, will play a central role and will make it possible to remove the methodological obstacles that prevent us from properly addressing the complexity and heterogeneity of sustainability issues in the management of ecosystems. The project is organized along three axes: (i) integrating socio-ecological dynamics into spatial planning, (ii) relying on adequate landscape metrics in spatial planning, and (iii) scaling up the performance of spatial planning methods. The main study context of this project is the sustainable management of tropical forests, with a particular focus on New Caledonia and West Africa.

Citations: 0

CADParser: A Learning Approach of Sequence Modeling for B-Rep CAD

International Joint Conference on Artificial Intelligence

Pub Date : 2023-08-01 DOI: 10.24963/ijcai.2023/200

Shengdi Zhou, Tianyi Tang, Bin Zhou

Computer-Aided Design (CAD) plays a crucial role in industrial manufacturing by providing geometry information and the construction workflow for manufactured objects. The construction information enables effective re-editing of parametric CAD models. While boundary representation (B-Rep) is the standard format for representing geometry structures, there are no uniform criteria for storing the construction workflow, so the JSON format is used as an alternative. Regrettably, most CAD models available on the Internet only offer geometry information, omitting the construction procedure and hampering creation efficiency. This paper proposes a learning approach, CADParser, to infer the underlying modeling sequences given a B-Rep CAD model. It achieves this by treating the CAD geometry structure as a graph and the construction workflow as a sequence. Since the existing CAD dataset only contains two operations (i.e., Sketch and Extrusion), which limits the diversity of CAD model creation, we also introduce a large-scale dataset incorporating a more comprehensive range of operations such as Revolution, Fillet, and Chamfer. Each model includes both the geometry structure and the construction sequences. Extensive experiments demonstrate that our method can compete with the existing state-of-the-art methods quantitatively and qualitatively. Data is available at https://drive.google.com/CADParserData.

Citations: 0

Multi-Scale Subgraph Contrastive Learning

International Joint Conference on Artificial Intelligence

Pub Date : 2023-08-01 DOI: 10.24963/ijcai.2023/246

Yanbei Liu, Yu Zhao, Xiao Wang, Lei Geng, Zhitao Xiao

Graph-level contrastive learning, which aims to learn a representation for each graph by contrasting two augmented graphs, has attracted considerable attention. Previous studies usually simply assume that a graph and its augmented graph form a positive pair, and that any other pair is a negative pair. However, it is well known that graph structure is always complex and multi-scale, which gives rise to a fundamental question: after graph augmentation, will the previous assumption still hold in reality? Through an experimental analysis, we discover that the semantic information of an augmented graph structure may not be consistent with that of the original graph structure, and that whether two augmented graphs form a positive or negative pair is highly related to the multi-scale structures. Based on this finding, we propose a multi-scale subgraph contrastive learning architecture that is able to characterize fine-grained semantic information. Specifically, we generate global and local views at different scales based on subgraph sampling, and construct multiple contrastive relationships according to their semantic associations to provide richer self-supervised signals. Extensive experiments and parametric analyses on eight real-world graph classification datasets demonstrate the effectiveness of the proposed method.
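For readers unfamiliar with the positive/negative pair assumption discussed above, a minimal sketch of a generic InfoNCE-style contrastive loss may help. This is a standard formulation and not the multi-scale objective proposed in the paper; the function names, the temperature value, and the toy two-dimensional embeddings are assumptions chosen for illustration, and all vectors are assumed to be non-zero.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.5):
    """Generic InfoNCE loss for one anchor embedding.

    The loss is small when the anchor is closer to its positive view
    than to every negative view.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy embeddings (hypothetical): a graph, an augmented view, another graph.
graph = [1.0, 0.0]
augmented_view = [1.0, 0.0]
other_graph = [0.0, 1.0]

loss_when_assumption_holds = info_nce(graph, augmented_view, [other_graph])
loss_when_assumption_fails = info_nce(graph, other_graph, [augmented_view])
```

If an augmentation changes the semantics of a graph, the pair labelled positive is effectively the second case, and the loss then pushes the representation in the wrong direction. That is the failure mode the abstract questions.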

Citations: 0

Exploring Multilingual Intent Dynamics and Applications

International Joint Conference on Artificial Intelligence

Pub Date : 2023-08-01 DOI: 10.24963/ijcai.2023/818

Ankan Mullick

Multilingual intent detection and the exploration of its different characteristics have been a major field of study in the last few years. However, detecting intent dynamics from text or voice, especially in Indian multilingual contexts, is a challenging task. My first research question is therefore on intent detection, after which I work on its application in the Indian multilingual healthcare scenario. Speech dialogue systems are designed around a pre-defined set of intents to perform user-specified tasks. Newer intents may surface over time that call for retraining. However, the newer intents may not be explicitly announced and need to be inferred dynamically. Hence, there are two crucial jobs: (a) recognizing newly emergent intents; and (b) annotating the data of the new intents in order to effectively retrain the underlying classifier. The tasks become especially challenging when a large number of new intents emerge simultaneously and there is a limited budget for manual annotation. We develop MNID (Multiple Novel Intent Detection), a cluster-based framework that can identify multiple novel intents while optimizing human annotation cost. Empirical findings on numerous benchmark datasets (of varying sizes) show that MNID surpasses the baseline approaches in terms of accuracy and F1-score by wisely allocating the budget for annotation. We apply the intent detection approach to different domains in Indian multilingual scenarios, such as healthcare and finance. The creation of advanced NLU healthcare systems is threatened by the lack of data and technology constraints for resource-poor languages in developing nations like India. We evaluate the current state of several cutting-edge language models used in healthcare with the goal of detecting query intents and corresponding entities. We conduct comprehensive trials on a number of models in different realistic contexts, and we investigate the practical relevance depending on budget and the availability of data on English.

Citations: 0

A False Sense of Security (Extended Abstract)

International Joint Conference on Artificial Intelligence

Pub Date : 2023-08-01 DOI: 10.24963/ijcai.2023/770

P. Bonatti

The growing literature on confidentiality in knowledge representation and reasoning may sometimes cause a false sense of security, due to a lack of details about the attacker and some misconceptions about security-related concepts. This paper analyzes the vulnerabilities of some recent knowledge protection methods to increase awareness of their actual effectiveness and their mutual differences.

Citations: 0

Sub-Band Based Attention for Robust Polyp Segmentation

International Joint Conference on Artificial Intelligence

Pub Date : 2023-08-01 DOI: 10.24963/ijcai.2023/82

Xianyong Fang, Yu Shi, Qingqing Guo, Linbo Wang, Zhengyi Liu

This article proposes a novel spectral-domain solution to the challenging problem of polyp segmentation. The main contribution builds on an interesting finding: the middle-frequency sub-band is significantly present during the CNN process. Consequently, a Sub-Band based Attention (SBA) module is proposed, which uniformly adopts either the high or middle sub-bands of the encoder features to boost the decoder features and thus improve feature discrimination. A strong encoder that supplies informative sub-bands is also very important, and we place high value on CNN features enriched with local and global information. Therefore, a Transformer Attended Convolution (TAC) module is introduced as the main encoder block. It uses Transformer features to boost the CNN features with stronger long-range object contexts. The combination of SBA and TAC leads to a novel polyp segmentation framework, SBA-Net. It adopts TAC to effectively obtain encoded features, which are also input to SBA, so that efficient sub-band based attention maps can be generated for progressively decoding the bottleneck features. Consequently, SBA-Net achieves robust polyp segmentation, as the experimental results demonstrate.

Citations: 1

Spike Count Maximization for Neuromorphic Vision Recognition

International Joint Conference on Artificial Intelligence

Pub Date : 2023-08-01 DOI: 10.24963/ijcai.2023/473

Jianxiong Tang, Jianhuang Lai, Xiaohua Xie, Lingxiao Yang

Spiking Neural Networks (SNNs) are promising models for neuromorphic vision recognition. The mean square error (MSE) and cross-entropy (CE) losses are widely applied to supervise the training of SNNs on neuromorphic datasets. However, the relationship between the output spike counts and the predictions is not well modeled by the existing loss functions. This paper proposes a Spike Count Maximization (SCM) training approach for SNN-based neuromorphic vision recognition models, based on optimizing the output spike counts. The SCM is achieved by structural risk minimization (SRM) and a specially designed spike counting loss. The spike counting loss counts the output spikes of the SNN by using the L0-norm, and the SRM maximizes the distance between the margin boundaries of the classifier to ensure the generalization of the model. The SCM problem is non-smooth and non-differentiable, so we design a two-stage algorithm with fast convergence to solve it. Experimental results demonstrate that the SCM performs satisfactorily in most cases. Using the output spikes for prediction, the accuracies of SCM are 2.12% to 16.50% higher than those of the popular training losses on the CIFAR10-DVS dataset. The code is available at https://github.com/TJXTT/SCM-SNN.
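To make the role of spike counts concrete, the following sketch counts output spikes with the L0-norm (the number of non-zero entries in a spike train) and predicts the class whose output neuron fired most often. It is only an illustration of the counting and prediction step, not the SCM training procedure or its two-stage solver; the function names, the margin helper, and the example spike trains are hypothetical.

```python
def spike_counts(spike_trains):
    """L0-norm of each output neuron's spike train.

    spike_trains[c] is the sequence of outputs of class neuron c over
    the simulation time steps; any non-zero entry counts as one spike.
    """
    return [sum(1 for s in train if s != 0) for train in spike_trains]


def predict(spike_trains):
    """Predict the class whose output neuron emitted the most spikes."""
    counts = spike_counts(spike_trains)
    return max(range(len(counts)), key=counts.__getitem__)


def count_margin(spike_trains, target):
    """Spike-count gap between the target class and its best competitor."""
    counts = spike_counts(spike_trains)
    best_other = max(c for i, c in enumerate(counts) if i != target)
    return counts[target] - best_other


# Hypothetical output of a 3-class SNN over 4 time steps.
trains = [
    [0, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
predicted_class = predict(trains)
margin = count_margin(trains, target=1)
```

A training objective that widens this kind of gap in favor of the correct class is one way to read the abstract's combination of a spike counting loss with margin maximization, although the paper's exact formulation is not given here.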

Citations: 0

Label Enhancement via Joint Implicit Representation Clustering

International Joint Conference on Artificial Intelligence

Pub Date : 2023-08-01 DOI: 10.24963/ijcai.2023/447

Yunan Lu, Weiwei Li, Xiuyi Jia

Label distribution is an effective label form for portraying label polysemy (i.e., cases in which an instance can be described by multiple labels simultaneously). However, the high cost of annotating label distributions limits their application to a wider range of practical tasks. Therefore, label enhancement (LE) techniques have been extensively studied to solve this problem. Existing LE algorithms mostly estimate label distributions from the instance relation or the label relation. However, they suffer from biased instance relations, limited model capabilities, or suboptimal local label correlations. In this paper, we therefore propose a deep generative model called JRC to simultaneously learn and cluster the joint implicit representations of both features and labels, which can be used to improve any existing LE algorithm involving the instance relation or local label correlations. In addition, we develop a novel label distribution recovery module and integrate it with the JRC model, thus constituting a novel generative label enhancement model that utilizes the learned joint implicit representations and instance clusters in a principled way. Finally, extensive experiments validate our proposal.


Citations: 0
