AI Communications: Latest Articles

Decision-making under uncertainty for multi-robot systems

IF 0.8, Zone 4 (Computer Science), Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

AI Communications

Pub Date : 2022-09-01 DOI: 10.3233/aic-220118

Bruno Lacerda, Anna Gautier, Alex Rutherford, A. Stephens, Charlie Street, N. Hawes

In this overview paper, we present the work of the Goal-Oriented Long-Lived Systems Lab on multi-robot systems. We address multi-robot systems from a decision-making under uncertainty perspective, proposing approaches that explicitly reason about the inherent uncertainty of action execution, and how such stochasticity affects multi-robot coordination. To develop effective decision-making approaches, we take a special focus on (i) temporal uncertainty, in particular of action execution; (ii) the ability to provide rich guarantees of performance, both at a local (robot) level and at a global (team) level; and (iii) scaling up to systems with real-world impact. We summarise several pieces of work and highlight how they address the challenges above, and also hint at future research directions.

Citations: 1

Person re-identification based on multi-scale global feature and weight-driven part feature

IF 0.8, Zone 4 (Computer Science), Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

AI Communications

Pub Date : 2022-08-24 DOI: 10.3233/aic-210258

Qingwei Tang, Pu Yan, Jie Chen, Hui Shao, Fuyu Wang, G. Wang

Person re-identification (ReID) is a crucial task in identifying pedestrians of interest across multiple surveillance camera views. Recent ReID methods have shown that global features or part features of the pedestrian are extremely effective, but many models lack a design that makes more reasonable use of both. A new model is proposed to use global features more rationally and extract more fine-grained part features. Specifically, our model captures global features with a multi-scale attention global feature extraction module, and we design a new context-based adaptive part feature extraction module that considers the continuity between different body parts of pedestrians. In addition, we add enhancement modules to the model to improve its performance. Experiments show that our model achieves competitive results on the Market1501, DukeMTMC-reID, and MSMT17 datasets. The ablation experiments demonstrate the effectiveness of each module of our model. The code of our model is available at: https://github.com/davidtqw/Person-Re-Identification.

Citations: 0

Cross-form efficient attention pyramidal network for semantic image segmentation

IF 0.8, Zone 4 (Computer Science), Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

AI Communications

Pub Date : 2022-08-09 DOI: 10.3233/aic-210266

Anamika Maurya, S. Chand

Although convolutional neural networks (CNNs) are leading the way in semantic segmentation, standard methods still have some flaws. First, their features are redundant and their feature representations are not discriminative enough. Second, the number of effective multi-scale features is limited. In this paper, we aim to address these limitations with a proposed network that uses two effective pre-trained models as an encoder. We develop a cross-form attention pyramid that acquires semantically rich multi-scale information from local and global priors. A spatial-wise attention module is introduced to further improve the segmentation results. It highlights the more discriminative regions of low-level features to focus on significant location information. We demonstrate the efficacy of the proposed network on three datasets: IDD Lite, PASCAL VOC 2012, and CamVid. Our model achieves an mIoU score of 70.7% on IDD Lite, 83.98% on PASCAL VOC 2012, and 73.8% on CamVid.
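
For readers unfamiliar with the mIoU figures quoted in this abstract, the following is a minimal sketch of how mean intersection-over-union is commonly computed from per-pixel class labels. The function name `mean_iou` and the toy labels are illustrative assumptions, not taken from the paper, and individual benchmarks may treat ignored or absent classes differently.

```python
from collections import Counter


def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in pred or target.

    pred and target are equal-length sequences of integer class labels
    (for an image, the flattened per-pixel labels).
    """
    inter = Counter()         # pixels where prediction and label agree, per class
    pred_count = Counter()    # pixels predicted as each class
    target_count = Counter()  # pixels labelled as each class
    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            inter[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - inter[c]
        if union == 0:
            continue  # class absent from both; skipped here (an assumption)
        ious.append(inter[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical toy labels for three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(round(mean_iou(pred, target, 3), 4))  # 0.5
```

Here the per-class IoUs are 1/3, 2/3, and 1/2, so their mean is 0.5.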

Citations: 1

Deep Reinforcement Learning for Multi-Agent Interaction

IF 0.8, Zone 4 (Computer Science), Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

AI Communications

Pub Date : 2022-08-02 DOI: 10.48550/arXiv.2208.01769

I. Ahmed, Cillian Brewitt, Ignacio Carlucho, Filippos Christianos, Mhairi Dunion, Elliot Fosong, Samuel Garcin, Shangmin Guo, Balint Gyevnar, Trevor A. McInroe, Georgios Papoudakis, Arrasy Rahman, Lukas Schafer, Massimiliano Tamborski, Giuseppe Vecchio, Cheng Wang, Stefano V. Albrecht

The development of autonomous agents which can interact with other agents to accomplish a given task is a core area of research in artificial intelligence and machine learning. Towards this goal, the Autonomous Agents Research Group develops novel machine learning algorithms for autonomous systems control, with a specific focus on deep reinforcement learning and multi-agent reinforcement learning. Research problems include scalable learning of coordinated agent policies and inter-agent communication; reasoning about the behaviours, goals, and composition of other agents from limited observations; and sample-efficient learning based on intrinsic motivation, curriculum learning, causal inference, and representation learning. This article provides a broad overview of the ongoing research portfolio of the group and discusses open problems for future directions.

Citations: 6

Perspectives on the System-level Design of a Safe Autonomous Driving Stack

IF 0.8, Zone 4 (Computer Science), Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

AI Communications

Pub Date : 2022-07-29 DOI: 10.48550/arXiv.2208.00096

Majd Hawasly, Jonathan Sadeghi, Morris Antonello, Stefano V. Albrecht, John Redford, S. Ramamoorthy

Achieving safe and robust autonomy is the key bottleneck on the path towards broader adoption of autonomous vehicle technology. This motivates going beyond extrinsic metrics such as miles between disengagements, and calls for approaches that embody safety by design. In this paper, we address some aspects of this challenge, with emphasis on issues of motion planning and prediction. We do this by describing novel approaches taken to solve selected sub-problems within an autonomous driving stack, in the process introducing the design philosophy being adopted within Five. This includes safe-by-design planning, interpretable as well as verifiable prediction, and modelling of perception errors to enable effective sim-to-real and real-to-sim transfer within the testing pipeline of a realistic autonomous system.

Citations: 1

An argumentative approach for handling inconsistency in prioritized Datalog± ontologies

IF 0.8, Zone 4 (Computer Science), Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

AI Communications

Pub Date : 2022-07-18 DOI: 10.3233/aic-220087

Loan Ho, S. Arch-int, Erman Acar, S. Schlobach, N. Arch-int

Prioritized Datalog± is a well-studied formalism for modelling ontological knowledge and data, and has been applied successfully in the (Semantic) Web and in other domains. Since the information content on the Web is both inherently context-dependent and frequently updated, logical inconsistency is often inevitable. This has led the research community to develop various types of inconsistency-tolerant semantics over the last few decades. Although query answering under inconsistency-tolerant semantics is well understood, the problem of explaining query answering under such semantics has received considerably less attention, especially when the facts are prioritized. In this paper, we aim to fill this gap. More specifically, we use Dung’s abstract argumentation framework to address the problem of explaining inconsistency-tolerant query answering in Datalog± knowledge bases (KBs) where facts are prioritized, or preordered. We clarify the relationship between preferred repair semantics and various notions of extensions for argumentation frameworks. The strength of such an argumentation-based approach is explainability: users can more easily understand why different points of view conflict and why a query answer is entailed (or not) under different semantics. To this end, we introduce the formal notion of a dialogical explanation and show how it can be used to explain why query results hold or do not hold under the known semantics in inconsistent Datalog± knowledge bases.
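
To make the notion of an extension in Dung's abstract argumentation framework concrete, the sketch below computes the grounded extension of a small framework by iterating the characteristic function to its least fixed point. It illustrates the general framework only: the argument names and attack relation are hypothetical, and it does not reproduce the paper's construction of arguments from a prioritized Datalog± knowledge base.

```python
def grounded_extension(arguments, attacks):
    """Grounded extension of an abstract argumentation framework.

    arguments: iterable of hashable argument names.
    attacks:   iterable of (attacker, attacked) pairs.
    """
    arguments = set(arguments)
    attackers = {a: set() for a in arguments}
    for src, dst in attacks:
        attackers[dst].add(src)

    def defended(arg, accepted):
        # arg is acceptable w.r.t. `accepted` if each of its attackers
        # is itself attacked by some argument in `accepted`.
        return all(attackers[b] & accepted for b in attackers[arg])

    extension = set()
    while True:
        nxt = {a for a in arguments if defended(a, extension)}
        if nxt == extension:
            return extension
        extension = nxt


# Hypothetical framework: a attacks b, and b attacks c.
print(sorted(grounded_extension({"a", "b", "c"}, [("a", "b"), ("b", "c")])))
# ['a', 'c']
```

In this toy framework `a` is unattacked, so it is accepted; `a` then defends `c` against `b`. Two arguments that only attack each other would yield an empty grounded extension, which is the kind of conflict an explanation has to surface to the user.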

Pages: 243-267

Citations: 1

Improved YOLOv3 detection method for PCB plug-in solder joint defects based on ordered probability density weighting and attention mechanism

IF 0.8, Zone 4 (Computer Science), Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

AI Communications

Pub Date : 2022-07-15 DOI: 10.3233/aic-210245

Zheng Wang, Wenbin Chen, Taifu Li, Shaolin Zhang, Rui Xiong

The printed circuit board (PCB) is the core component of electronic products, and detecting its defects is a basic requirement of PCB quality control in production. Traditional visual detection methods rely on hand-designed features, so their detection accuracy is poor and their rates of missed and false detections are high. To solve these problems, this paper proposes an improved YOLOv3 (You Only Look Once) detection method for PCB plug-in solder joint defects that combines ordered probability density weighting with attention mechanisms. First, to obtain higher-priority prior boxes, the ordered probability density weighting (OWA) method is used to optimize the multiple sets of prior boxes generated by K-means. Then, to obtain more effective feature information, the Squeeze-and-Excitation (SE) mechanism is added to the backbone network. In the feature detection network, the Convolutional Block Attention Module (CBAM) is added, and the three feature layers at the output of the detection network are fused. Finally, to accelerate convergence and improve accuracy, the network loss function is improved using generalized intersection over union (GIoU), and a model pre-trained on COCO is applied to PCB solder joint defect training through transfer learning. In testing, the average detection accuracy of the improved network rises from 84.35% to 96.69%, and the improved network converges better than the original. The study shows that the improved YOLOv3-based method is more suitable for industrial PCB plug-in solder joint defect detection.
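
The GIoU term mentioned in this abstract can be sketched for a single pair of axis-aligned boxes as follows. This is a minimal illustration of the standard GIoU definition (IoU minus the fraction of the smallest enclosing box not covered by the union), not the paper's training code; the function names and the `(x1, y1, x2, y2)` box format are assumptions.

```python
def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2), x1 < x2, y1 < y2."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Overlap rectangle (zero width or height if the boxes are disjoint).
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = area_a + area_b - inter

    # Smallest axis-aligned box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))

    iou = inter / union
    return iou - (enclose - union) / enclose


def giou_loss(box_a, box_b):
    """Loss in [0, 2]: 0 for identical boxes, larger as boxes move apart."""
    return 1.0 - giou(box_a, box_b)


print(giou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # 0.0
```

Unlike plain IoU, the value keeps changing when the boxes do not overlap (for two unit boxes one unit apart it is -1/3), which gives a regression loss something to descend on in that case.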

Pages: 171-186

Citations: 3

Highlights of AI Research in Europe

IF 0.8, Zone 4 (Computer Science), Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

AI Communications

Pub Date : 2022-07-14 DOI: 10.3233/aic-229002

S. Schockaert, R. Peñaloza

AI Communications is the official partner journal of the European Association for Artificial Intelligence (EurAI), as reflected among others in its subtitle: the European Journal on Artificial Intelligence. EurAI is a society of societies; that is, the members of EurAI are national AI societies from all across Europe. To strengthen the connection between AI Communications and EurAI, in July 2021 we invited each of these EurAI member societies to nominate one paper, reflecting the best research from within their society during the preceding year. Each society was free to select the criteria for their nominations. Overall, the call led to expressions of interest from 10 societies, 7 of which resulted in a submission for this special issue. All the submissions went through a fast-tracked version of the normal peer review process, which led to the acceptance of the 5 papers in this special issue. These papers, and their nominating societies, are: these works of research in

Citations: 0

Channel attention and multi-scale graph neural networks for skeleton-based action recognition

IF 0.8, Zone 4 (Computer Science), Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

AI Communications

Pub Date : 2022-07-13 DOI: 10.3233/aic-210250

Ronghao Dang, Chengju Liu, Meilin Liu, Qi Chen

3D skeleton data has been widely used in action recognition, as skeleton-based methods perform well in complex dynamic environments. The rise of spatio-temporal graph convolutions has drawn much attention to using graph convolution to extract spatial and temporal features together in skeleton-based action recognition. However, because spatial and temporal features differ greatly in focus, it is difficult to improve the efficiency of extracting spatio-temporal features. In this paper, we propose a channel attention and multi-scale neural network (CA-MSN) for skeleton-based action recognition with a series of spatio-temporal extraction modules. We exploit the relationship of body joints hierarchically through two modules: a spatial module, which uses a residual GCN with a channel attention block to extract high-level spatial features, and a temporal module, which uses a multi-scale TCN to extract temporal features at different scales. We perform extensive experiments on both the NTU-RGBD60 and NTU-RGBD120 datasets to verify the effectiveness of our network. The comparison results show that our method achieves state-of-the-art performance at a competitive computing speed. To test the application of our CA-MSN model, we design a multi-task tandem network consisting of 2D pose estimation, 2D-to-3D pose regression, and a skeleton action recognition model, and demonstrate end-to-end (RGB video to action type) recognition. The code is available at https://github.com/Rh-Dang/CA-MSN-action-recognition.git.
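
The abstract does not say which form of channel attention CA-MSN uses. As an illustration only, the sketch below shows one common form, a squeeze-and-excitation style gate, written in plain Python. The function name, the weight matrices `w1` and `w2`, and the toy feature values are all hypothetical, and the linked repository should be consulted for the authors' actual block.

```python
import math


def channel_attention(features, w1, w2):
    """Squeeze-and-excitation style channel gating (illustrative sketch).

    features: list of C channels, each a flat list of activations.
    w1:       R x C weights of the reduction layer (hypothetical).
    w2:       C x R weights of the expansion layer (hypothetical).
    Returns the features with each channel scaled by its gate in (0, 1).
    """
    # Squeeze: global average of every channel.
    squeezed = [sum(ch) / len(ch) for ch in features]
    # Excitation: bottleneck layer with ReLU, then sigmoid gates per channel.
    hidden = [max(0.0, sum(w * z for w, z in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: reweight each channel by its gate.
    return [[g * v for v in ch] for g, ch in zip(gates, features)]


# With all-zero weights every gate is sigmoid(0) = 0.5, so values are halved.
print(channel_attention([[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0]], [[0.0], [0.0]]))
# [[0.5, 1.0], [1.5, 2.0]]
```

In a trained network the two weight matrices are learned, so channels whose pooled response is informative receive gates near 1 and the rest are suppressed.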

Pages: 187-205

Citations: 2

Multiple refinement and integration network for Salient Object Detection

IF 0.8, Zone 4 (Computer Science), Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

AI Communications

Pub Date : 2022-05-10 DOI: 10.3233/aic-210273

Chao Dai, Chen Pan, W. He, Hanqi Sun

The purpose of the salient object detection (SOD) task is to suppress background noise and segment the salient foreground regions. Some previous methods considered strategies of background suppression and multi-level feature fusion. Other methods ran into the problem that single-scale convolutional features struggle to capture the correct object size. This paper reconsiders these problems and proposes a comprehensive SOD solution that improves detection performance while keeping the parameter count relatively low. First, a single refinement operation rarely achieves a good refinement effect. To this end, a multi-scale denoising module (MSDM) and a multi-pooling refinement module (MPRM) are proposed to jointly refine multi-level features. Second, a single feature integration operation rarely integrates complementary features fully. A mutual learning module (MLM) is proposed to preliminarily integrate multi-level features, and a multi-attention (MA) mechanism is used to assist further integration while reducing information redundancy. The proposed algorithm is called multiple refinement and integration network (MRINet). Experimental results on five benchmark datasets show that MRINet outperforms state-of-the-art methods on multiple evaluation metrics. Moreover, our ResNet-based algorithm contains only 25.202 million parameters, fewer than other ResNet-based algorithms, and can run at over 37 fps on a single GPU. The code will be available at https://github.com/dc3234/MRINet.

Pages: 31-44

Citations: 1
