Detection of Features Affording a Certain Action via Analysis of CNN
Yusuke Nakata, Yuki Kitazato, S. Arai
Pub Date: 2018-07-01 | DOI: 10.1109/AGENTS.2018.8460062
Ideal products intuitively suggest their proper usage to users; a usage perceived by a user is called an affordance. We aim to identify the product features that induce the affordance of a specific action, and we propose a method that identifies these affordance features without requiring expert domain knowledge. Given a dataset of product images paired with the affordances perceived by the products' users, the proposed method proceeds in three steps. First, we train a convolutional neural network (CNN) to predict a product's affordance. Second, by analyzing the trained CNN, we enumerate candidate affordance features. Third, we verify and evaluate the candidates using three metrics. Taking the affordance of “sit” as an example, our experiment shows that the proposed method successfully identifies affordance features.
{"title":"Detection of Features Affording a Certain Action via Analysis of CNN","authors":"Yusuke Nakata, Yuki Kitazato, S. Arai","doi":"10.1109/AGENTS.2018.8460062","DOIUrl":"https://doi.org/10.1109/AGENTS.2018.8460062","url":null,"abstract":"Ideal products offer proper usages to users intuitively, and a usage perceived by a user is called an affordance. We aim to identify product features that induce the affordance of a specific action. We propose a method that identifies those affordance features without the need of an expert's knowledge of a domain. Using a dataset of a product's image and an affordance perceived by the product's user, the proposed method identifies those affordance features. The proposed method consists of three steps. First, we train a convolutional neural network (CNN) to predict a product's affordance. Second, according to the analysis of a trained CNN, we enumerate candidates for affordance features. Third, we use three metrics to verify and evaluate the candidates for features. By taking an affordance of “sit” as an example, our experiment showed that the proposed method does successfully identify affordance features.","PeriodicalId":248901,"journal":{"name":"2018 IEEE International Conference on Agents (ICA)","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123008407","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Personalized Recommendation Considering Secondary Implicit Feedback
Siyuan Liu, Qiong Wu, C. Miao
Pub Date: 2018-07-01 | DOI: 10.1109/AGENTS.2018.8460053
In e-commerce, recommendation is an essential feature that offers users potentially interesting items to purchase. However, people often face the unpleasant situation where the recommended items are simply similar to what they have already purchased. One main reason is that existing e-commerce recommender systems mainly exploit primary implicit feedback (i.e., purchase history). Little attention has been paid to secondary implicit feedback (e.g., viewing items, adding items to the shopping cart, or adding items to a favorites list), which captures potential interests that may not be reflected in purchase history. We therefore propose a personalized recommendation approach that combines primary and secondary implicit feedback to generate the recommendation list, optimized with respect to a Bayesian objective criterion for personalized ranking. Experiments on a large-scale real-world e-commerce dataset show that the proposed approach outperforms state-of-the-art baselines.
{"title":"Personalized Recommendation Considering Secondary Implicit Feedback","authors":"Siyuan Liu, Qiong Wu, C. Miao","doi":"10.1109/AGENTS.2018.8460053","DOIUrl":"https://doi.org/10.1109/AGENTS.2018.8460053","url":null,"abstract":"In e-commerce, recommendation is an essential feature to provide users with potentially interesting items to purchase. However, people are often faced with an unpleasant situation, where the recommended items are simply the ones similar to what they have purchased previously. One of the main reasons is that existing recommender systems in e-commerce mainly utilize primary implicit feedback (i.e., purchase history) for recommendation. Little attention has been paid to secondary implicit feedback (e.g., viewing items, adding items to shopping cart, adding items to favorite list, etc), which captures users' potential interests that may not be reflected in their purchase history. We therefore propose a personalized recommendation approach to combine the primary and secondary implicit feedback to generate the recommendation list, which is optimized towards a Bayesian objective criterion for personalized ranking. Experiments with a large-scale real-world e-commerce dataset show that the proposed approach presents a superior performance in comparison with the state-of-the-art baselines.","PeriodicalId":248901,"journal":{"name":"2018 IEEE International Conference on Agents (ICA)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127982260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cloning Strategies from Trading Records using Agent-based Reinforcement Learning Algorithm
Chiao-Ting Chen, An-Pin Chen, Szu-Hao Huang
Pub Date: 2018-07-01 | DOI: 10.1109/AGENTS.2018.8460078
Investment decision making is considered a series of complicated processes that are difficult to analyze and imitate. Even given large volumes of trading records embodying rich expert knowledge in the financial domain, extracting the underlying decision logic and cloning the trading strategies is quite challenging. In this paper, an agent-based reinforcement learning (RL) system is proposed to mimic professional trading strategies. The continuous Markov decision process (MDP) in RL closely matches trading decision making over financial time series data. With RL components designed specifically for financial applications, including states, actions, and rewards, a policy gradient method can successfully imitate an expert's strategies. To improve the convergence of the RL agent in such a highly dynamic environment, a model pre-trained with supervised learning is transferred to the deep policy networks. Experimental results show that the proposed system reproduces around eighty percent of trading decisions in both the training and testing stages. Discussing the trade-off between exploration and model updating, this paper fine-tunes the system parameters to obtain reasonable results. Finally, an advanced strategy is proposed that dynamically adjusts the number of explorations in each episode to achieve better results.
{"title":"Cloning Strategies from Trading Records using Agent-based Reinforcement Learning Algorithm","authors":"Chiao-Ting Chen, An-Pin Chen, Szu-Hao Huang","doi":"10.1109/AGENTS.2018.8460078","DOIUrl":"https://doi.org/10.1109/AGENTS.2018.8460078","url":null,"abstract":"Investment decision making is considered as a series of complicated processes, which are difficult to be analyzed and imitated. Given large amounts of trading records with rich expert knowledge in financial domain, extracting its original decision logics and cloning the trading strategies are also quite challenging. In this paper, an agent-based reinforcement learning (RL) system is proposed to mimic professional trading strategies. The concept of continuous Markov decision process (MDP) in RL is similar to the trading decision making in financial time series data. With the specific-designed RL components, including states, actions, and rewards for financial applications, policy gradient method can successfully imitate the expert's strategies. In order to improve the convergence of RL agent in such highly dynamic environment, a pre-trained model based on supervised learning is transferred to the deep policy networks. The experimental results show that the proposed system can reproduce around eighty percent trading decisions both in training and testing stages. With the discussion of the tradeoff between explorations and model updating, this paper tried to fine-tuning the system parameters to get reasonable results. Finally, an advanced strategy is proposed to dynamically adjust the number of explorations in each episode to achieve better results.","PeriodicalId":248901,"journal":{"name":"2018 IEEE International Conference on Agents (ICA)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124166859","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Probabilistic Guided Exploration for Reinforcement Learning in Self-Organizing Neural Networks
Peng Wang, W. Zhou, Di Wang, A. Tan
Pub Date: 2018-07-01 | DOI: 10.1109/AGENTS.2018.8460067
Exploration is essential in reinforcement learning: it expands the search space of potential solutions to a given problem for performance evaluation. A carefully designed exploration strategy can help the agent learn faster by taking advantage of what it has learned previously. However, many reinforcement learning mechanisms still adopt simple exploration strategies that select actions purely at random among all feasible actions. In this paper, we propose novel mechanisms that improve the existing knowledge-based exploration strategy with a probabilistic guided approach to action selection. Extensive experiments in a Minefield navigation simulator show that the proposed probabilistic guided exploration approach significantly improves the convergence rate.
{"title":"Probabilistic Guided Exploration for Reinforcement Learning in Self-Organizing Neural Networks","authors":"Peng Wang, W. Zhou, Di Wang, A. Tan","doi":"10.1109/AGENTS.2018.8460067","DOIUrl":"https://doi.org/10.1109/AGENTS.2018.8460067","url":null,"abstract":"Exploration is essential in reinforcement learning, which expands the search space of potential solutions to a given problem for performance evaluations. Specifically, carefully designed exploration strategy may help the agent learn faster by taking the advantage of what it has learned previously. However, many reinforcement learning mechanisms still adopt simple exploration strategies, which select actions in a pure random manner among all the feasible actions. In this paper, we propose novel mechanisms to improve the existing knowledge-based exploration strategy based on a probabilistic guided approach to select actions. We conduct extensive experiments in a Minefield navigation simulator and the results show that our proposed probabilistic guided exploration approach significantly improves the convergence rate.","PeriodicalId":248901,"journal":{"name":"2018 IEEE International Conference on Agents (ICA)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124373901","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Metric for Evaluating Negotiation Process in Automated Negotiation
Xun Tang, Takayuki Ito
Pub Date: 2018-07-01 | DOI: 10.1109/AGENTS.2018.8460127
Given the growing interest in automated negotiation, the search for effective strategies has produced a variety of negotiation agents [8]. The Automated Negotiating Agents Competition (ANAC), held annually since 2010, is an international competition that challenges researchers to design agents able to operate effectively in many kinds of scenarios. In this competition, negotiation agents are analyzed in terms of utility, social welfare, distance to the Nash solution, distance to the Pareto frontier, and so on. Most of these analyses are based on negotiation outcomes; however, the negotiation process itself affects those outcomes. To reach an agreement, agents analyze the bids their opponents have proposed and use a threshold function to decide how to compromise, which determines the negotiation result. In this paper, we describe a metric called “negotiating efficiency” for evaluating the negotiation process, and we explain how the process can affect the result.
{"title":"Metric for Evaluating Negotiation Process in Automated Negotiation","authors":"Xun Tang, Takayuki Ito","doi":"10.1109/AGENTS.2018.8460127","DOIUrl":"https://doi.org/10.1109/AGENTS.2018.8460127","url":null,"abstract":"Given the growing interest in automated negotiation, the search for effective strategies has produced a variety of different negotiation agents[8]. The Automated Negotiating Agents Competition(ANAC) has been annually held since 2010. The ANAC is an international competition that challenges researchers to design the agents that are able to operate effectively in different kinds of scenarios. In this competition, researchers analyze the negotiation agents from the aspects of utility, social welfare, distance to Nash solution, distance to Pareto and so on. Most of the analyses are based on negotiation results. Actually, the negotiation process can affect the negotiation results. To reach an agreement, agents analyze bids opponents have proposed and use a threshold function to decide how to compromise, this will determine the negotiation results. In this paper, we will describe a method called “negotiating efficiency” to evaluate the negotiation process. We will also explain how the process can affect the result.","PeriodicalId":248901,"journal":{"name":"2018 IEEE International Conference on Agents (ICA)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123433826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Autonomous Agents in Snake Game via Deep Reinforcement Learning
Zhepei Wei, D. Wang, M. Zhang, A. Tan, C. Miao, You Zhou
Pub Date: 2018-07-01 | DOI: 10.1109/AGENTS.2018.8460004
Since DeepMind pioneered a deep reinforcement learning (DRL) model to play Atari games, DRL has become a commonly adopted method for enabling agents to learn complex control policies in various video games. However, such approaches may still need improvement when applied to more challenging scenarios where reward signals are sparse and delayed. In this paper, we develop a refined DRL model that enables an autonomous agent to play the classic Snake Game, whose constraints become stricter as the game progresses. Specifically, we employ a convolutional neural network (CNN) trained with a variant of Q-learning. Moreover, we propose a carefully designed reward mechanism to train the network properly, adopt a training-gap strategy to temporarily bypass training after the target's location changes, and introduce a dual experience replay method that categorizes experiences for better training efficacy. Experimental results show that our agent outperforms the baseline model and surpasses human-level performance in playing the Snake Game.
{"title":"Autonomous Agents in Snake Game via Deep Reinforcement Learning","authors":"Zhepei Wei, D. Wang, M. Zhang, A. Tan, C. Miao, You Zhou","doi":"10.1109/AGENTS.2018.8460004","DOIUrl":"https://doi.org/10.1109/AGENTS.2018.8460004","url":null,"abstract":"Since DeepMind pioneered a deep reinforcement learning (DRL) model to play the Atari games, DRL has become a commonly adopted method to enable the agents to learn complex control policies in various video games. However, similar approaches may still need to be improved when applied to more challenging scenarios, where reward signals are sparse and delayed. In this paper, we develop a refined DRL model to enable our autonomous agent to play the classical Snake Game, whose constraint gets stricter as the game progresses. Specifically, we employ a convolutional neural network (CNN) trained with a variant of Q-learning. Moreover, we propose a carefully designed reward mechanism to properly train the network, adopt a training gap strategy to temporarily bypass training after the location of the target changes, and introduce a dual experience replay method to categorize different experiences for better training efficacy. The experimental results show that our agent outperforms the baseline model and surpasses human-level performance in terms of playing the Snake Game.","PeriodicalId":248901,"journal":{"name":"2018 IEEE International Conference on Agents (ICA)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122117192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Proceedings: 2018 IEEE International Conference on Agents (ICA)
Pub Date: 2018-07-01 | DOI: 10.1109/agents.2018.8459969
Toward a Misrepresentation Game with Ambiguous Preferences
Masahiro Nishi, Naoki Fukuta
Pub Date: 2018-07-01 | DOI: 10.1109/AGENTS.2018.8459932
In this paper, we present an analysis of a Misrepresentation Game with ambiguous preferences. A Misrepresentation Game is a game in which an agent can sometimes obtain higher utility than under truth-telling in a preference-elicitation-based fair division negotiation by misrepresenting its preferences, while the misrepresentation remains difficult for the counterpart to notice. We investigate whether automated mechanism design can generate mechanisms for fair negotiation that remove the incentive to misrepresent.
{"title":"Toward a Misrepresentation Game with Ambiguous Preferences","authors":"Masahiro Nishi, Naoki Fukuta","doi":"10.1109/AGENTS.2018.8459932","DOIUrl":"https://doi.org/10.1109/AGENTS.2018.8459932","url":null,"abstract":"In this paper, we show an analysis on a Misrepresentation Game with ambiguous preferences. A Misrepresentation Game is a game that sometimes an agent obtains higher utility than truth-telling on a preference-elicitation based fair division negotiation by misrepresenting their preferences while it is still difficult to be noticed by the counterpart. We investigate whether we can generate mechanisms for fair negotiations which avoids incentives to make misrepresentations by using a way of automated design of mechanisms.","PeriodicalId":248901,"journal":{"name":"2018 IEEE International Conference on Agents (ICA)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125097663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}