
2020 Second International Conference on Transdisciplinary AI (TransAI): Latest Papers

Artificial Intelligence Aided Training in Ping Pong Sport Education
Pub Date : 2020-09-01 DOI: 10.1109/TransAI49837.2020.00012
Kevin Ma
Recently, artificial intelligence has made huge strides in sports analysis. This paper applies the technology to table tennis with a real-time machine learning system that lets individual players train independently. The system lets players keep the benefits of training with a coach without the coach being physically present, which also helps them practice social distancing under present circumstances. Our system uses SensorTile development hardware and embedded workbench software to collect real-time sensor data from a variety of MEMS sensors, such as accelerometers, gyroscopes, and magnetometers. The mounted SensorTile system can therefore detect the motion and orientation of the table tennis racket. We used machine learning (ML) methods to perform real-time table tennis stroke classification, producing accurate classification results. With the proposed system, players have an effective training machine that can tell them whether their strokes are accurate. It also reduces private coaching time, limiting unnecessary exposure while still allowing players to receive feedback to improve their game.
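The abstract does not name the ML method or the features used, so the following is only a minimal sketch of windowed stroke classification. It assumes six inertial channels per sample (three accelerometer, three gyroscope), per-channel mean and standard deviation as features, and a nearest-centroid classifier as a stand-in for whatever model the paper trains. The `synthetic_stroke` helper and the "forehand"/"backhand" labels are hypothetical and replace real SensorTile recordings.

```python
import math
import random
from statistics import mean, pstdev

def window_features(window):
    """Summarize one window of (ax, ay, az, gx, gy, gz) samples as per-channel mean and std."""
    channels = list(zip(*window))
    return [mean(c) for c in channels] + [pstdev(c) for c in channels]

def fit_centroids(windows, labels):
    """Average the feature vectors of each stroke class."""
    by_label = {}
    for window, label in zip(windows, labels):
        by_label.setdefault(label, []).append(window_features(window))
    return {label: [mean(col) for col in zip(*feats)] for label, feats in by_label.items()}

def classify(window, centroids):
    """Return the label whose centroid is closest in feature space."""
    features = window_features(window)
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))

def synthetic_stroke(kind, rng, n=50):
    """Hypothetical stand-in for real sensor data: a class-dependent offset plus noise."""
    offset = {"forehand": 1.0, "backhand": -1.0}[kind]
    return [tuple(offset + rng.gauss(0, 0.2) for _ in range(6)) for _ in range(n)]

rng = random.Random(0)
labels = ["forehand", "backhand"] * 5
train = [synthetic_stroke(kind, rng) for kind in labels]
centroids = fit_centroids(train, labels)
prediction = classify(synthetic_stroke("backhand", rng), centroids)
```

A real deployment would segment the sensor stream into stroke windows first and would likely need richer features; the synthetic classes here are separable by construction.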
Citations: 2
Online local communities with motifs
Pub Date : 2020-09-01 DOI: 10.1109/TransAI49837.2020.00014
Mrudula Murali, Katerina Potika, C. Pollett
A community in a network is a set of nodes that are densely and closely connected within the set, yet sparsely connected to nodes outside of it. Detecting communities in large networks helps solve many real-world problems. However, detecting such communities in a complex network by focusing on the whole network is costly. Instead, one can focus on finding overlapping communities starting from one or more seed nodes of interest. Moreover, in the online setting the network is given as a stream of higher-order structures, i.e., triangles of nodes to be clustered into communities. In this paper, we propose an online local graph community detection algorithm that uses motifs, such as triangles of nodes. We provide experimental results and compare it to another algorithm named COEUS. We use two public datasets, one of Amazon data and the other of DBLP data. Furthermore, we create and experiment on a new dataset that consists of web pages and their links by using the Internet Archive. This latter dataset provides insights into how working with motifs differs from working with edges.
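To make the online, seed-based setting concrete, here is a small sketch of growing a local community from a stream of triangles in a single pass. It is not the authors' algorithm and not COEUS; the admission rule (a triangle joins if it contains a seed or shares at least `min_shared` nodes with the current community) and the `min_shared` parameter are assumptions chosen for illustration.

```python
def local_community(triangle_stream, seeds, min_shared=2):
    """Grow a community around the seed nodes from a one-pass stream of triangles."""
    seeds = set(seeds)
    community = set(seeds)
    for triangle in triangle_stream:
        nodes = set(triangle)
        # Admit the triangle if it touches a seed or overlaps the community on an edge.
        if nodes & seeds or len(nodes & community) >= min_shared:
            community |= nodes
    return community

# Toy stream: triangles arrive one at a time and are never revisited.
stream = [(1, 2, 3), (2, 3, 4), (4, 5, 6), (3, 4, 7), (8, 9, 10)]
community = local_community(stream, seeds={1})
```

Because each triangle is seen only once, the result depends on arrival order; the triangle (4, 5, 6) is skipped here because it shares only one node with the community when it arrives.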
Citations: 2
Artifact Detection in Endoscopic Video with Deep Convolutional Neural Networks
Pub Date : 2020-09-01 DOI: 10.1109/TransAI49837.2020.00007
Chenxi Zhang, Ning Zhang, Dechun Wang, Yu Cao, Benyuan Liu
Gastrointestinal cancer is a common and deadly disease that affects many people in the world. In 2019, gastrointestinal cancer was the most common cancer and the second leading cause of death in the US. Detecting gastrointestinal cancer at an early stage is the most effective way to improve the survival rate. One of the commonly used clinical procedures for early detection of gastrointestinal cancer is endoscopy. The main challenge of a high-quality endoscopy operation is the presence of various forms of artifacts during the operation, e.g., pixel saturation, motion blur, defocus, specular reflections, bubbles, fluid, and debris. These artifacts not only increase the difficulty of examining the underlying tissues during diagnosis but also affect the post-analysis methods required for follow-ups (e.g., video mosaicking for follow-ups and archival purposes and video-frame retrieval for reporting). The presence of these artifacts also often interferes with the computer-aided diagnosis of various lesions in endoscopy. Convolutional Neural Network (CNN) based object detection methods have proved to be an effective approach for natural image object detection and colonoscopy applications (e.g., polyp detection). However, fewer efforts have been devoted to endoscopic artifact detection due to the lack of training data. In this paper, we use data from the EAD2019 challenge and investigate the performance of two improved CNN-based methods for seven-class endoscopic artifact detection (EAD). Experimental results show that our proposed object detectors based on SSD and Faster-RCNN significantly outperform the baseline.
Citations: 3
Graph Theory–The Case for Investigating Corruption and Modern Slavery through Suspicious Employment Data
Pub Date : 2020-09-01 DOI: 10.1109/TransAI49837.2020.00025
Felicity Gerry, Joseph R. Barr, Peter Shaw
This poster uses the mathematics of networks in the novel context of corporate reporting of slavery in supply chains as a method to meet corporate obligations to respect human rights. For corporations weighing risks such as liability for slavery in supply chains, graph theory, which is capable of sampling affinity in databases, can ‘value add’ to due diligence by scoring identity and veracity.
Citations: 1
Skeleton-Based Detection of Abnormalities in Human Actions Using Graph Convolutional Networks
Pub Date : 2020-09-01 DOI: 10.1109/TransAI49837.2020.00030
Bruce X. B. Yu, Yan Liu, Keith C. C. Chan
Human action abnormality detection has been attempted with various sensors for application domains like rehabilitation, healthcare, and assisted living. Since the release of motion sensors that ease retrieval of the human body skeleton, skeleton-based human action recognition has been an active topic in the area of artificial intelligence. Unlike human action recognition, human action abnormality detection is an emerging field that aims to detect incorrect executions within the same action class. Graph convolutional networks have been widely adopted for human action recognition. However, to the best of our knowledge, whether they can be effective for the task of human action abnormality detection has not been investigated. To advance prior work in the emerging field of human action abnormality detection, we propose a novel method that uses a graph convolutional network to detect abnormal actions in skeleton data. To validate the effectiveness of our proposed method, we conduct extensive experiments on a public dataset called UI-PRMD. Based on the experimental results, our proposed method achieved superior action abnormality detection performance compared with existing deep learning methods.
Citations: 8
Group-specific models of healthcare workers’ well-being using iterative participant clustering
Pub Date : 2020-09-01 DOI: 10.1109/TransAI49837.2020.00026
Vinesh Ravuri, Projna Paromita, Karel Mundnich, Amrutha Nadarajan, Brandon M. Booth, Shrikanth S. Narayanan, Theodora Chaspari
Healthcare workers often experience stress and burnout due to demanding job responsibilities and long work hours. Ambulatory monitoring devices, such as wearable and environmental sensors, combined with machine learning algorithms can afford us a better understanding of the naturalistic onset and evolution of stress and emotional reactivity in real life, with valuable implications for behavioral interventions. However, the typically large degree of inter-subject variability, due to individual differences in responses and behaviors, makes it difficult for machine learning models to robustly learn behavioral signal patterns and adequately generalize to unseen individuals. In this study, we design group-specific models of well-being (i.e., stress, sleep, positive affect, negative affect) and contextual outcomes (i.e., type of activity) based on real-life multimodal longitudinal data collected in situ from healthcare workers in a hospital environment. Group-specific models are constructed by learning an initial model based on all individuals and subsequently refining the model for a specific group of participants. Participants are originally grouped based on the feature space constructed by the multimodal data, while the original grouping is iteratively refined using the learned multimodal representations of the group-specific models. The results from this study indicate that in the majority of cases the proposed group-specific models, learned through iterative participant clustering, outperform the baseline systems, which involve general models learned based on all participants, as well as group-specific models without iterative participant clustering. This study provides promising results for predicting psychological and behavioral factors that affect the well-being of healthcare workers and lays the foundation toward ambulatory real-life assessment and interventions.
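The loop described in the abstract (general model, then group models, then regrouping) can be sketched on toy data. This is a loose illustration, not the authors' method: each model is a one-parameter least-squares slope, and participants are reassigned to the group whose model gives them the lowest squared error, which stands in for the paper's regrouping on learned multimodal representations. All names and data below are hypothetical.

```python
def fit_slope(samples):
    """Least-squares slope through the origin for a list of (x, y) pairs."""
    num = sum(x * y for x, y in samples)
    den = sum(x * x for x, _ in samples)
    return num / den if den else 0.0

def sq_error(slope, samples):
    """Sum of squared residuals of y = slope * x on the given samples."""
    return sum((y - slope * x) ** 2 for x, y in samples)

def iterative_group_models(data, initial_groups, rounds=5):
    """data: {participant: [(x, y), ...]}; initial_groups: {participant: group_id}."""
    groups = dict(initial_groups)
    group_ids = sorted(set(groups.values()))
    # Start every group from the general model fitted on all participants.
    general = fit_slope([s for samples in data.values() for s in samples])
    slopes = {g: general for g in group_ids}
    for _ in range(rounds):
        # Refine each group model on the data of its current members.
        for g in group_ids:
            pooled = [s for p, samples in data.items() if groups[p] == g for s in samples]
            if pooled:
                slopes[g] = fit_slope(pooled)
        # Reassign each participant to the best-fitting group model.
        new_groups = {p: min(group_ids, key=lambda g: sq_error(slopes[g], data[p])) for p in data}
        if new_groups == groups:
            break
        groups = new_groups
    return groups, slopes

xs = [1, 2, 3]
data = {
    "a": [(x, 2 * x) for x in xs],
    "b": [(x, 2 * x) for x in xs],
    "c": [(x, -x) for x in xs],
    "d": [(x, -x) for x in xs],
}
# Deliberately imperfect initial grouping: "c" starts in the wrong group.
groups, slopes = iterative_group_models(data, {"a": 0, "b": 0, "c": 0, "d": 1})
```

In this toy run the misplaced participant moves to the other group after one round, after which the grouping is stable and each group model recovers its members' slope.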
Citations: 2
Modeling Survival in model-based Reinforcement Learning
Pub Date : 2020-04-18 DOI: 10.1109/TransAI49837.2020.00009
Saeed Moazami, P. Doerschuk
Although recent model-free reinforcement learning algorithms have been shown to be capable of mastering complicated decision-making tasks, the sample complexity of these methods has remained a hurdle to utilizing them in many real-world applications. In this regard, model-based reinforcement learning proposes some remedies. Yet, inherently, model-based methods are more computationally expensive and susceptible to sub-optimality. One reason is that model-generated data are always less accurate than real data, and this often leads to inaccurate transition and reward function models. With the aim to mitigate this problem, this work presents the notion of survival by discussing cases in which the agent’s goal is to survive and its analogy to maximizing the expected rewards. To that end, a substitute model for the reward function approximator is introduced that learns to avoid terminal states rather than to maximize accumulated rewards from safe states. Focusing on terminal states, as a small fraction of state-space, reduces the training effort drastically. Next, a model-based reinforcement learning method is proposed (Survive) to train an agent to avoid dangerous states through a safety map model built upon temporal credit assignment in the vicinity of terminal states. Finally, the performance of the presented algorithm is investigated, along with a comparison between the proposed and current methods.
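One way to picture a safety map built from credit assignment near terminal states is a danger score that decays with distance from the terminal state along observed episodes. The sketch below is an assumption-laden illustration, not the paper's Survive method: the `decay` and `horizon` parameters, the max-based update, and the one-step greedy action choice over a known transition `model` are all hypothetical.

```python
def build_safety_map(episodes, decay=0.5, horizon=3):
    """Score states near the end of failed episodes; the terminal state scores 1.0."""
    danger = {}
    for states in episodes:  # each episode is a state sequence ending in a terminal state
        recent = states[-(horizon + 1):]
        for steps_before_end, state in enumerate(reversed(recent)):
            score = decay ** steps_before_end
            danger[state] = max(danger.get(state, 0.0), score)
    return danger

def safest_action(state, actions, model, danger):
    """Pick the action whose predicted next state has the lowest danger score."""
    return min(actions, key=lambda a: danger.get(model(state, a), 0.0))

# Toy corridor: integer positions, with position 5 as the terminal (failure) state.
episodes = [[2, 3, 4, 5], [1, 2, 3, 4, 5]]
danger = build_safety_map(episodes)
step = lambda position, action: position + action  # assumed known transition model
action = safest_action(3, actions=[-1, 1], model=step, danger=danger)
```

Only states within `horizon` steps of a terminal state receive a score, which mirrors the abstract's point that terminal states and their vicinity are a small fraction of the state space.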
Citations: 0