Value Iteration on Multicore Processors

2020 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT) Pub Date : 2020-12-09 DOI:10.1109/ISSPIT51521.2020.9408773

Anuj K. Jain, S. Sahni


Citations: 0

Abstract

Value Iteration (VI) is a powerful, though time-consuming, approach to solving reinforcement learning problems modeled as Markov Decision Processes (MDPs). In this paper, we explore strategies for running the state-of-the-art cache-efficient algorithm for VI developed by us [1], [2] on a multicore processor. On popular benchmark data, we demonstrate a speedup of up to 2.59 on a 10-core multiprocessor using 20 threads. The speedup for the parallelized portion of the computation is up to 5.89.
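For readers unfamiliar with VI, the following is a minimal sketch of a generic synchronous value iteration whose per-state Bellman backups are split across worker threads. It is not the cache-efficient algorithm of [1], [2], whose details the abstract does not give. The names `backup`, `value_iteration`, `transitions`, `rewards`, and the toy two-state MDP are all hypothetical and chosen only for illustration. In standard CPython the global interpreter lock keeps these threads from running the arithmetic concurrently, so the sketch shows the structure of the parallel sweep rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def backup(state, values, transitions, rewards, gamma):
    """Bellman optimality backup for one state; returns its new value."""
    best = float("-inf")
    for action, outcomes in transitions[state].items():
        # outcomes is a list of (probability, next_state) pairs.
        expected = sum(p * values[nxt] for p, nxt in outcomes)
        best = max(best, rewards[state][action] + gamma * expected)
    return best


def sweep_chunk(chunk, values, transitions, rewards, gamma):
    """Back up every state in one chunk, reading only the old values."""
    return [(s, backup(s, values, transitions, rewards, gamma)) for s in chunk]


def value_iteration(transitions, rewards, gamma=0.9, tol=1e-8, workers=4):
    """Synchronous VI: each sweep reads old values and writes a new list."""
    n = len(transitions)
    values = [0.0] * n
    # Assumed partitioning: states are dealt round-robin to the workers.
    chunks = [range(i, n, workers) for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            old = values
            new_values = old[:]
            results = pool.map(
                lambda c: sweep_chunk(c, old, transitions, rewards, gamma),
                chunks,
            )
            for result in results:
                for s, v in result:
                    new_values[s] = v
            # Largest change in any state's value during this sweep.
            delta = max(abs(a - b) for a, b in zip(new_values, old))
            values = new_values
            if delta < tol:
                return values


# Hypothetical toy MDP: state 0 can stay (reward 0) or go to state 1
# (reward 1); state 1 can only stay (reward 2).
toy_transitions = [
    {"stay": [(1.0, 0)], "go": [(1.0, 1)]},
    {"stay": [(1.0, 1)]},
]
toy_rewards = [
    {"stay": 0.0, "go": 1.0},
    {"stay": 2.0},
]
optimal = value_iteration(toy_transitions, toy_rewards, gamma=0.5)
print(optimal)  # converges to approximately [3.0, 4.0]
```

Because every backup in a sweep reads only the previous sweep's values, the chunks are independent and need no locking. The sequential steps that remain (merging results and checking convergence) are the kind of work that can keep an overall speedup below the speedup of the parallelized portion.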
