{"title":"Value Iteration on Multicore Processors","authors":"Anuj K. Jain, S. Sahni","doi":"10.1109/ISSPIT51521.2020.9408773","journal":{"name":"2020 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT)","volume":"513 1","pages":"0"},"publicationDate":"2020-12-09","publicationTypes":"Journal Article","isOpenAccess":false,"citationCount":"0","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/ISSPIT51521.2020.9408773"}
Value Iteration (VI) is a powerful, though time-consuming, approach to solving reinforcement learning problems modeled as Markov Decision Processes (MDPs). In this paper, we explore strategies for running the state-of-the-art cache-efficient VI algorithm that we developed [1], [2] on a multicore processor. On popular benchmark data, we demonstrate an overall speedup of up to 2.59 on a 10-core multiprocessor using 20 threads. The speedup for the parallelized portion of the computation is up to 5.89.
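
For readers unfamiliar with VI, the following is a minimal sketch of the textbook synchronous form of the algorithm. It is not the cache-efficient algorithm of [1], [2], nor the multicore strategy studied in the paper. The MDP encoding (`transitions[s][a]` as a list of `(probability, next_state, reward)` triples), the function name `value_iteration`, and the parameters `gamma`, `tol`, and `max_sweeps` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical MDP encoding (an assumption, not taken from the paper):
# transitions[s][a] is a list of (probability, next_state, reward) triples.
Transitions = Dict[int, Dict[int, List[Tuple[float, int, float]]]]


def value_iteration(
    transitions: Transitions,
    gamma: float = 0.9,
    tol: float = 1e-8,
    max_sweeps: int = 10_000,
) -> Dict[int, float]:
    """Textbook synchronous value iteration over a finite MDP."""
    values = {s: 0.0 for s in transitions}
    for _ in range(max_sweeps):
        new_values: Dict[int, float] = {}
        delta = 0.0
        for s, actions in transitions.items():
            if actions:
                # Bellman optimality backup: best expected one-step return,
                # evaluated against the values from the previous sweep.
                best = max(
                    sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)
                    for outcomes in actions.values()
                )
            else:
                # A state with no actions is treated as terminal.
                best = 0.0
            new_values[s] = best
            delta = max(delta, abs(best - values[s]))
        values = new_values
        if delta < tol:
            break
    return values


# Toy two-state MDP, invented for illustration.
toy_mdp: Transitions = {
    0: {
        0: [(1.0, 1, 0.0)],  # move to state 1, no reward
        1: [(1.0, 0, 0.5)],  # stay in state 0, reward 0.5
    },
    1: {
        0: [(1.0, 1, 1.0)],  # stay in state 1, reward 1
    },
}

toy_values = value_iteration(toy_mdp)
print(toy_values)  # state 0 is close to 9.0, state 1 is close to 10.0
```

In this synchronous form, every backup within a sweep reads only the values from the previous sweep, so the per-state backups are independent of one another. That independence is what makes the sweep a natural candidate for splitting across threads, although the sketch itself runs on a single thread.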