Heterogeneous Computing for Markov Models in Big Data
M. Malita, G. Popescu, G. Stefan
2019 International Conference on Computational Science and Computational Intelligence (CSCI), December 2019. DOI: 10.1109/CSCI49370.2019.00279
Many Big Data problems, including those related to Markov Models, are solved using heterogeneous systems: a host plus a parallel programmable accelerator. Current solutions for the accelerator part, for example a GPU used as a GPGPU, provide limited acceleration due to architectural constraints. This paper introduces a programmable parallel accelerator able to perform efficient vector and matrix operations, avoiding the limitations of current systems built from off-the-shelf components. Our main result is an architecture whose actual performance is a much higher fraction of its peak performance than that of established accelerators. The performance improvements come from two features: the addition of a reduction network at the output of a linear array of cells, and an appropriate use of a serial register distributed along the same linear array. Thus, for an n-state Markov Model, instead of a solution with size in O(n²) and acceleration in O(n²/log n), we offer an accelerator with size in O(n) and acceleration in O(n).
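To make the workload concrete, the following sketch (not from the paper; names and the NumPy implementation are illustrative) shows the core kernel such an accelerator must speed up for an n-state Markov Model: one update of the state distribution, pi' = pi @ P, a dense vector-matrix product with O(n²) multiply-accumulate operations.

```python
import numpy as np

def markov_step(pi, P):
    """One Markov Model update of the state distribution: pi' = pi @ P.

    pi is a length-n probability vector; P is an n x n row-stochastic
    transition matrix. The product costs O(n^2) multiply-accumulates,
    which is the work the paper's linear array of cells parallelizes.
    """
    return pi @ P

# Small illustrative model with n = 4 states.
n = 4
rng = np.random.default_rng(0)
P = rng.random((n, n))
P /= P.sum(axis=1, keepdims=True)   # normalize rows: a stochastic matrix
pi = np.full(n, 1.0 / n)            # uniform initial distribution

pi_next = markov_step(pi, P)
# A valid update keeps pi a probability distribution.
assert abs(pi_next.sum() - 1.0) < 1e-12
```

Each of the n output entries is an independent dot product of pi with one column of P, which is why a linear array of n cells with a reduction network at its output can deliver O(n) acceleration on this kernel.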