Simultaneous Solving of Batched Linear Programs on a GPU
Authors: Amit Gurung, Rajarshi Ray
Published in: Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering
DOI: 10.1145/3297663.3310308 (https://doi.org/10.1145/3297663.3310308)
Publication date (as recorded in the listing): 2018-02-21
Citations: 6
Abstract
Linear programs (LPs) appear in a large number of applications. Offloading LP solving to a GPU is a viable way to accelerate an application. Existing work on solving an LP on a GPU shows that performance improves only for large LPs (typically 500 constraints and 500 variables or more). This paper is motivated by applications that must solve many small LPs, which existing GPU techniques fail to accelerate. We propose a batched LP solver in CUDA to accelerate such applications and demonstrate its utility in a use case: state-space exploration of models of control systems design. We also compare the performance of the batched LP solver against sequential solving on a CPU using the open-source solver GLPK (GNU Linear Programming Kit) and the CPLEX solver from IBM. On selected LP benchmarks from the Netlib repository, the evaluation shows a maximum speed-up of 95x and 5x with respect to the CPLEX and GLPK solvers, respectively, for a batch of 1e5 LPs.