TC-QR: Tensor Core-based QR Solver for Efficient GPU-based Vector Fitting
V. Kukutla, Ramachandra Achar, Wai Kong Lee
2023 IEEE 27th Workshop on Signal and Power Integrity (SPI), published 2023-05-07
DOI: 10.1109/SPI57109.2023.10145528 (https://doi.org/10.1109/SPI57109.2023.10145528)
Abstract
Vector Fitting (VF) is widely used for system identification via rational function approximation from tabulated data of high-speed modules. Since the algorithm is iterative in nature, minimizing its computational cost and maximizing its parallel efficiency on mixed CPU and GPU environments is critical to reducing the overall time needed for convergence. In this paper, a novel Tensor Core-based QR decomposition method is introduced to provide significant speedups to the most computationally expensive steps in the VF process: QR factorization and the solution of a set of linear equations. The method exploits GPU platforms with Tensor Core architectures.
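To make the two steps named in the abstract concrete, the following minimal sketch solves a small least-squares problem by QR factorization followed by a solve with the triangular factor. It is not the authors' TC-QR method: it runs on the CPU in double precision with NumPy, and the function name `qr_least_squares`, the poles, the residues, and the frequency grid are all hypothetical values chosen only for illustration.

```python
import numpy as np

def qr_least_squares(A, b):
    """Solve min ||A x - b||_2 via thin QR: A = Q R, then R x = Q^H b."""
    Q, R = np.linalg.qr(A, mode="reduced")
    # R is upper triangular; a general-purpose solver is used here for brevity.
    return np.linalg.solve(R, Q.conj().T @ b)

# Hypothetical VF-like system (assumed values, not from the paper):
# partial-fraction basis 1 / (s - p_k) sampled at s = j * omega.
omega = np.linspace(1.0, 10.0, 50)
s = 1j * omega
poles = np.array([-1.0, -2.0, -5.0])
residues_true = np.array([2.0, -1.0, 0.5])

A = 1.0 / (s[:, None] - poles[None, :])   # 50 x 3 complex basis matrix
b = A @ residues_true                      # synthetic "tabulated" response

residues_fit = qr_least_squares(A, b)
print(np.allclose(residues_fit, residues_true))
```

In an actual VF iteration, a system of this kind is rebuilt and solved each time the poles are relocated, which is why the cost of the QR step dominates the overall run time.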