{"title":"GPU optimization techniques to accelerate optiGAN-a particle simulation GAN.","authors":"Anirudh Srikanth, Carlotta Trigila, Emilie Roncali","doi":"10.1088/2632-2153/ad51c9","DOIUrl":null,"url":null,"abstract":"<p><p>The demand for specialized hardware to train AI models has increased in tandem with the increase in the model complexity over the recent years. Graphics processing unit (GPU) is one such hardware that is capable of parallelizing operations performed on a large chunk of data. Companies like Nvidia, AMD, and Google have been constantly scaling-up the hardware performance as fast as they can. Nevertheless, there is still a gap between the required processing power and processing capacity of the hardware. To increase the hardware utilization, the software has to be optimized too. In this paper, we present some general GPU optimization techniques we used to efficiently train the optiGAN model, a Generative Adversarial Network that is capable of generating multidimensional probability distributions of optical photons at the photodetector face in radiation detectors, on an 8GB Nvidia Quadro RTX 4000 GPU. We analyze and compare the performances of all the optimizations based on the execution time and the memory consumed using the Nvidia Nsight Systems profiler tool. The optimizations gave approximately a 4.5x increase in the runtime performance when compared to a naive training on the GPU, without compromising the model performance. Finally we discuss optiGANs future work and how we are planning to scale the model on GPUs.</p>","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"5 2","pages":"027001"},"PeriodicalIF":6.3000,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11170465/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Learning Science and Technology","FirstCategoryId":"101","ListUrlMain":"https://doi.org/10.1088/2632-2153/ad51c9","RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/6/13 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
The demand for specialized hardware to train AI models has grown in tandem with model complexity in recent years. The graphics processing unit (GPU) is one such piece of hardware, capable of parallelizing operations performed on large chunks of data. Companies such as Nvidia, AMD, and Google have been scaling up hardware performance as fast as they can; nevertheless, a gap remains between the processing power required and the processing capacity of the hardware. To increase hardware utilization, the software must be optimized as well. In this paper, we present general GPU optimization techniques that we used to efficiently train the optiGAN model, a generative adversarial network capable of generating multidimensional probability distributions of optical photons at the photodetector face in radiation detectors, on an 8 GB Nvidia Quadro RTX 4000 GPU. We analyze and compare the performance of all the optimizations in terms of execution time and memory consumption using the Nvidia Nsight Systems profiler. The optimizations yielded approximately a 4.5x improvement in runtime performance compared to naive training on the GPU, without compromising model performance. Finally, we discuss future work on optiGAN and how we plan to scale the model on GPUs.
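The abstract does not spell out which optimizations were applied or which framework was used, so the following is only an illustrative sketch, not the authors' code. Assuming a PyTorch training loop on a CUDA-enabled build, it shows two widely used GPU optimizations (automatic mixed precision and pinned-memory, non-blocking host-to-device copies) wrapped in NVTX ranges so that the corresponding regions appear in an Nvidia Nsight Systems timeline. The model, dataset, and hyperparameters are placeholders.

```python
# Hypothetical sketch (not the paper's code): a PyTorch training step annotated with
# NVTX ranges for Nsight Systems, using mixed precision and non-blocking copies.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Toy stand-in for a generator-style network; optiGAN's real architecture is not shown here.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 6)).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)
scaler = torch.cuda.amp.GradScaler(enabled=device.type == "cuda")
loss_fn = nn.MSELoss()

# Pinned host memory plus worker processes let host-to-device copies overlap with compute.
dataset = TensorDataset(torch.randn(4096, 128), torch.randn(4096, 6))
loader = DataLoader(dataset, batch_size=512, pin_memory=True, num_workers=2)

for epoch in range(2):
    torch.cuda.nvtx.range_push(f"epoch_{epoch}")   # shows up as a named range in Nsight Systems
    for x, y in loader:
        torch.cuda.nvtx.range_push("h2d_copy")
        x = x.to(device, non_blocking=True)        # asynchronous copy from pinned memory
        y = y.to(device, non_blocking=True)
        torch.cuda.nvtx.range_pop()

        torch.cuda.nvtx.range_push("fwd_bwd_step")
        optimizer.zero_grad(set_to_none=True)
        with torch.cuda.amp.autocast(enabled=device.type == "cuda"):
            loss = loss_fn(model(x), y)            # mixed-precision forward pass
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()
        torch.cuda.nvtx.range_pop()
    torch.cuda.nvtx.range_pop()
```

With annotations like these in place, a run can be captured with, for example, `nsys profile -t cuda,nvtx -o report python train.py`; the NVTX ranges then delimit the epochs, host-to-device copies, and training steps in the resulting timeline, which is how per-optimization execution time and memory use can be compared.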
Journal description:
Machine Learning: Science and Technology is a multidisciplinary open access journal that bridges the application of machine learning across the sciences with advances in machine learning methods and theory as motivated by physical insights. Specifically, articles must fall into one of the following categories: they must either advance the state of machine-learning-driven applications in the sciences, or make conceptual, methodological, or theoretical advances in machine learning with applications to, inspiration from, or motivation by scientific problems.