An Empirical Study on Performance Bugs in Deep Learning Frameworks

Tarek Makkouk, Dong Jae Kim, T. Chen
DOI: 10.1109/ICSME55016.2022.00012 (https://doi.org/10.1109/ICSME55016.2022.00012)
Published in: 2022 IEEE International Conference on Software Maintenance and Evolution (ICSME)
Publication date: 2022-10-01
Citations: 2

Abstract

Machine Learning (ML) and Deep Learning (DL) applications are becoming more popular thanks to the availability of DL frameworks such as TensorFlow and PyTorch. The quality of these frameworks is therefore essential to the quality of the DL/ML applications built on them. Because DL tasks such as training are computationally expensive, performance is a critical aspect of DL frameworks. However, optimizing DL frameworks may pose unique challenges due to the peculiarities of DL (e.g., hardware integration and the nature of the computation). In this paper, we conduct an empirical study of performance bugs in DL frameworks. We study TensorFlow and PyTorch, identifying performance and non-performance bugs by mining their GitHub repositories. We find that 1) the proportion of newly reported performance bugs increases faster than that of fixed performance bugs, and the ratio of performance bugs among all bugs increases over time; 2) performance bugs take more time to fix, have larger fix sizes, and attract more community engagement (e.g., discussion) than non-performance bugs; and 3) the root causes of performance bugs fall into a taxonomy of 12 categories and 19 sub-categories, which we derived manually by studying all performance bug fixes. Finally, we present actionable implications for researchers and developers.
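The two quantitative comparisons named in the abstract (the ratio of performance bugs among all bugs over time, and time to fix for performance versus non-performance bugs) can be illustrated with a minimal sketch. This is not the authors' pipeline: the record fields (`opened`, `closed`, `is_perf`), the sample values, and both helper functions are hypothetical, and the abstract does not say how bugs were labeled or which statistics were used.

```python
from datetime import date
from statistics import median

# Hypothetical issue records. Field names and values are illustrative
# assumptions, not data from the study.
issues = [
    {"opened": date(2019, 3, 1), "closed": date(2019, 3, 11), "is_perf": False},
    {"opened": date(2019, 6, 5), "closed": date(2019, 7, 25), "is_perf": True},
    {"opened": date(2020, 1, 10), "closed": date(2020, 1, 15), "is_perf": False},
    {"opened": date(2020, 2, 1), "closed": date(2020, 4, 1), "is_perf": True},
    {"opened": date(2020, 5, 20), "closed": None, "is_perf": True},  # still open
]


def perf_ratio_by_year(records):
    """Share of performance bugs among all bugs reported in each year."""
    totals, perf = {}, {}
    for r in records:
        year = r["opened"].year
        totals[year] = totals.get(year, 0) + 1
        perf[year] = perf.get(year, 0) + (1 if r["is_perf"] else 0)
    return {y: perf[y] / totals[y] for y in sorted(totals)}


def median_days_to_fix(records, is_perf):
    """Median days from report to close, over closed bugs of one kind."""
    days = [
        (r["closed"] - r["opened"]).days
        for r in records
        if r["is_perf"] == is_perf and r["closed"] is not None
    ]
    return median(days) if days else None


print(perf_ratio_by_year(issues))
print(median_days_to_fix(issues, True), median_days_to_fix(issues, False))
```

Open issues are excluded from the time-to-fix figure because they have no close date; a real analysis would also need a documented rule for deciding which issues count as performance bugs.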