Prevalence of Code Smells in Reinforcement Learning Projects

2023 IEEE/ACM 2nd International Conference on AI Engineering – Software Engineering for AI (CAIN)
Pub Date: 2023-03-17
DOI: 10.1109/CAIN58948.2023.00013

Nicolás Cardozo, Ivana Dusparic, Christian Cabrera


Citations: 3

Abstract

Reinforcement Learning (RL) is increasingly used to learn and adapt application behavior in many domains, including large-scale and safety-critical systems such as autonomous driving. With the advent of plug-n-play RL libraries, its applicability has further increased, enabling users to integrate RL algorithms directly. We note, however, that the majority of such code is not developed by RL engineers, which may consequently lead to poor program quality, yielding bugs, suboptimal performance, and maintainability and evolution problems for RL-based projects. In this paper, we begin exploring this hypothesis for code that uses RL, analyzing projects found in the wild to assess their quality from a software engineering perspective. Our study includes 24 popular RL-based Python projects, analyzed with standard software engineering metrics. Our results, aligned with similar analyses of ML code in general, show that popular and widely reused RL repositories contain many code smells (3.95% of the code base on average), significantly affecting the projects’ maintainability. The most common code smells detected are long method and long method chain, highlighting problems in the definition and interaction of agents. The detected code smells suggest problems in the separation of responsibilities and in the appropriateness of current abstractions for defining RL algorithms.
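To make the two most frequently reported smells concrete, the following is a minimal sketch of how long methods and long method chains might be flagged in Python source using the standard `ast` module. It is not the tooling used in the study. The thresholds `MAX_METHOD_LINES` and `MAX_CHAIN_LENGTH` and the helper names `chain_links` and `find_smells` are assumptions introduced here for illustration only.

```python
import ast

# Hypothetical thresholds for illustration; the abstract does not state
# which thresholds the study used.
MAX_METHOD_LINES = 30
MAX_CHAIN_LENGTH = 3


def chain_links(node):
    """Return the Attribute/Call nodes of the dotted chain ending at `node`."""
    links = []
    while isinstance(node, (ast.Attribute, ast.Call)):
        links.append(node)
        # Step towards the start of the chain: a.b().c -> a.b() -> a.b -> a
        node = node.value if isinstance(node, ast.Attribute) else node.func
    return links


def find_smells(source):
    """Return (smell, function name or line number, size) tuples for `source`."""
    tree = ast.parse(source)
    smells = []
    seen = set()  # ids of inner chain nodes already covered by an outer chain
    # ast.walk visits a parent before its children, so the outermost call
    # of a chain is reached first and its inner calls can be skipped.
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            n_lines = node.end_lineno - node.lineno + 1  # needs Python 3.8+
            if n_lines > MAX_METHOD_LINES:
                smells.append(("long method", node.name, n_lines))
        elif isinstance(node, ast.Call) and id(node) not in seen:
            links = chain_links(node)
            seen.update(id(link) for link in links)
            # Chain length = number of dotted accesses in the expression.
            length = sum(isinstance(link, ast.Attribute) for link in links)
            if length > MAX_CHAIN_LENGTH:
                smells.append(("long method chain", node.lineno, length))
    return smells
```

Calling `find_smells` on the text of a module returns one tuple per finding. In this sketch an expression such as `agent.policy().net().layer().weights()` counts as a chain of length 4, and a function spanning more than 30 source lines counts as a long method. Both cut-offs are arbitrary and would need tuning against whichever smell definitions a real analysis adopts.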

