利用自监督学习自动检测类图气味

IF 3.1 2区计算机科学 Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING Automated Software Engineering Pub Date : 2024-03-24 DOI:10.1007/s10515-024-00429-w

Amal Alazba, Hamoud Aljamaan, Mohammad Alshayeb

{"title":"利用自监督学习自动检测类图气味","authors":"Amal Alazba, Hamoud Aljamaan, Mohammad Alshayeb","doi":"10.1007/s10515-024-00429-w","DOIUrl":null,"url":null,"abstract":"<div><p>Design smells are symptoms of poorly designed solutions that may result in several maintenance issues. While various approaches, including traditional machine learning methods, have been proposed and shown to be effective in detecting design smells, they require extensive manually labeled data, which is expensive and challenging to scale. To leverage the vast amount of data that is now accessible, unsupervised semantic feature learning, or learning without requiring manual annotation labor, is essential. The goal of this paper is to propose a design smell detection method that is based on self-supervised learning. We propose Model Representation with Transformers (MoRT) to learn the UML class diagram features by training Transformers to recognize masked keywords. We empirically show how effective the defined proxy task is at learning semantic and structural properties. We thoroughly assess MoRT using four model smells: the Blob, Functional Decomposition, Spaghetti Code, and Swiss Army Knife. Furthermore, we compare our findings with supervised learning and feature-based methods. Finally, we ran a cross-project experiment to assess the generalizability of our approach. Results show that MoRT is highly effective in detecting design smells.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 1","pages":""},"PeriodicalIF":3.1000,"publicationDate":"2024-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Automated detection of class diagram smells using self-supervised learning\",\"authors\":\"Amal Alazba, Hamoud Aljamaan, Mohammad Alshayeb\",\"doi\":\"10.1007/s10515-024-00429-w\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Design smells are symptoms of poorly designed solutions that may result in several maintenance issues. While various approaches, including traditional machine learning methods, have been proposed and shown to be effective in detecting design smells, they require extensive manually labeled data, which is expensive and challenging to scale. To leverage the vast amount of data that is now accessible, unsupervised semantic feature learning, or learning without requiring manual annotation labor, is essential. The goal of this paper is to propose a design smell detection method that is based on self-supervised learning. We propose Model Representation with Transformers (MoRT) to learn the UML class diagram features by training Transformers to recognize masked keywords. We empirically show how effective the defined proxy task is at learning semantic and structural properties. We thoroughly assess MoRT using four model smells: the Blob, Functional Decomposition, Spaghetti Code, and Swiss Army Knife. Furthermore, we compare our findings with supervised learning and feature-based methods. Finally, we ran a cross-project experiment to assess the generalizability of our approach. Results show that MoRT is highly effective in detecting design smells.</p></div>\",\"PeriodicalId\":55414,\"journal\":{\"name\":\"Automated Software Engineering\",\"volume\":\"31 1\",\"pages\":\"\"},\"PeriodicalIF\":3.1000,\"publicationDate\":\"2024-03-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Automated Software Engineering\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://link.springer.com/article/10.1007/s10515-024-00429-w\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Automated Software Engineering","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10515-024-00429-w","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}

引用次数: 0

摘要

设计气味是设计不良的解决方案的症状，可能会导致一些维护问题。虽然包括传统机器学习方法在内的各种方法已被提出，并被证明能有效检测设计气味，但这些方法需要大量人工标注的数据，成本高昂且难以扩展。为了充分利用现在可以获取的大量数据，无监督语义特征学习或无需人工标注的学习就显得至关重要。本文的目标是提出一种基于自监督学习的设计气味检测方法。我们提出了使用变形器的模型表示法（MoRT），通过训练变形器识别屏蔽关键词来学习 UML 类图特征。我们通过经验证明了所定义的代理任务在学习语义和结构属性方面的有效性。我们使用四种模型气味对 MoRT 进行了全面评估：Blob、功能分解、意大利面条代码和瑞士军刀。此外，我们还将评估结果与监督学习和基于特征的方法进行了比较。最后，我们进行了一次跨项目实验，以评估我们方法的可推广性。结果表明，MoRT 在检测设计气味方面非常有效。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

摘要图片

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Automated detection of class diagram smells using self-supervised learning

Design smells are symptoms of poorly designed solutions that may result in several maintenance issues. While various approaches, including traditional machine learning methods, have been proposed and shown to be effective in detecting design smells, they require extensive manually labeled data, which is expensive and challenging to scale. To leverage the vast amount of data that is now accessible, unsupervised semantic feature learning, or learning without requiring manual annotation labor, is essential. The goal of this paper is to propose a design smell detection method that is based on self-supervised learning. We propose Model Representation with Transformers (MoRT) to learn the UML class diagram features by training Transformers to recognize masked keywords. We empirically show how effective the defined proxy task is at learning semantic and structural properties. We thoroughly assess MoRT using four model smells: the Blob, Functional Decomposition, Spaghetti Code, and Swiss Army Knife. Furthermore, we compare our findings with supervised learning and feature-based methods. Finally, we ran a cross-project experiment to assess the generalizability of our approach. Results show that MoRT is highly effective in detecting design smells.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Automated Software Engineering 工程技术-计算机：软件工程

CiteScore

4.80

自引率

11.80%

发文量

审稿时长

>12 weeks

期刊介绍： This journal details research, tutorial papers, survey and accounts of significant industrial experience in the foundations, techniques, tools and applications of automated software engineering technology. This includes the study of techniques for constructing, understanding, adapting, and modeling software artifacts and processes. Coverage in Automated Software Engineering examines both automatic systems and collaborative systems as well as computational models of human software engineering activities. In addition, it presents knowledge representations and artificial intelligence techniques applicable to automated software engineering, and formal techniques that support or provide theoretical foundations. The journal also includes reviews of books, software, conferences and workshops.