AYAT: A Lightweight and Efficient Code Clone Detection Technique
Yasir Giani, Luo Ping, Syed Asad Shah
2022 3rd Asia Conference on Computers and Communications (ACCC), 2022-12-01
DOI: 10.1109/ACCC58361.2022.00015 (https://doi.org/10.1109/ACCC58361.2022.00015)
Code clones make software maintenance more challenging. Detecting bugs in large systems may significantly increase maintenance costs. Although several clone detection techniques have been proposed over the years, their accuracy and scalability remain active research areas. Previously, Akram et al. proposed the hybrid technique DroidCC, which encodes tokens into 128-bit MD5 fingerprints and identifies clones by matching identical hash values. Encoding tokens into MD5 hash values takes more time because of the large fingerprint size, and DroidCC's large chunk size prevents it from achieving higher accuracy. To overcome these weaknesses, we propose AYAT, a novel lightweight hybrid technique that detects clones at the fragment level. To speed up detection, we convert tokens into 32-bit polynomial values, and to improve accuracy we set the chunk size to 5 lines per chunk. We evaluated our technique on a corpus of 10,968 Java projects and 4.98 million lines of code. Compared with the well-known DroidCC technique, it is significantly faster and more efficient. Our evaluation demonstrates that precision is significantly improved despite sacrificing scalability. The AYAT code clone detection technique outscored DroidCC in every aspect.
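The core idea in the abstract (hash 5-line chunks of tokens to 32-bit polynomial values and report chunks with identical values) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not give the tokenizer, the polynomial base, any normalization step, or whether chunks overlap, so the regex lexer, the base 31, the sliding window, and the names `tokenize`, `poly_hash32`, `chunk_fingerprints`, and `find_clone_candidates` are all assumptions made for illustration.

```python
import re
from collections import defaultdict

CHUNK_LINES = 5      # chunk size stated in the abstract
BASE = 31            # ASSUMPTION: polynomial base is not given in the source
MASK = 0xFFFFFFFF    # truncate every intermediate value to 32 bits

# ASSUMPTION: a crude lexer (identifiers, integer literals, single symbols).
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokenize(line):
    """Split one source line into rough tokens."""
    return TOKEN_RE.findall(line)


def poly_hash32(tokens):
    """Polynomial hash of a token sequence, reduced to 32 bits."""
    h = 0
    for tok in tokens:
        for ch in tok:
            h = (h * BASE + ord(ch)) & MASK
        # Mix in a separator so ["ab"] and ["a", "b"] hash differently.
        h = (h * BASE + 1) & MASK
    return h


def chunk_fingerprints(source):
    """Yield (start_line, fingerprint) for each window of 5 non-blank lines.

    ASSUMPTION: windows slide by one line; the source does not say whether
    chunks overlap.
    """
    rows = [
        (number, tokenize(text))
        for number, text in enumerate(source.splitlines(), 1)
        if text.strip()
    ]
    for i in range(len(rows) - CHUNK_LINES + 1):
        window = rows[i:i + CHUNK_LINES]
        tokens = [tok for _, toks in window for tok in toks]
        yield window[0][0], poly_hash32(tokens)


def find_clone_candidates(files):
    """Group chunk locations that share a fingerprint.

    `files` maps a file name to its source text. Returns a list of groups,
    each a list of (file_name, start_line) pairs with identical fingerprints.
    """
    index = defaultdict(list)
    for name, source in files.items():
        for start, fingerprint in chunk_fingerprints(source):
            index[fingerprint].append((name, start))
    return [locations for locations in index.values() if len(locations) > 1]
```

Calling `find_clone_candidates({"A.java": source_a, "B.java": source_b})` returns the groups of chunk locations whose fingerprints match. Because a 32-bit value can collide, matches from a sketch like this are only candidates, and a real detector would likely confirm them by comparing the underlying tokens.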