From Local to Global Semantic Clone Detection
Yuan Yuan, W. Kong, Gang Hou, Yan Hu, Masahiko Watanabe, Akira Fukuda
2019 6th International Conference on Dependable Systems and Their Applications (DSA)
Publication date: 2020-01-01
DOI: 10.1109/DSA.2019.00012 (https://doi.org/10.1109/DSA.2019.00012)
Citations: 7
Abstract
Clone detection identifies similar code fragments (referred to as code clones) in software products, which helps with software optimization and maintenance. Code clone detection can be performed at the textual, lexical, syntactic, and semantic levels. Existing techniques achieve good results at the first three levels, but no significant results have been obtained for semantic clone detection. In this paper, we propose a novel semantic-level clone detection approach. We use the control flow graph (CFG) as an intermediate representation of each program method and combine the classical dynamic time warping (DTW) algorithm from the field of speech recognition with two deep neural network models, a bidirectional RNN autoencoder and a graph convolutional network (GCN), to detect semantic-level clones from local to global. We experimented with a dataset consisting of five large-scale real-world systems and a code corpus containing a large number of programming problems. The experimental results show that our approach achieves good results in detecting both local and global semantic clones.
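For readers unfamiliar with DTW, the following is a minimal sketch of the classical algorithm only, not the authors' implementation. The abstract does not say how CFG paths are encoded or how DTW is combined with the neural models, so the inputs here are assumed to be plain numeric sequences standing in for per-block features along a CFG path, and the function name `dtw_distance` and the `dist` parameter are hypothetical.

```python
from math import inf


def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Return the classical DTW alignment cost between sequences a and b.

    Assumption: a and b are numeric sequences; in a clone detector they
    might stand in for features of basic blocks along two CFG paths.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments:
            # advance in a only, advance in b only, or advance in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


if __name__ == "__main__":
    # A repeated element in the second sequence is absorbed by the warping.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
    print(dtw_distance([1, 2, 3], [2, 3, 4]))
```

Because DTW lets one element align with several elements of the other sequence, two sequences that differ only by a repeated or stretched step receive a low cost. That property is a plausible reason for applying it to code fragments of different lengths, although the abstract does not state the authors' motivation.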