{"title":"Clustering software systems to identify subsystem structures using knowledgebase","authors":"Md. Nasim Adnan, Md. Rashedul Islam, S. Hossain","doi":"10.1109/MYSEC.2011.6140714","DOIUrl":null,"url":null,"abstract":"The structure of a software system deteriorates as a result of continuous maintenance activity. For the purpose of software reengineering or reverse engineering, software engineers often get the original source code as the most updated source of information due to lack of current documentation and limited or nonexistent availability of the original designers. The application of clustering techniques to software systems aiming to discover feature-oriented and meaningful subsystems can help software engineers involved in software reengineering or reverse engineering to understand high-level features provided by those subsystems. Continuous research is going on in the recent years — addressing different issues in the software clustering problem. Our software clustering approach introduces the use of Knowledgebase, which leads to considerable improvement than the existing approaches. Similarity measurement is the key to perform successful clustering. Similarity measurement in the existing approaches has a common drawback that they do not incorporate the diversity of software systems. Our approach uses Knowledgebase which acts as a repository of information about the internal structure of the generic types of the software systems to provide guidelines on similarity measurement criteria and their respective weightages. The final clustering is done by populating automatically generated subsystems along with the known subsystems (provided by Knowledgebase). In our research, we have developed a tool named “ULAB Cluster 1.0” which implements our new clustering approach. This clustering tool has been evaluated by using a benchmark named “MoJo distance” for different well-known software systems. The experimental results show that our approach generates more appropriate subsystems than the other existing clustering approaches and outperforms others in different dimensions of software clustering quality.","PeriodicalId":137714,"journal":{"name":"2011 Malaysian Conference in Software Engineering","volume":"66 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 Malaysian Conference in Software Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MYSEC.2011.6140714","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 7
Abstract
The structure of a software system deteriorates as a result of continuous maintenance activity. For the purpose of software reengineering or reverse engineering, software engineers often get the original source code as the most updated source of information due to lack of current documentation and limited or nonexistent availability of the original designers. The application of clustering techniques to software systems aiming to discover feature-oriented and meaningful subsystems can help software engineers involved in software reengineering or reverse engineering to understand high-level features provided by those subsystems. Continuous research is going on in the recent years — addressing different issues in the software clustering problem. Our software clustering approach introduces the use of Knowledgebase, which leads to considerable improvement than the existing approaches. Similarity measurement is the key to perform successful clustering. Similarity measurement in the existing approaches has a common drawback that they do not incorporate the diversity of software systems. Our approach uses Knowledgebase which acts as a repository of information about the internal structure of the generic types of the software systems to provide guidelines on similarity measurement criteria and their respective weightages. The final clustering is done by populating automatically generated subsystems along with the known subsystems (provided by Knowledgebase). In our research, we have developed a tool named “ULAB Cluster 1.0” which implements our new clustering approach. This clustering tool has been evaluated by using a benchmark named “MoJo distance” for different well-known software systems. The experimental results show that our approach generates more appropriate subsystems than the other existing clustering approaches and outperforms others in different dimensions of software clustering quality.