Yi Ren, Jianbo Guan, Jun Ma, Yusong Tan, Qingbo Wu, Y. Ding
CLASC: A Changelog Based Automatic Code Source Classification Method for Operating System Packages
2019 26th Asia-Pacific Software Engineering Conference (APSEC), 2019-12-01
DOI: 10.1109/APSEC48747.2019.00058
https://doi.org/10.1109/APSEC48747.2019.00058
Citations: 0
Abstract
Open source is an important way in which today's software is developed. Adoption of open source software continues to accelerate because of the benefits it offers, such as improved productivity, cost savings, and faster innovation. As the complexity and size of software compositions grow, it becomes difficult to scan and track the source of code effectively, especially for software with a very large code base, such as operating systems. So far, existing work on open source components mainly focuses on mitigating potential license noncompliance, reducing security risks introduced by open source vulnerabilities, and detecting and matching open source components in code. To ensure code traceability and manageability for large-scale mixed-source operating systems, we believe it is beneficial to automatically distinguish the sources of system code at the granularity of software packages and to manage them separately. However, according to the literature, relevant work in this area is lacking. In this paper, we first classify packages into three categories by code source, from the perspective of OS developers and maintainers. We then propose CLASC, an efficient code source classification algorithm. By extracting and analyzing package information, CLASC classifies software packages into the defined categories according to their changelog information. We also design and implement KyAnalyzer, a Web-based package management and code source analysis platform. It incorporates CLASC as a component, provides automatic code source analysis services, and can manage OS packages differently according to their code source category. Experimental results demonstrate the correctness and efficiency of the Web-enabled package source classifier.
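To make the idea of changelog-based classification concrete, the following is a minimal Python sketch and not the CLASC algorithm itself, whose details are not given in the abstract. It assumes RPM-style changelog headers and uses the e-mail domain of each changelog author as the only signal. The three category labels, the function names `changelog_domains` and `classify`, and the `in_house_domain` parameter are all hypothetical and chosen purely for illustration.

```python
import re
from typing import List

# Hypothetical category labels: the abstract says there are three
# categories but does not name them.
SELF_DEVELOPED = "self-developed"
MODIFIED = "modified-open-source"
UNMODIFIED = "unmodified-open-source"

# Assumed RPM-style changelog header, for example:
# "* Tue Jan 01 2019 Jane Doe <jane@example.org> - 1.0-1"
# Group 2 captures the e-mail domain of the entry's author.
HEADER_RE = re.compile(r"^\*\s+.*?<([^<>@\s]+)@([^<>\s]+)>", re.MULTILINE)


def changelog_domains(changelog: str) -> List[str]:
    """Return the author e-mail domain of every changelog entry, in order."""
    return [m.group(2).lower() for m in HEADER_RE.finditer(changelog)]


def classify(changelog: str, in_house_domain: str) -> str:
    """Classify a package by who wrote its changelog entries.

    Assumed heuristic (not taken from the paper):
      - every entry from the in-house domain -> self-developed
      - some entries from the in-house domain -> modified open source
      - no entry from the in-house domain     -> unmodified open source
    """
    domains = changelog_domains(changelog)
    if not domains:
        raise ValueError("no changelog entries found")
    in_house = [d == in_house_domain.lower() for d in domains]
    if all(in_house):
        return SELF_DEVELOPED
    if any(in_house):
        return MODIFIED
    return UNMODIFIED
```

A package whose changelog mixes upstream and in-house authors would fall into the middle category under this heuristic. A real classifier would likely need more signals than author domains, such as package metadata or release tags.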