Title: PKVIC: Supplement Missing Software Package Information in Security Vulnerability Reports
Authors: Jinke Song, Qiang Li, Haining Wang, Jiqiang Liu
Journal: IEEE Transactions on Dependable and Secure Computing (impact factor 7.0; JCR Q1, Computer Science, Hardware & Architecture)
Publication date: 2024-07-01
DOI: 10.1109/TDSC.2023.3334762 (https://doi.org/10.1109/TDSC.2023.3334762)
Citations: 0
Abstract
Security vulnerability reports today carry commercial, vendor-centric information but often lack accurate information about open-source software packages. Open-source ecosystems rely on package managers such as Maven, NuGet, NPM, and Gem to distribute hundreds of thousands of free code packages. However, we find that vulnerability reports frequently omit the vulnerable package's information when the package comes from an open-source ecosystem. To fill this gap, we propose a framework called PKVIC (software package vulnerability information calibration), the first tool to automatically associate security vulnerability reports with the affected software packages across different open-source ecosystems. Specifically, PKVIC uses an ecosystem classifier to determine which ecosystem a vulnerability report belongs to. From the natural-language text of the reports, PKVIC extracts the entities closely related to software names in those ecosystems. To locate the affected packages efficiently and accurately among millions of packages, we propose a recursive traversal method that generates the package identifier from the ecosystem's naming scheme and the candidate named entities. We implemented a prototype of PKVIC and conducted comprehensive experiments to validate its efficacy. In particular, we ran PKVIC over 421,808 vulnerability reports from 20 well-known sources of security vulnerabilities and identified 11,279 unique vulnerability reports affecting 2,703 open-source software packages. PKVIC found accurate reference URLs for these 2,703 software packages across 6 open-source ecosystems: PyPI, Gem, NPM, Packagist, NuGet, and Maven.
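The abstract does not spell out how the recursive traversal works, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each ecosystem's naming scheme can be reduced to a separator and a maximum number of name parts, and that candidate named entities have already been extracted from a report. The `NAMING_SCHEMES` table, the function names, and the toy package index are all hypothetical.

```python
# Hypothetical, simplified naming schemes: how name parts are joined into an
# identifier and how many parts an identifier may have. Real ecosystems have
# richer rules; these values are assumptions for illustration only.
NAMING_SCHEMES = {
    "maven": {"separator": ":", "max_parts": 2},      # groupId:artifactId
    "packagist": {"separator": "/", "max_parts": 2},  # vendor/package
    "npm": {"separator": "-", "max_parts": 2},
    "pypi": {"separator": "-", "max_parts": 2},
}


def candidate_identifiers(entities, ecosystem):
    """Recursively enumerate identifiers built from ordered subsets of entities."""
    scheme = NAMING_SCHEMES[ecosystem]
    separator = scheme["separator"]
    max_parts = scheme["max_parts"]
    results = []

    def extend(prefix, remaining):
        # Every non-empty prefix is itself a candidate identifier.
        if prefix:
            results.append(separator.join(prefix))
        # Stop descending once the scheme's part limit is reached.
        if len(prefix) == max_parts:
            return
        # Try each unused entity as the next name part.
        for i, entity in enumerate(remaining):
            extend(prefix + [entity], remaining[:i] + remaining[i + 1:])

    extend([], [entity.lower() for entity in entities])
    return results


def locate_package(entities, ecosystem, known_packages):
    """Return the first generated candidate found in the package index, else None."""
    for candidate in candidate_identifiers(entities, ecosystem):
        if candidate in known_packages:
            return candidate
    return None


if __name__ == "__main__":
    # Toy index with made-up package names; a real index would hold millions.
    toy_index = {"acme/parser", "acme/router"}
    print(locate_package(["Acme", "Parser"], "packagist", toy_index))
```

Generating candidates from a handful of entities and checking each against an index keeps the lookup cost tied to the number of entities in a report rather than to the size of the ecosystem, which is one plausible reason to generate identifiers instead of scanning every package.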
About the journal:
The "IEEE Transactions on Dependable and Secure Computing (TDSC)" is a prestigious journal that publishes high-quality, peer-reviewed research in the field of computer science, specifically targeting the development of dependable and secure computing systems and networks. This journal is dedicated to exploring the fundamental principles, methodologies, and mechanisms that enable the design, modeling, and evaluation of systems that meet the required levels of reliability, security, and performance.
The scope of TDSC includes research on measurement, modeling, and simulation techniques that contribute to the understanding and improvement of system performance under various constraints. It also covers the foundations necessary for the joint evaluation, verification, and design of systems that balance performance, security, and dependability.
By publishing archival research results, TDSC aims to provide a valuable resource for researchers, engineers, and practitioners working in the areas of cybersecurity, fault tolerance, and system reliability. The journal's focus on cutting-edge research ensures that it remains at the forefront of advancements in the field, promoting the development of technologies that are critical for the functioning of modern, complex systems.