A Data-Driven Classification Framework for Cybersecurity Breaches
Priyanka Rani, Abhijit Kumar Nag, Rifat Shahriyar
IT Professional, published 2024-05-01
DOI: https://doi.org/10.1109/mitp.2024.3374096
Abstract
Unauthorized access to sensitive or confidential data results in a data breach, which can cause significant harm to an organization. Reporting breaches and reviewing prior records can help reduce damages. To aid in preparation, antivirus and security companies have published data breach reports, but they can be difficult to comprehend and require substantial effort to study. This article proposes a data breach incident classification framework using machine learning algorithms (naive Bayes, logistic regression, support vector machine, and random forest) on a dataset from the Privacy Rights Clearinghouse. The framework’s performance is evaluated using various metrics, including accuracy, F1 score, and confusion matrix. The article also employs topic modeling with latent Dirichlet allocation to enhance the classification’s accuracy.
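The evaluation metrics named in the abstract (accuracy, F1 score, and confusion matrix) can be illustrated with a minimal, standard-library-only sketch. Everything below is an assumption made for illustration: the breach-type labels, the toy predictions, and the choice of macro-averaged F1 are hypothetical and are not taken from the article, which does not state how its F1 score is averaged.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Build a matrix whose rows are true labels and columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of incidents on the diagonal (predicted label equals true label)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def macro_f1(matrix):
    """Unweighted mean of per-class F1, where F1 = 2*TP / (2*TP + FP + FN)."""
    n = len(matrix)
    scores = []
    for i in range(n):
        tp = matrix[i][i]
        fp = sum(matrix[r][i] for r in range(n)) - tp  # column total minus TP
        fn = sum(matrix[i]) - tp                       # row total minus TP
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n if n else 0.0


# Hypothetical breach-type labels and toy predictions (not data from the article).
labels = ["HACK", "DISC", "PHYS"]
y_true = ["HACK", "HACK", "DISC", "PHYS", "DISC", "HACK"]
y_pred = ["HACK", "DISC", "DISC", "PHYS", "DISC", "HACK"]

cm = confusion_matrix(y_true, y_pred, labels)
print(cm)
print(round(accuracy(cm), 3), round(macro_f1(cm), 3))
```

In this toy case one hacking incident is misclassified as a disclosure, so the error appears as a single off-diagonal entry in the confusion matrix, which shows which breach types a classifier confuses in a way that accuracy alone does not.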
IT Professional (categories: Computer Science, Information Systems; Computer Science, Software Engineering)
CiteScore: 5.00
Self-citation rate: 0.00%
Articles published: 111
Review time: >12 weeks
About the journal:
IT Professional is a technical magazine of the IEEE Computer Society. It publishes peer-reviewed articles, columns and departments written for and by IT practitioners and researchers covering:
practical aspects of emerging and leading-edge digital technologies,
original ideas and guidance for IT applications, and
novel IT solutions for the enterprise.
IT Professional’s goal is to inform the broad spectrum of IT executives, IT project managers, IT researchers, and IT application developers from industry, government, and academia.