VADER-RF: a novel scheme for protecting user privacy on android devices
Authors: Manish Verma, Parma Nand
Journal: International Journal of System Assurance Engineering and Management (Impact Factor 1.6; JCR Q2, Engineering, Multidisciplinary)
Published: 2024-08-28
DOI: https://doi.org/10.1007/s13198-024-02461-1
Citations: 0
Abstract
Android protects user privacy through its permission system and explains permission usage in privacy disclosures. Privacy disclosures often fail to predict app behavior accurately, leading to potential exploitation by malicious applications. To address this, we propose the VADER-RF technique, which combines VADER sentiment analysis with Random Forest machine learning to correlate privacy disclosures with app behavior. Our model analyzes privacy disclosure documents using sentiment analysis, extracts permissions from the AndroidManifest.xml file, and performs data flow analysis of Java files. These features were evaluated on Naive Bayes, SVM, Decision Tree and Random Forest machine learning models. The Random Forest model demonstrated superior performance, with the highest accuracy (81.6%), precision (85.3%) and recall (89.4%). Kendall's Tau correlation coefficient is 0.54, which indicates that our model is moderately to strongly effective at predicting whether an app is malicious based on the selected features. Sentiment analysis significantly enhanced all models' performance, underscoring the effectiveness of integrating sentiment analysis with traditional feature sets for advanced malware detection.
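As a rough illustration of the permission-extraction step mentioned in the abstract, the Python sketch below reads `<uses-permission>` entries from a manifest and encodes them as a binary feature vector of the kind a classifier could consume. It is not the authors' implementation: the function names, the `TRACKED_PERMISSIONS` vocabulary and the sample manifest are hypothetical, and the sketch assumes the manifest has already been decoded to plain-text XML (manifests inside a compiled APK are stored in a binary XML format). The sentiment-analysis, data-flow and classification stages are out of scope here.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

# XML namespace that Android uses for attributes such as android:name.
ANDROID_NS = "http://schemas.android.com/apk/res/android"

# Hypothetical vocabulary of permissions to track (an assumption for this
# sketch; the abstract does not list the permissions the authors used).
TRACKED_PERMISSIONS = [
    "android.permission.INTERNET",
    "android.permission.READ_CONTACTS",
    "android.permission.ACCESS_FINE_LOCATION",
    "android.permission.CAMERA",
]


def extract_permissions(manifest_xml: str) -> set[str]:
    """Return the permission names declared in <uses-permission> elements."""
    root = ET.fromstring(manifest_xml)
    name_attr = f"{{{ANDROID_NS}}}name"  # ElementTree's "{namespace}local" form
    return {
        elem.attrib[name_attr]
        for elem in root.iter("uses-permission")
        if name_attr in elem.attrib
    }


def permission_vector(permissions: set[str]) -> list[int]:
    """Encode declared permissions as a 0/1 vector over TRACKED_PERMISSIONS."""
    return [1 if p in permissions else 0 for p in TRACKED_PERMISSIONS]


# Minimal made-up manifest used only to exercise the functions above.
SAMPLE_MANIFEST = """<manifest
    xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.demo">
    <uses-permission android:name="android.permission.INTERNET" />
    <uses-permission android:name="android.permission.READ_CONTACTS" />
    <application android:label="Demo" />
</manifest>
"""

if __name__ == "__main__":
    declared = extract_permissions(SAMPLE_MANIFEST)
    print(sorted(declared))
    print(permission_vector(declared))
```

In a pipeline like the one described, a vector of this kind would presumably be concatenated with the sentiment and data-flow features before training; how the authors actually combine them is not stated in the abstract.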
About the journal:
This Journal is established with a view to catering to the increased awareness of high-quality research in the seamless integration of heterogeneous technologies to formulate bankable solutions to emergent complex engineering problems.
Assurance engineering could be thought of as relating to the provision of higher confidence in the reliable and secure implementation of a system's critical characteristic features through the espousal of a holistic approach, using a wide variety of cross-disciplinary tools and techniques. Successful realization of sustainable and dependable products, systems and services involves an extensive adoption of Reliability, Quality, Safety and Risk-related procedures for achieving high assurance levels of performance; also pivotal are the management issues related to risk and uncertainty that govern the practical constraints encountered in their deployment. It is our intention to provide a platform for the modeling and analysis of large engineering systems, among the other aforementioned allied goals of systems assurance engineering, leading to the enforcement of performance enhancement measures. Achieving a fine balance between theory and practice is the primary focus. The Journal only publishes high-quality papers that have passed the rigorous peer review procedure of an archival scientific Journal. The aim is an increasing number of submissions, wide circulation and a high impact factor.