Yin Wang;Ming Fan;Junfeng Liu;Junjie Tao;Wuxia Jin;Haijun Wang;Qi Xiong;Ting Liu
Do as You Say: Consistency Detection of Data Practice in Program Code and Privacy Policy in Mini-App
IEEE Transactions on Software Engineering, vol. 50, no. 12, pp. 3225-3248. Published 2024-10-14. DOI: 10.1109/TSE.2024.3479288
https://ieeexplore.ieee.org/document/10715677/
Journal impact factor: 5.6 (JCR Q1, Computer Science, Software Engineering)
Citations: 0
Abstract
Mini-apps are an emerging form of mobile application that combines web technology with native capabilities. Because they require no download or installation, they have rapidly become popular. However, privacy issues that violate laws and regulations are proliferating in the swiftly expanding mini-app ecosystem. It is therefore crucial to ensure that the data practices in a mini-app's program code are consistent with those described in its privacy policy, yet no prior work has systematically investigated this problem for mini-apps. Doing so poses two main challenges. First, the mini-app is a novel application form, and there is a lack of code analysis tools for sensitive information that can accurately identify data practices from its code. Second, previous consistency studies have suffered from granularity issues in both data types and consistency patterns. This paper introduces MiniDetector, a novel approach for identifying consistency issues in mini-apps. MiniDetector employs data flow analysis to pinpoint data practices within the program code and uses a two-stage prompt engineering process to extract data practices from privacy policies. The results of the two analyses are then compared to determine consistency. We evaluate the approach on a dataset of 70 mini-apps. In addition, we analyze 100,000 mini-apps collected in the wild from the WeChat client, of which 3,369 have privacy policies. Only 11 of these meet the consistency requirements, while 3,358 exhibit inconsistencies, an inconsistency rate of 99.7%.
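The comparison step described in the abstract can be illustrated with a minimal sketch. This is not MiniDetector's implementation: the `DataPractice` record, the category names ("undisclosed", "unexercised"), and the example practices are hypothetical simplifications chosen for illustration, and the paper's actual data types and consistency patterns are not given in the abstract. The sketch assumes that both analyses yield sets of (action, data type) pairs, and it also reproduces the reported inconsistency rate from the counts stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPractice:
    """Hypothetical record: what is done ("collect", "share") to which data type."""
    action: str
    data_type: str


def compare_practices(code_practices, policy_practices):
    """Compare practices found in code with practices declared in the policy.

    Returns three sets (category names are illustrative assumptions):
      consistent  - present in both code and policy
      undisclosed - present in code but missing from the policy
      unexercised - declared in the policy but not found in code
    """
    code_set = set(code_practices)
    policy_set = set(policy_practices)
    return {
        "consistent": code_set & policy_set,
        "undisclosed": code_set - policy_set,
        "unexercised": policy_set - code_set,
    }


def inconsistency_rate(inconsistent, total):
    """Percentage of analyzed mini-apps that show at least one inconsistency."""
    return 100.0 * inconsistent / total


if __name__ == "__main__":
    # Made-up example inputs, not taken from the paper's dataset.
    code = [DataPractice("collect", "location"), DataPractice("collect", "phone number")]
    policy = [DataPractice("collect", "location"), DataPractice("share", "location")]
    for category, practices in compare_practices(code, policy).items():
        print(category, sorted(p.data_type for p in practices))

    # Counts reported in the abstract: 3,358 inconsistent out of 3,369 with policies.
    print(f"{inconsistency_rate(3358, 3369):.1f}%")
```

Exact set matching is the coarsest possible comparison; the granularity issues the abstract mentions (for example, a policy that names a broad category while the code accesses a specific data type) would need a richer matching rule than equality.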
About the journal:
IEEE Transactions on Software Engineering seeks contributions comprising well-defined theoretical results and empirical studies with potential impacts on software construction, analysis, or management. The scope of this Transactions extends from fundamental mechanisms to the development of principles and their application in specific environments. Specific topic areas include:
a) Development and maintenance methods and models: Techniques and principles for specifying, designing, and implementing software systems, encompassing notations and process models.
b) Assessment methods: Software tests, validation, reliability models, test and diagnosis procedures, software redundancy, design for error control, and measurements and evaluation of process and product aspects.
c) Software project management: Productivity factors, cost models, schedule and organizational issues, and standards.
d) Tools and environments: Specific tools, integrated tool environments, associated architectures, databases, and parallel and distributed processing issues.
e) System issues: Hardware-software trade-offs.
f) State-of-the-art surveys: Syntheses and comprehensive reviews of the historical development within specific areas of interest.