{"title":"Is Your Policy Compliant?: A Deep Learning-based Empirical Study of Privacy Policies' Compliance with GDPR","authors":"Tamjid Al Rahat, Minjun Long, Yuan Tian","doi":"10.1145/3559613.3563195","DOIUrl":null,"url":null,"abstract":"Since the General Data Protection Regulation (GDPR) came into force in May 2018, companies have worked on their data practices to comply with the requirements of GDPR. In particular, since the privacy policy is the essential communication channel for users to understand and control their privacy when using companies' services, many companies updated their privacy policies after GDPR was enforced. However, most privacy policies are verbose, full of jargon, and vaguely describe companies' data practices and users' rights. In addition, our study shows that more than 32% of end users find it difficult to understand the privacy policies explaining GDPR requirements. Therefore, it is challenging for the end users and law enforcement authorities to manually check if companies' privacy policies comply with the requirements enforced by GDPR. In this paper, we create a privacy policy dataset of 1,080 websites annotated by experts with 18 GDPR requirements and develop a Convolutional Neural Network (CNN) based model that can classify the privacy policies into GDPR requirements with an accuracy of 89.2%. We apply our model to automatically measure GDPR compliance in the privacy policies of 9,761 most visited websites. Our results show that, even after four years since GDPR went into effect, 68% of websites still fail to comply with at least one requirement of GDPR.","PeriodicalId":416548,"journal":{"name":"Proceedings of the 21st Workshop on Privacy in the Electronic Society","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 21st Workshop on Privacy in the Electronic Society","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3559613.3563195","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
Since the General Data Protection Regulation (GDPR) came into force in May 2018, companies have worked on their data practices to comply with the requirements of GDPR. In particular, since the privacy policy is the essential communication channel for users to understand and control their privacy when using companies' services, many companies updated their privacy policies after GDPR was enforced. However, most privacy policies are verbose, full of jargon, and vaguely describe companies' data practices and users' rights. In addition, our study shows that more than 32% of end users find it difficult to understand the privacy policies explaining GDPR requirements. Therefore, it is challenging for the end users and law enforcement authorities to manually check if companies' privacy policies comply with the requirements enforced by GDPR. In this paper, we create a privacy policy dataset of 1,080 websites annotated by experts with 18 GDPR requirements and develop a Convolutional Neural Network (CNN) based model that can classify the privacy policies into GDPR requirements with an accuracy of 89.2%. We apply our model to automatically measure GDPR compliance in the privacy policies of 9,761 most visited websites. Our results show that, even after four years since GDPR went into effect, 68% of websites still fail to comply with at least one requirement of GDPR.