Title: Protect user anonymity in query log
Authors: Anissa Mimi, S. N. Bahloul
Venue: 2010 International Conference on Machine and Web Intelligence
Publication date: 2010-11-29
DOI: https://doi.org/10.1109/ICMWI.2010.5648160
Citations: 3
Abstract
Query logs give the research community a large amount of data that reflects the natural behavior of users on the web. These data are valuable but also pose risks to user privacy. Their use raises a tension: query log owners are concerned about the security of their customers, while academic, governmental, and commercial researchers are interested in acquiring a significant amount of data for their research. The challenge is to protect user privacy by reducing the potential risks without depleting the utility of the query log. In this paper we give an overview of the query log data issue and propose a solution to ensure user anonymity. The solution is based on replacing terms that are linked to user identity. We think that replacing identifying terms, such as names of persons and names of places, with a meaningful substitute improves protection of the user's identity by increasing the number of possible identities for that user and, at the same time, guarantees more utility than deleting the identifying terms.
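
The contrast between replacing and deleting identifying terms can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how identifying terms are detected or which substitutes are chosen, so the `SUBSTITUTES` table, the category labels, and the function names `generalize` and `suppress` are all hypothetical.

```python
import re

# Hypothetical lookup table (an assumption for illustration): maps an
# identifying term to a broader substitute. The paper does not specify
# the detection method or the substitutes it uses.
SUBSTITUTES = {
    "john smith": "PERSON",
    "oran": "CITY",
    "algiers": "CITY",
}


def _build_pattern(terms):
    # Try longer terms first so multi-word names win over shorter ones.
    ordered = sorted(terms, key=len, reverse=True)
    alternatives = "|".join(re.escape(term) for term in ordered)
    return re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)


PATTERN = _build_pattern(SUBSTITUTES)


def generalize(query):
    """Replace each identifying term with its broader substitute."""
    return PATTERN.sub(lambda m: SUBSTITUTES[m.group(0).lower()], query)


def suppress(query):
    """Baseline for comparison: delete identifying terms outright."""
    return " ".join(PATTERN.sub("", query).split())


if __name__ == "__main__":
    for q in ["John Smith dentist Oran", "hotels in Algiers"]:
        print(q, "|", generalize(q), "|", suppress(q))
```

In this sketch, queries that mention different cities map to the same generalized query, so more users could plausibly have issued it, while the substitute still records that a place was searched for. Deleting the term loses that information.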