Throughout our digital lives, we receive recommendations for almost everything we do, buy, or consume. Accordingly, the field of recommender systems has been evolving rapidly to keep pace with growing user needs. News, products, ideas, and people are only a few of the things that can be recommended to us daily. However, even after many years of research, several areas remain unexplored. This paper focuses on one such area, namely how to achieve diversity in single-user and group recommendations. Specifically, we decouple diversity from strictly revolving around items and treat it as an orthogonal dimension that can be incorporated independently at different points in the recommender's workflow. We consider various definitions of diversity, taking into account either data item or user characteristics, and study how to cope with them, depending on whether we opt for diversity-aware single-user or group recommendations.
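As a rough illustration of treating diversity as a pluggable step in the recommendation workflow, the sketch below greedily re-ranks an already scored candidate list, trading relevance against pairwise dissimilarity. It is a generic Maximal-Marginal-Relevance-style heuristic over assumed toy items and a genre-based distance, not the formulation proposed in the paper.

```python
# Minimal sketch: greedy diversity-aware re-ranking of a candidate list.
# Illustrates diversity as a step plugged into a recommender's workflow;
# NOT the paper's specific formulation.

def rerank_with_diversity(candidates, relevance, distance, k, lam=0.5):
    """Pick k items, trading off relevance against pairwise dissimilarity.

    candidates: list of item ids
    relevance:  dict item -> relevance score from the base recommender
    distance:   function (item, item) -> dissimilarity in [0, 1]
    lam:        weight on relevance (1 - lam goes to diversity)
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def score(item):
            if not selected:
                return relevance[item]
            div = min(distance(item, s) for s in selected)
            return lam * relevance[item] + (1 - lam) * div
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy usage with hypothetical items and a Jaccard distance over genres.
items = ["m1", "m2", "m3", "m4"]
rel = {"m1": 0.9, "m2": 0.85, "m3": 0.6, "m4": 0.55}
genres = {"m1": {"action"}, "m2": {"action"}, "m3": {"drama"}, "m4": {"comedy"}}
dist = lambda a, b: 1 - len(genres[a] & genres[b]) / len(genres[a] | genres[b])
print(rerank_with_diversity(items, rel, dist, k=3))  # ['m1', 'm3', 'm4']
```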
{"title":"On Achieving Diversity in Recommender Systems","authors":"Marialena Kyriakidi, K. Stefanidis, Y. Ioannidis","doi":"10.1145/3077331.3077341","DOIUrl":"https://doi.org/10.1145/3077331.3077341","url":null,"abstract":"Throughout our digital lives, we are getting recommendations for about almost everything we do, buy or consume. In that way, the field of recommender systems has been evolving vastly to match the increasing user needs accordingly. News, products, ideas and people are only a few of the things that we can be recommended with daily. However, even with the many years of research, several areas still remain unexplored. The focus of this paper revolves around such an area, namely on how to achieve diversity in single-user and group recommendations. Specifically, we decouple diversity from strictly revolving around items, and consider it as an orthogonal dimension that can be incorporated independently at different times in the recommender's workflow. We consider various definitions of diversity, taking into account either data items or users characteristics, and study how to cope with them, depending on whether we opt at diversity-aware single-user or group recommendations.","PeriodicalId":92430,"journal":{"name":"Proceedings of the ExploreDB'17. International Workshop on Exploratory Search in Databases and the Web (4th : 2017 : Chicago, Ill.)","volume":"74 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91233064","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Varvara Kalokyri, Alexander Borgida, A. Marian, Daniela Vianna
A large number of personal digital traces are constantly generated or made available online from a variety of sources, such as social media, calendars, and purchase histories. These personal data traces are fragmented and highly heterogeneous, raising the need for an integrated view of the user's activities. Prior research in Personal Information Management has focused mostly on creating a static model of the world (objects and their relationships). We argue that a dynamic world view is also helpful for making sense of collections of related personal documents, and propose a partial solution based on scripts -- a theoretically well-founded idea in AI and Cognitive Science. Scripts are stereotypical hierarchical plans for everyday activities, involving interactions between mostly social agents. We augment these plans with hints of the digital traces they can leave. By connecting personal digital traces through scripts, we can build an episodic view of users' digital memories, which allows users to explore related events and actions in an integrated way. The paper uses the Eating_Out script for illustration, and ends with a report on the results of a case study applying a prototype implementation to real user data.
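The following is a minimal, hypothetical sketch of how a script could be encoded as a hierarchical plan whose steps carry hints about the digital traces they may leave. The class and field names, and the particular Eating_Out steps, are illustrative assumptions rather than the paper's actual model.

```python
# Hypothetical encoding of a script: a hierarchical plan whose steps are
# annotated with the digital traces they are likely to leave behind.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Step:
    name: str
    trace_hints: List[str] = field(default_factory=list)  # likely evidence sources
    substeps: List["Step"] = field(default_factory=list)

# Sketch of the Eating_Out script mentioned in the abstract (steps assumed).
eating_out = Step("Eating_Out", substeps=[
    Step("choose_restaurant", trace_hints=["yelp_search", "maps_query"]),
    Step("make_reservation", trace_hints=["confirmation_email", "calendar_event"]),
    Step("travel_to_restaurant", trace_hints=["rideshare_receipt", "location_history"]),
    Step("pay_bill", trace_hints=["credit_card_charge"]),
    Step("share_experience", trace_hints=["social_media_post", "restaurant_review"]),
])

def all_hints(step):
    """Collect every trace hint in the script, depth first."""
    hints = list(step.trace_hints)
    for sub in step.substeps:
        hints.extend(all_hints(sub))
    return hints

print(all_hints(eating_out))
```

Matching actual traces (emails, receipts, calendar entries) against such hints is what would let related documents be grouped into one episodic memory of the activity.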
{"title":"Integration and Exploration of Connected Personal Digital Traces","authors":"Varvara Kalokyri, Alexander Borgida, A. Marian, Daniela Vianna","doi":"10.1145/3077331.3077337","DOIUrl":"https://doi.org/10.1145/3077331.3077337","url":null,"abstract":"A large number of personal digital traces is constantly generated or available online from a variety of sources, such as social media, calendars, purchase history, etc. These personal data traces are fragmented and highly heterogeneous, raising the need for an integrated view of the user's activities. Prior research in Personal Information Management focused mostly on creating a static model of the world (objects and their relationships). We argue that a dynamic world view is also helpful for making sense of collections of related personal documents, and propose a partial solution based on scripts -- a theoretically well-founded idea in AI and Cognitive Science. Scripts are stereotypical hierarchical plans for everyday activities, involving interactions between mostly social agents. We augment these with hints of the digital traces that they can leave. By connecting Personal Digital Traces through scripts, we can build an episodic view of users' digital memories, which allow users to explore related events and actions in an integrated way. The paper uses the Eating_Out script for illustration, and ends with a report on the results of a case-study of applying a prototype implementation on real user data.","PeriodicalId":92430,"journal":{"name":"Proceedings of the ExploreDB'17. International Workshop on Exploratory Search in Databases and the Web (4th : 2017 : Chicago, Ill.)","volume":"31 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81417420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Joan Guisado-Gámez, Arnau Prat-Pérez, J. Larriba-Pey
The search for relevant information can be very frustrating for users who, unintentionally, use inappropriate keywords to express their needs. Expansion techniques aim to transform users' queries by adding new terms, called expansion features, that better describe the users' real intent. We propose Structural Query Expansion (SQE), a method that relies on relevant structures found in knowledge bases (KBs), rather than on semantics, to extract the expansion features. In this paper we use Wikipedia, as it is probably the largest source of up-to-date information. SQE achieves more than 150% improvement over non-expanded queries and identifies the expansion features in less than 0.2 seconds in the worst case. SQE is designed as an orthogonal method that can be combined with other expansion techniques, such as pseudo-relevance feedback.
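To make the idea of structure-based expansion concrete, the sketch below uses a tiny in-memory stand-in for Wikipedia's article/category structure and adds, as expansion features, articles that share a category with the query terms. The motif used here is a deliberate simplification and not SQE's actual motif definition.

```python
# Toy structure-based query expansion: expansion features come from a small
# graph standing in for Wikipedia's article/category structure, not from
# term semantics. The "shared category" motif is an assumed simplification.

# Hypothetical miniature knowledge-base structure: article -> categories.
categories = {
    "jaguar":    {"felines", "cars"},
    "leopard":   {"felines"},
    "cheetah":   {"felines"},
    "jaguar_xk": {"cars"},
}

def expand(query_terms, kb, max_features=3):
    """Return query terms plus articles that share a category with them."""
    query_cats = set()
    for term in query_terms:
        query_cats |= kb.get(term, set())
    scored = []
    for article, cats in kb.items():
        if article in query_terms:
            continue
        overlap = len(cats & query_cats)
        if overlap:
            scored.append((overlap, article))
    scored.sort(reverse=True)
    return list(query_terms) + [a for _, a in scored[:max_features]]

print(expand(["jaguar"], categories))  # adds structurally related articles
```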
{"title":"Structural Query Expansion via motifs from Wikipedia","authors":"Joan Guisado-Gámez, Arnau Prat-Pérez, J. Larriba-Pey","doi":"10.1145/3077331.3077342","DOIUrl":"https://doi.org/10.1145/3077331.3077342","url":null,"abstract":"The search for relevant information can be very frustrating for users who, unintentionally, use inappropriate keywords to express their needs. Expansion techniques aim at transforming the users' queries by adding new terms, called expansion features, that better describe the real users' intent. We propose Structural Query Expansion (SQE), a method that relies on relevant structures found in knowledge bases (KBs) to extract the expansion features as opposed to the use of semantics. In the particular case of this paper, we use Wikipedia because it is probably the largest source of up-to-date information. SQE is capable of achieving more than 150% improvement over non-expanded queries and is able to identify the expansion features in less than 0.2 seconds in the worst-case scenario. SQE is designed as an orthogonal method that can be combined with other expansion techniques, such as pseudo-relevance feedback.","PeriodicalId":92430,"journal":{"name":"Proceedings of the ExploreDB'17. International Workshop on Exploratory Search in Databases and the Web (4th : 2017 : Chicago, Ill.)","volume":"17 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82589463","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tobias Bleifuß, T. Johnson, D. Kalashnikov, Felix Naumann, Vladislav Shkapenyuk, D. Srivastava
Data and metadata suffer many kinds of change: values are inserted, deleted, or updated; entities appear and disappear; properties are added or re-purposed; and so on. Explicitly recognizing, exploring, and evaluating such change can alert analysts to shifts in data ingestion procedures, help assess data quality, and improve the general understanding of a dataset and its behavior over time. We propose a data-model-independent framework to formalize such change. Our change-cube enables the exploration and discovery of changes, revealing dataset behavior over time.
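A minimal sketch of the underlying idea, assuming a change is recorded as a small (time, entity, property, old, new) record and exploration amounts to slicing and aggregating over such records; the paper's change-cube may define its dimensions differently.

```python
# Assumed change record: one row per observed change, independent of the
# underlying data model. Exploration = filtering/grouping over these rows.

from collections import namedtuple, Counter

Change = namedtuple("Change", ["time", "entity", "property", "old", "new"])

log = [
    Change("2017-01-02", "customer:42", "email", "a@x.org", "b@x.org"),
    Change("2017-01-03", "customer:42", "city", None, "Chicago"),     # value inserted
    Change("2017-02-01", "customer:7",  "city", "Boston", None),      # value deleted
    Change("2017-02-01", "customer:7",  "email", "c@x.org", "d@x.org"),
]

# One simple "slice": how many changes per property?
print(Counter(c.property for c in log))

# Another slice: the change history of a single entity over time.
print([c for c in log if c.entity == "customer:42"])
```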
{"title":"Enabling Change Exploration: Vision Paper","authors":"Tobias Bleifuß, T. Johnson, D. Kalashnikov, Felix Naumann, Vladislav Shkapenyuk, D. Srivastava","doi":"10.1145/3077331.3077340","DOIUrl":"https://doi.org/10.1145/3077331.3077340","url":null,"abstract":"Data and metadata suffer many different kinds of change: values are inserted, deleted or updated; entities appear and disappear; properties are added or re-purposed, etc. Explicitly recognizing, exploring, and evaluating such change can alert to changes in data ingestion procedures, can help assess data quality, and can improve the general understanding of the dataset and its behavior over time. We propose a data model-independent framework to formalize such change. Our change-cube enables exploration and discovery of such changes to reveal dataset behavior over time.","PeriodicalId":92430,"journal":{"name":"Proceedings of the ExploreDB'17. International Workshop on Exploratory Search in Databases and the Web (4th : 2017 : Chicago, Ill.)","volume":"28 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82334926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Daniel Petrov, Rakan Alseghayer, M. Sharaf, Panos K. Chrysanthis, Alexandros Labrinidis
The rapid growth of monitoring applications has led to unprecedented amounts of generated time series data. Data analysts typically explore such large volumes of time series data looking for valuable insights. One such insight is finding pairs of time series in which subsequences of values exhibit certain levels of correlation. However, since exploratory queries tend to be initially vague and imprecise, an analyst will typically use the results of one query as a springboard for formulating a new one in which the correlation specifications are further refined. As such, it is essential to provide analysts with quick initial results to their exploratory queries, which speeds up the refinement process. This goal is challenging when exploring correlation in a large search space consisting of many long time series. In this work we propose search algorithms that address precisely this challenge. The main idea underlying our work is to design priority-based search algorithms that efficiently navigate the rather large space to quickly find the initial results of an exploratory query. Our experimental results show that our algorithms outperform existing ones and enable a high degree of interactivity in exploring large time series data.
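A minimal sketch of priority-based exploration under assumed inputs: candidate window pairs are pushed onto a priority queue keyed by a cheap proxy (correlation of a downsampled window), so that promising pairs are verified first and early answers reach the analyst quickly. The paper's actual priority functions and pruning are more refined than this.

```python
# Best-first search for correlated subsequence pairs. The priority is a crude
# proxy (correlation over a downsampled window); it is an assumption here,
# not the paper's priority function.

import heapq
import statistics

def pearson(x, y):
    mx, my = statistics.fmean(x), statistics.fmean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0

def correlated_windows(series, window, threshold):
    """Yield (series_i, series_j, start) with |corr| >= threshold, best-first."""
    heap = []
    for i in range(len(series)):
        for j in range(i + 1, len(series)):
            for start in range(0, len(series[i]) - window + 1, window):
                xi = series[i][start:start + window]
                xj = series[j][start:start + window]
                proxy = abs(pearson(xi[::2], xj[::2]))  # cheap coarse estimate
                heapq.heappush(heap, (-proxy, i, j, start))
    while heap:                       # pop most promising pair first
        _, i, j, start = heapq.heappop(heap)
        xi = series[i][start:start + window]
        xj = series[j][start:start + window]
        if abs(pearson(xi, xj)) >= threshold:
            yield i, j, start

s = [[1, 2, 3, 4, 5, 6, 7, 8],
     [2, 4, 6, 8, 10, 12, 14, 16],
     [8, 1, 7, 2, 6, 3, 5, 4]]
print(list(correlated_windows(s, window=8, threshold=0.9)))  # [(0, 1, 0)]
```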
{"title":"Interactive Exploration of Correlated Time Series","authors":"Daniel Petrov, Rakan Alseghayer, M. Sharaf, Panos K. Chrysanthis, Alexandros Labrinidis","doi":"10.1145/3077331.3077335","DOIUrl":"https://doi.org/10.1145/3077331.3077335","url":null,"abstract":"The rapid growth of monitoring applications has led to unprecedented amounts of generated time series data. Data analysts typically explore such large volumes of time series data looking for valuable insights. One such insight is finding pairs of time series, in which subsequences of values exhibit certain levels of correlation. However, since exploratory queries tend to be initially vague and imprecise, an analyst will typically use the results of one query as a springboard to formulating a new one, in which the correlation specifications are further refined. As such, it is essential to provide analysts with quick initial results to their exploratory queries, which allows for speeding up the refinement process. This goal is challenging when exploring the correlation in a large search space that consists of a big number of long time series. In this work we propose search algorithms that address precisely that challenge. The main idea underlying our work is to design priority-based search algorithms that efficiently navigate the rather large space to quickly find the initial results of an exploratory query. Our experimental results show that our algorithms outperform existing ones and enable high degree of interactivity in exploring large time series data.","PeriodicalId":92430,"journal":{"name":"Proceedings of the ExploreDB'17. International Workshop on Exploratory Search in Databases and the Web (4th : 2017 : Chicago, Ill.)","volume":"10 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86192684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Similarity searches are at the heart of exploratory data analysis tasks. Distance metrics are typically used to characterize the similarity between data objects represented as feature vectors. However, when the dimensionality of the data increases and the number of features is large, traditional distance metrics fail to distinguish between the closest and furthest data points. Localized distance functions have been proposed as an alternative to traditional distance metrics. These functions consider only the dimensions close to the query when computing the distance/similarity. Furthermore, in order to enable interactive exploration of high-dimensional data, indexing support for ad-hoc queries is needed. In this work we set out to investigate whether bit-sliced indices can be used for exploratory analytics, such as similarity searches and data clustering, over high-dimensional big data. We also propose a novel dynamic quantization, called Query-dependent Equi-Depth (QED) quantization, and show its effectiveness in characterizing high-dimensional similarity. When applying QED, we observe improvements in kNN classification accuracy over traditional distance functions.
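The sketch below is one possible reading of query-dependent equi-depth quantization: for a given query, per-dimension gaps to the data are bucketed at quantiles computed at query time, and the bin indices replace raw distances when ranking neighbors. It is an illustrative interpretation only and does not reproduce the paper's actual QED scheme or its bit-sliced index machinery.

```python
# Assumed reading of query-dependent equi-depth quantization for kNN ranking:
# per-dimension gaps are cut at their own quantiles for THIS query, so each
# bin holds roughly n / bins points; smaller summed bin index = more similar.

import numpy as np

def qed_knn(data, query, k, bins=4):
    """Rank points by the sum of per-dimension equi-depth bin indices."""
    diffs = np.abs(data - query)                      # (n, d) per-dimension gaps
    ranks = np.empty_like(diffs, dtype=int)
    for d in range(diffs.shape[1]):
        edges = np.quantile(diffs[:, d], np.linspace(0, 1, bins + 1)[1:-1])
        ranks[:, d] = np.searchsorted(edges, diffs[:, d])
    scores = ranks.sum(axis=1)
    return np.argsort(scores)[:k]

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 16))
query = rng.normal(size=16)
print(qed_knn(data, query, k=5))
```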
ACM Reference format: Gheorghi Guzun and Guadalupe Canahuate. 2017. Supporting Dynamic Quantization for High-Dimensional Data Analytics. In Proceedings of ExploreDB'17, Chicago, IL, USA, May 14-19, 2017, 6 pages. https://doi.org/10.1145/3077331.3077336
{"title":"Supporting Dynamic Quantization for High-Dimensional Data Analytics.","authors":"Gheorghi Guzun, Guadalupe Canahuate","doi":"10.1145/3077331.3077336","DOIUrl":"https://doi.org/10.1145/3077331.3077336","url":null,"abstract":"<p><p>Similarity searches are at the heart of exploratory data analysis tasks. Distance metrics are typically used to characterize the similarity between data objects represented as feature vectors. However, when the dimensionality of the data increases and the number of features is large, traditional distance metrics fail to distinguish between the closest and furthest data points. Localized distance functions have been proposed as an alternative to traditional distance metrics. These functions only consider dimensions close to query to compute the distance/similarity. Furthermore, in order to enable interactive explorations of high-dimensional data, indexing support for ad-hoc queries is needed. In this work we set up to investigate whether bit-sliced indices can be used for exploratory analytics such as similarity searches and data clustering for high-dimensional big-data. We also propose a novel dynamic quantization called Query dependent Equi-Depth (QED) quantization and show its effectiveness on characterizing high-dimensional similarity. When applying QED we observe improvements in kNN classification accuracy over traditional distance functions.</p><p><strong>Acm reference format: </strong>Gheorghi Guzun and Guadalupe Canahuate. 2017. Supporting Dynamic Quantization for High-Dimensional Data Analytics. In Proceedings of Ex-ploreDB'17, Chicago, IL, USA, May 14-19, 2017, 6 pages. https://doi.org/http://dx.doi.org/10.1145/3077331.3077336.</p>","PeriodicalId":92430,"journal":{"name":"Proceedings of the ExploreDB'17. International Workshop on Exploratory Search in Databases and the Web (4th : 2017 : Chicago, Ill.)","volume":"2017 ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/3077331.3077336","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"36096037","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proceedings of the ExploreDB'17","authors":"","doi":"10.1145/3077331","DOIUrl":"https://doi.org/10.1145/3077331","url":null,"abstract":"","PeriodicalId":92430,"journal":{"name":"Proceedings of the ExploreDB'17. International Workshop on Exploratory Search in Databases and the Web (4th : 2017 : Chicago, Ill.)","volume":"14 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81357705","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}