Towards strong regret minimization sets: Balancing freshness and diversity in data selection
Authors: Hongjie Guo, Jianzhong Li, Hong Gao
Journal: Theoretical Computer Science, volume 1026, Article 114986
Publication date: 2024-11-26
DOI: 10.1016/j.tcs.2024.114986
URL: https://www.sciencedirect.com/science/article/pii/S0304397524006030
Impact factor: 0.9 (JCR Q3, Computer Science, Theory & Methods)
Citation count: 0
Abstract
Multi-criteria decision-making typically requires selecting a concise, representative set from a large database. Regret minimization set (RMS) queries have emerged as a way to avoid the need for a utility function in top-k queries and to rein in the large result sets produced by skyline queries. However, traditional RMS formulations ensure only one result under any utility function and do not account for the diversity and freshness of results. This study introduces the strong regret minimization set (SRMS), which ensures the utility value accuracy of k selected data points under any utility function while incorporating result diversity and freshness. We explore two new computational problems: the Minimum Size problem, which seeks to reduce the result set size under a bounded utility error, and the Max-sum Diversity and Freshness problem, which seeks to optimize the diversity and freshness of the selected set. We prove that both problems are NP-hard and develop approximation algorithms for them. Experimental results on both real-world and synthetic data show that the proposed algorithms are efficient and scalable.
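To make the regret notion concrete, the following is a minimal sketch of the classical RMS regret ratio, not the paper's SRMS definition, which the abstract does not spell out. It assumes non-negative linear utility functions and estimates the worst case by random sampling; the function names, the sampling scheme, and the toy data are all illustrative assumptions.

```python
import random

def utility(point, weights):
    # Assumed linear utility: weighted sum of attribute values.
    return sum(w * x for w, x in zip(weights, point))

def regret_ratio(database, subset, weights):
    # Relative utility loss from seeing only `subset` instead of `database`
    # (classical single-best-point RMS notion, not the SRMS variant).
    best_all = max(utility(p, weights) for p in database)
    best_sub = max(utility(p, weights) for p in subset)
    return (best_all - best_sub) / best_all if best_all > 0 else 0.0

def max_regret_ratio(database, subset, num_samples=1000, seed=0):
    # Estimate the worst case over sampled non-negative weight vectors.
    # Sampling gives a lower bound on the true maximum, not an exact value.
    rng = random.Random(seed)
    dim = len(database[0])
    worst = 0.0
    for _ in range(num_samples):
        weights = [rng.random() for _ in range(dim)]
        worst = max(worst, regret_ratio(database, subset, weights))
    return worst

# Hypothetical two-attribute toy data.
database = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6), (0.2, 0.3)]
subset = [(1.0, 0.0), (0.0, 1.0)]
estimate = max_regret_ratio(database, subset)
```

In this toy data the subset omits the balanced point (0.6, 0.6), so a user who weights both attributes equally loses the most utility. A strong variant in the spirit of the abstract would presumably compare k selected points rather than only the single best one, but that extension is not sketched here because the abstract does not define it.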
About the journal:
Theoretical Computer Science is mathematical and abstract in spirit, but it derives its motivation from practical and everyday computation. Its aim is to understand the nature of computation and, as a consequence of this understanding, provide more efficient methodologies. All papers introducing or studying mathematical, logic and formal concepts and methods are welcome, provided that their motivation is clearly drawn from the field of computing.