Efficient Probabilistic Truss Indexing on Uncertain Graphs
Zitang Sun, Xin Huang, Jianliang Xu, F. Bonchi
DOI: 10.1145/3442381.3449976

Networks in many real-world applications come with inherent uncertainty in their structure, due to, e.g., noisy measurements, inference and prediction models, or privacy protection. Modeling and analyzing uncertain graphs has attracted a great deal of attention. Among the various graph analytic tasks studied, the extraction of dense substructures, such as cores or trusses, plays a central role. In this paper, we study the problem of (k, γ)-truss indexing and querying over an uncertain graph. A (k, γ)-truss is the largest subgraph such that the probability of each edge being contained in at least k − 2 triangles is no less than γ. Our first proposal, the CPT-index, keeps all the (k, γ)-trusses: retrieval for any given k and γ runs in optimal linear time w.r.t. the size of the queried (k, γ)-truss. We develop a bottom-up CPT-index construction scheme and an improved algorithm for fast CPT-index construction using top-down graph partitions. To trade off between (k, γ)-truss offline indexing and online querying, we further develop an approximate indexing approach, (ϵ, Δr)-APX, equipped with two parameters, ϵ and Δr, that govern the tolerated errors. Extensive experiments on large-scale uncertain graphs with 261 million edges validate the efficiency of our proposed indexing and querying algorithms against state-of-the-art methods.
{"title":"Efficient Probabilistic Truss Indexing on Uncertain Graphs","authors":"Zitang Sun, Xin Huang, Jianliang Xu, F. Bonchi","doi":"10.1145/3442381.3449976","DOIUrl":"https://doi.org/10.1145/3442381.3449976","url":null,"abstract":"Networks in many real-world applications come with an inherent uncertainty in their structure, due to e.g., noisy measurements, inference and prediction models, or for privacy purposes. Modeling and analyzing uncertain graphs has attracted a great deal of attention. Among the various graph analytic tasks studied, the extraction of dense substructures, such as cores or trusses, has a central role. In this paper, we study the problem of (k, γ)-truss indexing and querying over an uncertain graph . A (k, γ)-truss is the largest subgraph of , such that the probability of each edge being contained in at least k − 2 triangles is no less than γ. Our first proposal, CPT-index, keeps all the (k, γ)-trusses: retrieval for any given k and γ can be executed in an optimal linear time w.r.t. the graph size of the queried (k, γ)-truss. We develop a bottom-up CPT-indexconstruction scheme and an improved algorithm for fast CPT-indexconstruction using top-down graph partitions. For trading off between (k, γ)-truss offline indexing and online querying, we further develop an approximate indexing approach (ϵ, Δr)-APXequipped with two parameters, ϵ and Δr, that govern tolerated errors. Extensive experiments using large-scale uncertain graphs with 261 million edges validate the efficiency of our proposed indexing and querying algorithms against state-of-the-art methods.","PeriodicalId":106672,"journal":{"name":"Proceedings of the Web Conference 2021","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114540927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Unsupervised Lifelong Learning with Curricula
Yi He, Sheng Chen, Baijun Wu, Xu Yuan, Xindong Wu
DOI: 10.1145/3442381.3449839

Lifelong machine learning (LML) has driven the development of extensive web applications, enabling learning systems deployed on web servers to deal with a sequence of tasks in an incremental fashion. Such systems can retain knowledge from learned tasks in a knowledge base and seamlessly apply it to improve future learning. Unfortunately, most existing LML methods require labels in every task, whereas providing persistent human labeling for all future tasks is costly, onerous, error-prone, and hence impractical. Motivated by this situation, we propose a new paradigm named unsupervised lifelong learning with curricula (ULLC), where only one task needs to be labeled for initialization and the system then performs lifelong learning for subsequent tasks in an unsupervised fashion. A main challenge in realizing this paradigm lies in negative knowledge transfer, where partial old knowledge becomes detrimental for learning a given task yet cannot be filtered out by the learner without the help of labels. To overcome this challenge, we draw insights from the learning behaviors of humans. Specifically, when faced with a difficult task that cannot be well tackled with our current knowledge, we usually postpone it and work on some easier tasks first, which allows us to grow our knowledge. Once we return to the postponed task, we are more likely to tackle it well as we are more knowledgeable now. The key idea of ULLC is similar: at any time, a pool of candidate tasks is organized into a curriculum by their distances to the knowledge base. The learner starts from the closer tasks, accumulates knowledge from learning them, and moves on to the faraway tasks with a gradually augmented knowledge base. The viability and effectiveness of our proposal are substantiated through extensive empirical studies on both synthetic and real datasets.
{"title":"Unsupervised Lifelong Learning with Curricula","authors":"Yi He, Sheng Chen, Baijun Wu, Xu Yuan, Xindong Wu","doi":"10.1145/3442381.3449839","DOIUrl":"https://doi.org/10.1145/3442381.3449839","url":null,"abstract":"Lifelong machine learning (LML) has driven the development of extensive web applications, enabling the learning systems deployed on web servers to deal with a sequence of tasks in an incremental fashion. Such systems can retain knowledge from learned tasks in a knowledge base and seamlessly apply it to improve the future learning. Unfortunately, most existing LML methods require labels in every task, whereas providing persistent human labeling for all future tasks is costly, onerous, error-prone, and hence impractical. Motivated by this situation, we propose a new paradigm named unsupervised lifelong learning with curricula (ULLC), where only one task needs to be labeled for initialization and the system then performs lifelong learning for subsequent tasks in an unsupervised fashion. A main challenge of realizing this paradigm lies in the occurrence of negative knowledge transfer, where partial old knowledge becomes detrimental for learning a given task yet cannot be filtered out by the learner without the help of labels. To overcome this challenge, we draw insights from the learning behaviors of humans. Specifically, when faced with a difficult task that cannot be well tackled by our current knowledge, we usually postpone it and work on some easier tasks first, which allows us to grow our knowledge. Thereafter, once we go back to the postponed task, we are more likely to tackle it well as we are more knowledgeable now. The key idea of ULLC is similar – at any time, a pool of candidate tasks are organized in a curriculum by their distances to the knowledge base. The learner then starts from the closer tasks, accumulates knowledge from learning them, and moves to learn the faraway tasks with a gradually augmented knowledge base. The viability and effectiveness of our proposal are substantiated through extensive empirical studies on both synthetic and real datasets.","PeriodicalId":106672,"journal":{"name":"Proceedings of the Web Conference 2021","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114285604","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Disentangling User Interest and Conformity for Recommendation with Causal Embedding
Y. Zheng, Chen Gao, Xiang Li, Xiangnan He, Depeng Jin, Yong Li
DOI: 10.1145/3442381.3449788
Recommendation models are usually trained on observational interaction data. However, observational interactions may result from users’ conformity towards popular items, which entangles conformity with users’ real interest. Existing methods treat this problem as one of eliminating popularity bias, e.g., by re-weighting training samples or leveraging a small fraction of unbiased data. However, these approaches ignore the variety of user conformity and bundle the different causes of an interaction into unified representations, so robustness and interpretability are not guaranteed when the underlying causes change. In this paper, we present DICE, a general framework that learns representations in which interest and conformity are structurally disentangled and into which various backbone recommendation models can be smoothly integrated. We assign users and items separate embeddings for interest and conformity, and make each embedding capture only one cause by training on cause-specific data obtained according to the colliding effect in causal inference. Our proposed methodology outperforms state-of-the-art baselines with remarkable improvements on two real-world datasets on top of various backbone models. We further demonstrate that the learned embeddings successfully capture the desired causes, and show that DICE guarantees the robustness and interpretability of recommendation.
{"title":"Disentangling User Interest and Conformity for Recommendation with Causal Embedding","authors":"Y. Zheng, Chen Gao, Xiang Li, Xiangnan He, Depeng Jin, Yong Li","doi":"10.1145/3442381.3449788","DOIUrl":"https://doi.org/10.1145/3442381.3449788","url":null,"abstract":"Recommendation models are usually trained on observational interaction data. However, observational interaction data could result from users’ conformity towards popular items, which entangles users’ real interest. Existing methods tracks this problem as eliminating popularity bias, e.g., by re-weighting training samples or leveraging a small fraction of unbiased data. However, the variety of user conformity is ignored by these approaches, and different causes of an interaction are bundled together as unified representations, hence robustness and interpretability are not guaranteed when underlying causes are changing. In this paper, we present DICE, a general framework that learns representations where interest and conformity are structurally disentangled, and various backbone recommendation models could be smoothly integrated. We assign users and items with separate embeddings for interest and conformity, and make each embedding capture only one cause by training with cause-specific data which is obtained according to the colliding effect of causal inference. Our proposed methodology outperforms state-of-the-art baselines with remarkable improvements on two real-world datasets on top of various backbone models. We further demonstrate that the learned embeddings successfully capture the desired causes, and show that DICE guarantees the robustness and interpretability of recommendation.","PeriodicalId":106672,"journal":{"name":"Proceedings of the Web Conference 2021","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116501207","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Elo-MMR: A Rating System for Massive Multiplayer Competitions
Aram Ebtekar, Paul Liu
DOI: 10.1145/3442381.3450091

Skill estimation mechanisms, colloquially known as rating systems, play an important role in competitive sports and games. They provide a measure of player skill, which incentivizes competitive performances and enables balanced match-ups. In this paper, we present a novel Bayesian rating system for contests with many participants. It is widely applicable to competition formats with discrete ranked matches, such as online programming competitions, obstacle course races, and video games. The system’s simplicity allows us to prove theoretical bounds on its robustness and runtime. In addition, we show that it is incentive-compatible: a player who seeks to maximize their rating will never want to underperform. Experimentally, the rating system surpasses existing systems in prediction accuracy and computes up to an order of magnitude faster.
{"title":"Elo-MMR: A Rating System for Massive Multiplayer Competitions","authors":"Aram Ebtekar, Paul Liu","doi":"10.1145/3442381.3450091","DOIUrl":"https://doi.org/10.1145/3442381.3450091","url":null,"abstract":"Skill estimation mechanisms, colloquially known as rating systems, play an important role in competitive sports and games. They provide a measure of player skill, which incentivizes competitive performances and enables balanced match-ups. In this paper, we present a novel Bayesian rating system for contests with many participants. It is widely applicable to competition formats with discrete ranked matches, such as online programming competitions, obstacle courses races, and video games. The system’s simplicity allows us to prove theoretical bounds on its robustness and runtime. In addition, we show that it is incentive-compatible: a player who seeks to maximize their rating will never want to underperform. Experimentally, the rating system surpasses existing systems in prediction accuracy, and computes faster than existing systems by up to an order of magnitude.","PeriodicalId":106672,"journal":{"name":"Proceedings of the Web Conference 2021","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122371209","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Constructing Explainable Opinion Graphs from Reviews
Nofar Carmeli, Xiaolan Wang, Yoshihiko Suhara, S. Angelidis, Yuliang Li, Jinfeng Li, W. Tan
DOI: 10.1145/3442381.3450081
The Web is a major resource of both factual and subjective information. While there are significant efforts to organize factual information into knowledge bases, there is much less work on organizing opinions, which are abundant in subjective data, into a structured format. We present ExplainIt, a system that extracts and organizes opinions into an opinion graph, which is useful for downstream applications such as generating explainable review summaries and facilitating search over opinion phrases. In such graphs, a node represents a set of semantically similar opinions extracted from reviews, and an edge between two nodes signifies that one node explains the other. ExplainIt mines explanations with a supervised method and groups similar opinions in a weakly supervised way, before combining the opinion clusters and their explanation relationships into an opinion graph. We experimentally demonstrate that the explanation relationships in the generated opinion graph are of good quality, and our labeled datasets for explanation mining and opinion grouping are publicly available at https://github.com/megagonlabs/explainit.
{"title":"Constructing Explainable Opinion Graphs from Reviews","authors":"Nofar Carmeli, Xiaolan Wang, Yoshihiko Suhara, S. Angelidis, Yuliang Li, Jinfeng Li, W. Tan","doi":"10.1145/3442381.3450081","DOIUrl":"https://doi.org/10.1145/3442381.3450081","url":null,"abstract":"The Web is a major resource of both factual and subjective information. While there are significant efforts to organize factual information into knowledge bases, there is much less work on organizing opinions, which are abundant in subjective data, into a structured format. We present ExplainIt, a system that extracts and organizes opinions into an opinion graph, which are useful for downstream applications such as generating explainable review summaries and facilitating search over opinion phrases. In such graphs, a node represents a set of semantically similar opinions extracted from reviews and an edge between two nodes signifies that one node explains the other. ExplainIt mines explanations in a supervised method and groups similar opinions together in a weakly supervised way before combining the clusters of opinions together with their explanation relationships into an opinion graph. We experimentally demonstrate that the explanation relationships generated in the opinion graph are of good quality and our labeled datasets for explanation mining and grouping opinions are publicly available at https://github.com/megagonlabs/explainit.","PeriodicalId":106672,"journal":{"name":"Proceedings of the Web Conference 2021","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130399135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An Experimental Study to Understand User Experience and Perception Bias Occurred by Fact-checking Messages
Sungkyu (Shaun) Park, Jamie Yejean Park, Hyojin Chin, Jeong-han Kang, M. Cha
DOI: 10.1145/3442381.3450121
Fact-checking has become the de facto solution for fighting fake news online. This research brings attention to the unexpected and diminished effect of fact-checking due to cognitive biases. We conducted an experiment (66,870 decisions) comparing the change in users’ stance toward unproven claims before and after they were presented with a hypothetical fact-checking condition. We found that, first, claims tagged with the ‘Lack of Evidence’ label are perceived similarly to false information, unlike claims with other borderline labels, indicating an uncertainty-aversion bias in response to insufficient information. Second, users who initially disapprove of a claim are less likely to correct their views later than those who initially approve of the same claim when the opposite fact-checking label is shown, an indication of disapproval bias. Finally, user interviews revealed that, among borderline messages, users are more likely to share claims labeled with Divided Evidence than those labeled with Lack of Evidence, reaffirming the uncertainty-aversion bias. On average, we confirm that fact-checking helps users correct their views and reduces the circulation of falsehoods by leading them to abandon extreme views. At the same time, the presence of these two biases shows that fact-checking does not always elicit the desired user experience and that outcomes vary with the design of fact-checking messages and people’s initial views. These observations have direct implications for multiple stakeholders, including platforms, policy-makers, and online users.
{"title":"An Experimental Study to Understand User Experience and Perception Bias Occurred by Fact-checking Messages","authors":"Sungkyu (Shaun) Park, Jamie Yejean Park, Hyojin Chin, Jeong-han Kang, M. Cha","doi":"10.1145/3442381.3450121","DOIUrl":"https://doi.org/10.1145/3442381.3450121","url":null,"abstract":"Fact-checking has become the de facto solution for fighting fake news online. This research brings attention to the unexpected and diminished effect of fact-checking due to cognitive biases. We experimented (66,870 decisions) comparing the change in users’ stance toward unproven claims before and after being presented with a hypothetical fact-checked condition. We found that, first, the claims tagged with the ‘Lack of Evidence’ label are recognized similarly as false information unlike other borderline labels, indicating the presence of uncertainty-aversion bias in response to insufficient information. Second, users who initially show disapproval toward a claim are less likely to correct their views later than those who initially approve of the same claim when opposite fact-checking labels are shown — an indication of disapproval bias. Finally, user interviews revealed that users are more likely to share claims with Divided Evidence than those with Lack of Evidence among borderline messages, reaffirming the presence of uncertainty-aversion bias. On average, we confirm that fact-checking helps users correct their views and reduces the circulation of falsehoods by leading them to abandon extreme views. Simultaneously, the presence of two biases reveals that fact-checking does not always elicit the desired user experience and that the outcome varies by the design of fact-checking messages and people’s initial view. These new observations have direct implications for multiple stakeholders, including platforms, policy-makers, and online users.","PeriodicalId":106672,"journal":{"name":"Proceedings of the Web Conference 2021","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127788691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning Neural Point Processes with Latent Graphs
Qiang Zhang, Aldo Lipani, Emine Yilmaz
DOI: 10.1145/3442381.3450135

Neural point processes (NPPs) employ neural networks to capture the complicated dynamics of asynchronous event sequences. Existing NPPs feed all history events into neural networks, assuming that all event types contribute to the prediction of the target type. However, this assumption can be problematic, because in reality some event types do not contribute to the predictions of another type. To correct this defect, we learn to omit, during the formulation of an NPP, those event types that do not contribute to the prediction of a target type. Towards this end, we simultaneously consider two tasks: (1) finding the event types that contribute to predictions of the target types, and (2) learning an NPP model from event sequences. For the former, we formulate a latent graph, with event types as vertices and non-zero contributing relationships as directed edges; we then propose a probabilistic graph generator from which we sample a latent graph. For the latter, the sampled graph can be readily used as a plug-in to modify an existing NPP model. Because these two tasks are nested, we propose to optimize the model parameters through bilevel programming, and develop an efficient solution based on truncated gradient back-propagation. Experimental results on both synthetic and real-world datasets show improved performance over state-of-the-art baselines. This work removes the disturbance of non-contributing event types with the aid of a validation procedure, similar to the practice used to mitigate overfitting when training machine learning models.
{"title":"Learning Neural Point Processes with Latent Graphs","authors":"Qiang Zhang, Aldo Lipani, Emine Yilmaz","doi":"10.1145/3442381.3450135","DOIUrl":"https://doi.org/10.1145/3442381.3450135","url":null,"abstract":"Neural point processes (NPPs) employ neural networks to capture complicated dynamics of asynchronous event sequences. Existing NPPs feed all history events into neural networks, assuming that all event types contribute to the prediction of the target type. However, this assumption can be problematic because in reality some event types do not contribute to the predictions of another type. To correct this defect, we learn to omit those types of events that do not contribute to the prediction of one target type during the formulation of NPPs. Towards this end, we simultaneously consider the tasks of (1) finding event types that contribute to predictions of the target types and (2) learning a NPP model from event sequences. For the former, we formulate a latent graph, with event types being vertices and non-zero contributing relationships being directed edges; then we propose a probabilistic graph generator, from which we sample a latent graph. For the latter, the sampled graph can be readily used as a plug-in to modify an existing NPP model. Because these two tasks are nested, we propose to optimize the model parameters through bilevel programming, and develop an efficient solution based on truncated gradient back-propagation. Experimental results on both synthetic and real-world datasets show the improved performance against state-of-the-art baselines. This work removes disturbance of non-contributing event types with the aid of a validation procedure, similar to the practice to mitigate overfitting used when training machine learning models.","PeriodicalId":106672,"journal":{"name":"Proceedings of the Web Conference 2021","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117145366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cross-domain Knowledge Distillation for Retrieval-based Question Answering Systems
Cen Chen, Chengyu Wang, Minghui Qiu, D. Gao, Linbo Jin, Wang Li
DOI: 10.1145/3442381.3449814
Question Answering (QA) systems have been extensively studied in both academia and industry due to their wide real-world applications. When building such industrial-scale QA applications, we face two prominent challenges: (i) the lack of sufficient training data to learn an accurate model, and (ii) the need for high inference speed for online model serving. There are generally two ways to mitigate these problems. One is to adopt transfer learning to leverage information from other domains; the other is to distill the “dark knowledge” from a large teacher model into small student models. The former usually employs parameter-sharing mechanisms for knowledge transfer but does not utilize the “dark knowledge” of pre-trained large models. The latter usually does not consider cross-domain information. We argue that these two types of methods can complement each other. Hence, in this work, we provide a new perspective on the potential of the teacher-student paradigm to facilitate cross-domain transfer learning, where the teacher and student tasks belong to heterogeneous domains, with the goal of improving the student model’s performance in the target domain. Our framework leverages the “dark knowledge” learned by large teacher models and also uses adaptive hints to alleviate the domain differences between teacher and student models. Extensive experiments have been conducted on two text-matching tasks for retrieval-based QA systems. Results show that the proposed method outperforms competing methods, including existing state-of-the-art transfer learning methods. We have also deployed our method in an online production system and observed significant improvements over the existing approaches in terms of both accuracy and cross-domain robustness.
{"title":"Cross-domain Knowledge Distillation for Retrieval-based Question Answering Systems","authors":"Cen Chen, Chengyu Wang, Minghui Qiu, D. Gao, Linbo Jin, Wang Li","doi":"10.1145/3442381.3449814","DOIUrl":"https://doi.org/10.1145/3442381.3449814","url":null,"abstract":"Question Answering (QA) systems have been extensively studied in both academia and the research community due to their wide real-world applications. When building such industrial-scale QA applications, we are facing two prominent challenges, i.e., i) lacking a sufficient amount of training data to learn an accurate model and ii) requiring high inference speed for online model serving. There are generally two ways to mitigate the above-mentioned problems. One is to adopt transfer learning to leverage information from other domains; the other is to distill the “dark knowledge” from a large teacher model to small student models. The former usually employs parameter sharing mechanisms for knowledge transfer, but does not utilize the “dark knowledge” of pre-trained large models. The latter usually does not consider the cross-domain information from other domains. We argue that these two types of methods can be complementary to each other. Hence in this work, we provide a new perspective on the potential of the teacher-student paradigm facilitating cross-domain transfer learning, where the teacher and student tasks belong to heterogeneous domains, with the goal to improve the student model’s performance in the target domain. Our framework considers the “dark knowledge” learned from large teacher models and also leverages the adaptive hints to alleviate the domain differences between teacher and student models. Extensive experiments have been conducted on two text matching tasks for retrieval-based QA systems. Results show the proposed method has better performance than the competing methods including the existing state-of-the-art transfer learning methods. We have also deployed our method in an online production system and observed significant improvements compared to the existing approaches in terms of both accuracy and cross-domain robustness.","PeriodicalId":106672,"journal":{"name":"Proceedings of the Web Conference 2021","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115218371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
High-dimensional Sparse Embeddings for Collaborative Filtering
J. V. Balen, Bart Goethals
DOI: 10.1145/3442381.3450054

A widely adopted paradigm in the design of recommender systems is to represent users and items as vectors, often referred to as latent factors or embeddings. Embeddings can be obtained using a variety of recommendation models and served in production using a variety of data engineering solutions. Embeddings also facilitate transfer learning, where trained embeddings from one model are reused in another. In contrast, some of the best-performing collaborative filtering models today are high-dimensional linear models that do not rely on factorization and so do not produce embeddings [27, 28]. They also require pruning, amounting to a trade-off between the model size and the density of the predicted affinities. This paper argues instead for the use of high-dimensional, sparse latent factor models. We propose a new recommendation model based on a full-rank factorization of the inverse Gram matrix. The resulting high-dimensional embeddings can be made sparse while still factorizing a dense affinity matrix. We show how the embeddings combine the advantages of latent representations with the performance of high-dimensional linear models.
{"title":"High-dimensional Sparse Embeddings for Collaborative Filtering","authors":"J. V. Balen, Bart Goethals","doi":"10.1145/3442381.3450054","DOIUrl":"https://doi.org/10.1145/3442381.3450054","url":null,"abstract":"A widely adopted paradigm in the design of recommender systems is to represent users and items as vectors, often referred to as latent factors or embeddings. Embeddings can be obtained using a variety of recommendation models and served in production using a variety of data engineering solutions. Embeddings also facilitate transfer learning, where trained embeddings from one model are reused in another. In contrast, some of the best-performing collaborative filtering models today are high-dimensional linear models that do not rely on factorization, and so they do not produce embeddings [27, 28]. They also require pruning, amounting to a trade-off between the model size and the density of the predicted affinities. This paper argues for the use of high-dimensional, sparse latent factor models, instead. We propose a new recommendation model based on a full-rank factorization of the inverse Gram matrix. The resulting high-dimensional embeddings can be made sparse while still factorizing a dense affinity matrix. We show how the embeddings combine the advantages of latent representations with the performance of high-dimensional linear models.","PeriodicalId":106672,"journal":{"name":"Proceedings of the Web Conference 2021","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115442795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Incentive Mechanism for Horizontal Federated Learning Based on Reputation and Reverse Auction
Jingwen Zhang, Yuezhou Wu, Rong Pan
DOI: 10.1145/3442381.3449888

Current research on federated learning mainly focuses on joint optimization, improving efficiency and effectiveness, and protecting privacy. However, there are relatively few studies on incentive mechanisms. Most studies overlook the fact that, without profit, participants have no incentive to provide data and train models, and task requesters cannot identify and select reliable participants with high-quality data. Therefore, this paper proposes a federated learning incentive mechanism based on reputation and reverse auction theory. Participants bid for tasks, and reputation indirectly reflects their reliability and data quality. Under a limited budget, we select and reward participants by combining their reputations and bids. Theoretical analysis proves that the mechanism satisfies computational efficiency, individual rationality, budget feasibility, and truthfulness. Simulation results show the effectiveness of the mechanism.
{"title":"Incentive Mechanism for Horizontal Federated Learning Based on Reputation and Reverse Auction","authors":"Jingwen Zhang, Yuezhou Wu, Rong Pan","doi":"10.1145/3442381.3449888","DOIUrl":"https://doi.org/10.1145/3442381.3449888","url":null,"abstract":"Current research on federated learning mainly focuses on joint optimization, improving efficiency and effectiveness, and protecting privacy. However, there are relatively few studies on incentive mechanisms. Most studies fail to consider the fact that if there is no profit, participants have no incentive to provide data and training models, and task requesters cannot identify and select reliable participants with high-quality data. Therefore, this paper proposes a federated learning incentive mechanism based on reputation and reverse auction theory. Participants bid for tasks, and reputation indirectly reflects their reliability and data quality. In this federated learning program, we select and reward participants by combining the reputation and bids of the participants under a limited budget. Theoretical analysis proves that the mechanism satisfies computational efficiency, individual rationality, budget feasibility, and truthfulness. The simulation results show the effectiveness of the mechanism.","PeriodicalId":106672,"journal":{"name":"Proceedings of the Web Conference 2021","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128706275","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}