Click-through rate (CTR) prediction plays an important role in online advertising and ad recommender systems. In the past decade, maximizing CTR has been the main focus of model development, and researchers and practitioners have proposed various models and solutions to enhance the effectiveness of CTR prediction. Most of the existing literature focuses on capturing either implicit or explicit feature interactions. Although some studies successfully capture implicit interactions, explicitly extracting both low-order and high-order feature interactions remains challenging. Unnecessary and irrelevant features may increase computation time and degrade prediction performance. Furthermore, certain features may perform well with specific predictive models while underperforming with others. Feature distributions may also fluctuate due to traffic variations. Most importantly, in live production environments resources are limited, and inference time is just as crucial as training time. For all these reasons, feature selection is one of the most important factors in enhancing CTR prediction model performance. Simple filter-based feature selection algorithms neither perform well nor suffice; an effective and efficient feature selection algorithm is needed to consistently filter the most useful features during live CTR prediction. In this paper, we propose a heuristic algorithm named Neighborhood Search with Heuristic-based Feature Selection (NeSHFS) to enhance CTR prediction performance while reducing dimensionality and training time costs. We conduct comprehensive experiments on three public datasets to validate the efficiency and effectiveness of our proposed solution.
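The abstract does not spell out the search procedure, but a wrapper-style feature selection driven by neighborhood search can be sketched roughly as follows; the hill-climbing move, the logistic-regression scorer, and the AUC objective are illustrative assumptions, not the NeSHFS algorithm itself.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def score_subset(features, X_tr, y_tr, X_val, y_val):
    # Score a candidate feature subset by validation AUC, a common proxy for CTR model quality.
    model = LogisticRegression(max_iter=1000).fit(X_tr[:, features], y_tr)
    return roc_auc_score(y_val, model.predict_proba(X_val[:, features])[:, 1])

def neighborhood_search(X_tr, y_tr, X_val, y_val, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    n_features = X_tr.shape[1]
    current = list(range(n_features))                     # start from the full feature set
    best = score_subset(current, X_tr, y_tr, X_val, y_val)
    for _ in range(n_iter):
        candidate = current.copy()
        f = int(rng.integers(n_features))                 # neighbor: toggle one feature in or out
        if f in candidate and len(candidate) > 1:
            candidate.remove(f)
        elif f not in candidate:
            candidate.append(f)
        s = score_subset(candidate, X_tr, y_tr, X_val, y_val)
        if s >= best:                                     # keep improving (or equal) neighbors
            current, best = candidate, s
    return current, best

A subset selected this way can then be fed to whichever CTR model is actually deployed, which is where the dimensionality and training-time savings mentioned above would come from.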
NeSHFS: Neighborhood Search with Heuristic-based Feature Selection for Click-Through Rate Prediction. Dogukan Aksu, Ismail Hakki Toroslu, Hasan Davulcu. arXiv:2409.08703, 2024-09-13.
Over the years, Music Information Retrieval (MIR) has proposed various models pretrained on large amounts of music data. Through transfer learning, these pretrained backend models have proven effective on a broad spectrum of downstream tasks, including auto-tagging and genre classification. However, MIR papers generally do not explore the efficiency of pretrained models for Music Recommender Systems (MRS), and the Recommender Systems community tends to favour traditional end-to-end neural network learning over these models. Our research addresses this gap and evaluates the applicability of six pretrained backend models (MusicFM, Music2Vec, MERT, EncodecMAE, Jukebox, and MusiCNN) in the context of MRS. We assess their performance using three recommendation models: K-nearest neighbours (KNN), a shallow neural network, and BERT4Rec. Our findings suggest that pretrained audio representations exhibit significant performance variability between traditional MIR tasks and MRS, indicating that the valuable aspects of musical information captured by backend models may differ depending on the task. This study establishes a foundation for further exploration of pretrained audio representations to enhance music recommendation systems.
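As an illustration of how such pretrained representations can feed a simple recommender, the sketch below builds a user profile from precomputed track embeddings and retrieves the nearest unheard tracks by cosine similarity; the embedding matrix and interaction indices are assumed inputs, and this KNN-style setup is only one of the three recommendation models mentioned above.

import numpy as np

def recommend_knn(track_emb, user_history, k=10):
    # track_emb: (n_tracks, d) embeddings from a pretrained audio model (assumed precomputed)
    # user_history: indices of tracks the user has interacted with
    emb = track_emb / np.linalg.norm(track_emb, axis=1, keepdims=True)  # work in cosine space
    profile = emb[user_history].mean(axis=0)                            # simple user profile vector
    scores = emb @ profile
    scores[user_history] = -np.inf                                      # exclude already-heard tracks
    return np.argsort(-scores)[:k]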
Comparative Analysis of Pretrained Audio Representations in Music Recommender Systems. Yan-Martin Tamm, Anna Aljanaki. arXiv:2409.08987, 2024-09-13.
Hang Pan, Shuxian Bi, Wenjie Wang, Haoxuan Li, Peng Wu, Fuli Feng, Xiangnan He
Recommending items that cater solely to users' historical interests narrows users' horizons. Recent works have considered steering target users beyond their historical interests by directly adjusting the items exposed to them. However, the items recommended for direct steering might not align perfectly with the evolution of users' interests, detrimentally affecting the target users' experience. To avoid this issue, we propose a new task named Proactive Recommendation in Social Networks (PRSN), which indirectly steers users' interests by utilizing the influence of social neighbors, i.e., by adjusting the exposure of a target item to the target users' neighbors. The key to PRSN lies in answering an interventional question: what would a target user's feedback on a target item be if the item were exposed to different neighbors of the user? To answer this question, we resort to causal inference and formalize PRSN as: (1) estimating a user's potential feedback on an item under the network interference caused by the item's exposure to the user's neighbors; and (2) adjusting the exposure of a target item to the target users' neighbors to trade off steering performance against the damage to the neighbors' experience. To this end, we propose a Neighbor Interference Recommendation (NIRec) framework with two key modules: (1) an interference representation-based estimation module for modeling potential feedback; and (2) a post-learning-based optimization module that optimizes a target item's exposure by greedy search to trade off steering performance and the neighbors' experience. We conduct extensive semi-simulation experiments based on three real-world datasets, validating the steering effectiveness of NIRec.
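The post-learning optimization step can be illustrated with a small greedy sketch: given per-neighbor estimates of steering gain and experience damage (assumed here to come from the interference-aware estimation module, with a made-up trade-off weight lam), the exposure set is grown one neighbor at a time.

def greedy_exposure(neighbors, steering_gain, experience_cost, budget, lam=0.5):
    # steering_gain[n]: estimated lift in the target user's feedback if neighbor n sees the item
    # experience_cost[n]: estimated harm to neighbor n's own experience
    chosen = []
    for _ in range(budget):
        best, best_util = None, 0.0
        for n in neighbors:
            if n in chosen:
                continue
            util = steering_gain[n] - lam * experience_cost[n]  # trade off steering vs. neighbor experience
            if util > best_util:
                best, best_util = n, util
        if best is None:                                        # stop when no neighbor adds positive utility
            break
        chosen.append(best)
    return chosen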
Proactive Recommendation in Social Networks: Steering User Interest via Neighbor Influence. arXiv:2409.08934, 2024-09-13.
With the growing success of Large Language Models (LLMs) in information-seeking scenarios, search engines are now adopting generative approaches to provide answers along with in-line citations as attribution. While existing work focuses mainly on attributed question answering, in this paper we target information-seeking scenarios, which are often more challenging due to the open-ended nature of the queries and the size of the label space, i.e., the diversity of candidate attributed answers per query. We propose a reproducible framework to evaluate and benchmark attributed information seeking, using any backbone LLM and different architectural designs: (1) Generate, (2) Retrieve then Generate, and (3) Generate then Retrieve. Experiments using HAGRID, an attributed information-seeking dataset, show the impact of the different scenarios on both the correctness and attributability of answers.
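The three architectural designs can be expressed as different wirings of an LLM and a retriever; in this sketch, llm and retriever are placeholder callables, and the prompt strings and top-k interface are assumptions rather than the framework's actual API.

def generate(llm, query):
    # (1) Generate: the LLM answers directly from parametric knowledge, with no retrieved evidence.
    return llm(f"Answer with citations: {query}"), []

def retrieve_then_generate(llm, retriever, query, k=5):
    # (2) Retrieve then Generate: ground the answer in top-k retrieved passages.
    passages = retriever(query, k)
    prompt = "Answer using only these passages:\n" + "\n".join(passages) + f"\n\nQuestion: {query}"
    return llm(prompt), passages

def generate_then_retrieve(llm, retriever, query, k=5):
    # (3) Generate then Retrieve: answer first, then retrieve passages to attribute the claims.
    answer = llm(f"Answer the question: {query}")
    return answer, retriever(answer, k)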
An Evaluation Framework for Attributed Information Retrieval using Large Language Models. Hanane Djeddal, Pierre Erbacher, Raouf Toukal, Laure Soulier, Karen Pinel-Sauvagnat, Sophia Katrenko, Lynda Tamine. arXiv:2409.08014, 2024-09-12.
Many organizations rely on Threat Intelligence (TI) feeds to assess the risk associated with security threats. Due to the volume and heterogeneity of data, it is prohibitive to manually analyze the threat information available in different loosely structured TI feeds. Thus, there is a need to develop automated methods to vet and extract actionable information from TI feeds. To this end, we present a machine learning pipeline to automatically detect vulnerability exploitation from TI feeds. We first model threat vocabulary in loosely structured TI feeds using state-of-the-art embedding techniques (Doc2Vec and BERT) and then use it to train a supervised machine learning classifier to detect exploitation of security vulnerabilities. We use our approach to identify exploitation events in 191 different TI feeds. Our longitudinal evaluation shows that it is able to accurately identify exploitation events from TI feeds only using past data for training and even on TI feeds withheld from training. Our proposed approach is useful for a variety of downstream tasks such as data-driven vulnerability risk assessment.
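A minimal version of the Doc2Vec branch of this pipeline might look like the sketch below, pairing gensim embeddings with a scikit-learn classifier; the tokenized feed entries, labels, and hyperparameters are all assumed, and the BERT branch is omitted.

from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from sklearn.linear_model import LogisticRegression

def train_exploitation_detector(docs, labels):
    # docs: tokenized TI-feed entries; labels: 1 if the entry reports exploitation of a vulnerability
    tagged = [TaggedDocument(words=d, tags=[i]) for i, d in enumerate(docs)]
    d2v = Doc2Vec(tagged, vector_size=100, min_count=2, epochs=40)   # models the threat vocabulary
    X = [d2v.infer_vector(d) for d in docs]
    clf = LogisticRegression(max_iter=1000).fit(X, labels)           # supervised exploitation classifier
    return d2v, clf

def exploitation_probability(d2v, clf, doc_tokens):
    return clf.predict_proba([d2v.infer_vector(doc_tokens)])[0, 1]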
Harnessing TI Feeds for Exploitation Detection. Kajal Patel, Zubair Shafiq, Mateus Nogueira, Daniel Sadoc Menasché, Enrico Lovat, Taimur Kashif, Ashton Woiwood, Matheus Martins. arXiv:2409.07709, 2024-09-12.
Gabriel de Souza P. Moreira, Ronay Ak, Benedikt Schifferer, Mengyao Xu, Radek Osmulski, Even Oldridge
Ranking models play a crucial role in enhancing the overall accuracy of text retrieval systems. These multi-stage systems typically use either dense embedding models or sparse lexical indices to retrieve relevant passages for a given query, followed by ranking models that refine the ordering of the candidate passages by their relevance to the query. This paper benchmarks various publicly available ranking models and examines their impact on ranking accuracy. We focus on text retrieval for question-answering tasks, a common use case for Retrieval-Augmented Generation systems. Our evaluation benchmarks include models, some of which are commercially viable for industrial applications. We introduce a state-of-the-art ranking model, NV-RerankQA-Mistral-4B-v3, which achieves a significant accuracy increase of ~14% compared to pipelines with other rerankers. We also provide an ablation study comparing the fine-tuning of ranking models with different sizes, losses, and self-attention mechanisms. Finally, we discuss the challenges of text retrieval pipelines with ranking models in real-world industry applications, in particular the trade-offs among model size, ranking accuracy, and system requirements such as indexing and serving latency/throughput.
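The retrieve-then-rerank pattern evaluated here can be sketched with off-the-shelf components; the two model names below are public stand-ins, not NV-RerankQA-Mistral-4B-v3 or the paper's exact pipeline.

from sentence_transformers import SentenceTransformer, CrossEncoder, util

retriever = SentenceTransformer("all-MiniLM-L6-v2")                      # stage 1: dense retrieval
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")          # stage 2: reranking

def retrieve_and_rerank(query, passages, k_retrieve=50, k_final=5):
    p_emb = retriever.encode(passages, convert_to_tensor=True)
    q_emb = retriever.encode(query, convert_to_tensor=True)
    hits = util.semantic_search(q_emb, p_emb, top_k=k_retrieve)[0]       # candidate passages
    candidates = [passages[h["corpus_id"]] for h in hits]
    scores = reranker.predict([(query, p) for p in candidates])          # query-passage relevance scores
    return sorted(zip(candidates, scores), key=lambda x: -x[1])[:k_final]

The trade-off discussed above shows up directly in this layout: a larger reranker improves the final ordering but adds serving latency, since it scores every candidate pair.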
Enhancing Q&A Text Retrieval with Ranking Models: Benchmarking, fine-tuning and deploying Rerankers for RAG. arXiv:2409.07691, 2024-09-12.
Sümeyye Öztürk, Ahmed Burak Ercan, Resul Tugay, Şule Gündüz Öğüdücü
In today's world of globalized commerce, cross-market recommendation systems (CMRs) are crucial for providing personalized user experiences across diverse market segments. However, traditional recommendation algorithms struggle with market specificity and data sparsity, especially in new or emerging markets. In this paper, we propose the CrossGR model, which utilizes Graph Isomorphism Networks (GINs) to improve CMR systems. It outperforms existing benchmarks on the NDCG@10 and HR@10 metrics, demonstrating adaptability and accuracy across diverse market segments and making it well-suited to the complexities of cross-market recommendation tasks. Its robustness is demonstrated by consistent performance across different evaluation timeframes, indicating its potential to cater to evolving market trends and user preferences. Our findings suggest that GINs represent a promising direction for CMRs, paving the way for more sophisticated, personalized, and context-aware recommendation systems in the dynamic landscape of global e-commerce.
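For reference, the core GIN update that such a model builds on, h_v <- MLP((1 + eps) * h_v + sum of neighbor embeddings), can be written as a small PyTorch module; the layer sizes and dense adjacency representation are assumptions for illustration, not the CrossGR architecture.

import torch
import torch.nn as nn

class GINLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.eps = nn.Parameter(torch.zeros(1))
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, h, adj):
        # h: (n_nodes, dim) node embeddings; adj: (n_nodes, n_nodes) binary adjacency matrix
        neighbor_sum = adj @ h                              # sum-aggregation over neighbors
        return self.mlp((1 + self.eps) * h + neighbor_sum)

# e.g. users and items as nodes of a cross-market interaction graph:
# h_next = GINLayer(64)(node_features, adjacency)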
Enhancing Cross-Market Recommendation System with Graph Isomorphism Networks: A Novel Approach to Personalized User Experience. arXiv:2409.07850, 2024-09-12.
In hierarchical cognitive radio networks, edge or cloud servers utilize the data collected by edge devices for modulation classification, which, however, faces problems of transmission overhead, data privacy, and computation load. In this article, an edge learning (EL) based framework that jointly mobilizes the edge device and the edge server for intelligent co-inference is proposed to realize collaborative automatic modulation classification (C-AMC) between them. A spectrum semantic compression neural network (SSCNet) with a lightweight structure is designed for the edge device to compress the collected raw data into a compact semantic message that is then sent to the edge server via the wireless channel. On the edge server side, a modulation classification neural network (MCNet) combining bidirectional long short-term memory (Bi-LSTM) and multi-head attention layers is elaborated to determine the modulation type from the noisy semantic message. By leveraging the computation resources of both the edge device and the edge server, high transmission overhead and risks of data privacy leakage are avoided. The simulation results verify the effectiveness of the proposed C-AMC framework, which significantly reduces model size and computational complexity.
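A toy stand-in for the server-side classifier, combining a Bi-LSTM with multi-head self-attention as described, could look like the following; the feature dimension, sequence length, and number of modulation classes are assumptions, not the MCNet configuration.

import torch
import torch.nn as nn

class MCNetSketch(nn.Module):
    def __init__(self, feat_dim=32, hidden=64, n_heads=4, n_classes=11):
        super().__init__()
        self.bilstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(2 * hidden, n_heads, batch_first=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                    # x: (batch, seq_len, feat_dim) noisy semantic message
        h, _ = self.bilstm(x)                # bidirectional temporal features
        a, _ = self.attn(h, h, h)            # multi-head self-attention over time steps
        return self.head(a.mean(dim=1))      # pool and predict the modulation type

logits = MCNetSketch()(torch.randn(8, 16, 32))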
Collaborative Automatic Modulation Classification via Deep Edge Inference for Hierarchical Cognitive Radio Networks. Chaowei He, Peihao Dong, Fuhui Zhou, Qihui Wu. arXiv:2409.07946, 2024-09-12.
Chaoqun Yang, Wei Yuan, Liang Qu, Thanh Tam Nguyen
Federated recommender systems (FedRecs) have emerged as a popular research direction for protecting users' privacy in on-device recommendations. In FedRecs, users keep their data locally and only contribute their local collaborative information by uploading model parameters to a central server. While this rigid framework protects users' raw data during training, it severely compromises the recommendation model's performance due to the following reasons: (1) Due to the power law distribution nature of user behavior data, individual users have few data points to train a recommendation model, resulting in uploaded model updates that may be far from optimal; (2) As each user's uploaded parameters are learned from local data, which lacks global collaborative information, relying solely on parameter aggregation methods such as FedAvg to fuse global collaborative information may be suboptimal. To bridge this performance gap, we propose a novel federated recommendation framework, PDC-FRS. Specifically, we design a privacy-preserving data contribution mechanism that allows users to share their data with a differential privacy guarantee. Based on the shared but perturbed data, an auxiliary model is trained in parallel with the original federated recommendation process. This auxiliary model enhances FedRec by augmenting each user's local dataset and integrating global collaborative information. To demonstrate the effectiveness of PDC-FRS, we conduct extensive experiments on two widely used recommendation datasets. The empirical results showcase the superiority of PDC-FRS compared to baseline methods.
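One simple way to realize a differentially private data contribution, used here purely as an illustration of the idea rather than the mechanism PDC-FRS actually adopts, is randomized response applied to a user's binary interaction vector before it leaves the device; epsilon and the bit-flipping scheme are assumptions.

import numpy as np

def randomized_response(interactions, epsilon, seed=0):
    # Keep each bit with probability p = e^eps / (1 + e^eps) and flip it otherwise,
    # which satisfies eps-local differential privacy per bit.
    rng = np.random.default_rng(seed)
    p_keep = np.exp(epsilon) / (1.0 + np.exp(epsilon))
    flip = rng.random(interactions.shape) > p_keep
    return np.where(flip, 1 - interactions, interactions)

shared = randomized_response(np.array([0, 1, 1, 0, 0, 1]), epsilon=2.0)

The server can then train the auxiliary model on such perturbed vectors and combine it with the usual parameter aggregation of the federated recommender.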
PDC-FRS: Privacy-preserving Data Contribution for Federated Recommender System. arXiv:2409.07773, 2024-09-12.
Savvina Daniil, Manel Slokom, Mirjam Cuper, Cynthia C. S. Liem, Jacco van Ossenbruggen, Laura Hollink
Statements on the propagation of bias by recommender systems are often hard to verify or falsify. Research on bias tends to draw from a small pool of publicly available datasets and is therefore bound by their specific properties. Additionally, implementation choices are often not explicitly described or motivated in research, while they may have an effect on bias propagation. In this paper, we explore the challenges of measuring and reporting popularity bias. We showcase the impact of data properties and algorithm configurations on popularity bias by combining synthetic data with well known recommender systems frameworks that implement UserKNN. First, we identify data characteristics that might impact popularity bias, based on the functionality of UserKNN. Accordingly, we generate various datasets that combine these characteristics. Second, we locate UserKNN configurations that vary across implementations in literature. We evaluate popularity bias for five synthetic datasets and five UserKNN configurations, and offer insights on their joint effect. We find that, depending on the data characteristics, various UserKNN configurations can lead to different conclusions regarding the propagation of popularity bias. These results motivate the need for explicitly addressing algorithmic configuration and data properties when reporting and interpreting bias in recommender systems.
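One of the reporting choices highlighted here is how popularity bias is actually quantified; a common (but by no means the only) metric is the average popularity of recommended items, sketched below with assumed inputs.

import numpy as np

def average_recommendation_popularity(recommendations, train_item_ids):
    # recommendations: dict user -> list of recommended item ids
    # train_item_ids: array of item ids observed in the training interactions
    item_pop = np.bincount(train_item_ids)                      # popularity = interaction count per item
    per_user = [item_pop[items].mean() for items in recommendations.values()]
    return float(np.mean(per_user))                             # compare against item_pop.mean() for bias

Whether this, a Gini coefficient, or a long-tail coverage measure is reported is exactly the kind of configuration choice the study argues should be made explicit.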
On the challenges of studying bias in Recommender Systems: A UserKNN case study. arXiv:2409.08046, 2024-09-12.