Title: Practically Unbiased Pairwise Loss for Recommendation With Implicit Feedback
Authors: Tianwei Cao; Qianqian Xu; Zhiyong Yang; Zhanyu Ma; Qingming Huang
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 4, pp. 2460-2474
Published: 2024-12-19
DOI: 10.1109/TPAMI.2024.3519711
URL: https://ieeexplore.ieee.org/document/10810273/
Citation count: 0
Abstract
Recommender systems have been widely employed on various online platforms to improve user experience. In these systems, recommendation models are often learned from users’ historical behaviors, which are collected automatically. Notably, recommender systems differ from ordinary supervised learning tasks in one important respect: an exposure mechanism decides which items can be presented to each specific user, which breaks the i.i.d. assumption of supervised learning and introduces bias into the recommendation models. In this paper, we focus on unbiased ranking loss weighted by inverse propensity scores (IPS), which is widely used in recommendation with implicit feedback labels. More specifically, we first highlight that there is a gap between theory and practice in IPS-weighted unbiased loss. The existing pairwise loss can be made theoretically unbiased by adopting an IPS weighting scheme. Unfortunately, the propensity scores are hard to estimate because the true exposure status of each user-item pair is inaccessible. In practical scenarios, we can only approximate the propensity scores, so the theoretically unbiased loss is still biased in practice. To solve this problem, we first construct a theoretical framework to obtain a generalization upper bound for the current theoretically unbiased loss. The bound shows that we can ensure the generalization ability of the theoretically unbiased loss if we lower its implementation loss and its practical bias at the same time. To that end, we suggest treating the feedback label $Y_{ui}$ as a noisy proxy for the exposure result $O_{ui}$ for each user-item pair $(u, i)$. Here we assume that the noise rate satisfies $\hat{P}(O_{ui}=1, Y_{ui}\ne O_{ui}) < 1/2$. According to our analysis, this is a mild assumption that many real-world applications satisfy. Based on this, we can train an accurate propensity model directly by leveraging a noise-resistant loss function. We can then construct a practically unbiased recommendation model weighted by precise propensity scores. Lastly, experimental findings on public datasets demonstrate the effectiveness of the suggested method.
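
The abstract does not state the IPS-weighted pairwise loss in closed form. The following is a minimal sketch, assuming a form commonly used in the unbiased pairwise learning literature: each ordered pair $(i, j)$ for a user is weighted by $\frac{Y_{ui}}{\theta_{ui}}\left(1 - \frac{Y_{uj}}{\theta_{uj}}\right)$, where $\theta_{ui}$ is the estimated propensity, and the per-pair loss is the logistic (BPR-style) loss on the score difference. The function names, the `clip` parameter, and the per-user normalization are illustrative assumptions, not the authors' implementation.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def ips_pairwise_loss(scores, feedback, propensity, clip=0.1):
    """IPS-weighted pairwise loss for a single user (illustrative sketch).

    scores:     model scores f(u, i), one per item
    feedback:   implicit feedback labels Y_ui in {0, 1}
    propensity: estimated exposure propensities theta_ui in (0, 1]
    clip:       hypothetical lower bound on propensities to limit variance
    """
    n = len(scores)
    theta = [min(max(p, clip), 1.0) for p in propensity]
    total = 0.0
    for i in range(n):
        w_pos = feedback[i] / theta[i]  # weight for i as the preferred item
        if w_pos == 0.0:
            continue
        for j in range(n):
            if i == j:
                continue
            # Weight for j as the non-preferred item. It is negative when
            # Y_uj = 1 and theta_uj < 1, so individual terms can be negative.
            w_neg = 1.0 - feedback[j] / theta[j]
            # -log(sigmoid(s_i - s_j)) equals softplus(-(s_i - s_j)).
            total += w_pos * w_neg * softplus(-(scores[i] - scores[j]))
    return total / (n * (n - 1))


# Two items: item 0 received feedback, item 1 did not.
scores = [2.0, 0.0]
feedback = [1, 0]
plain = ips_pairwise_loss(scores, feedback, [1.0, 1.0])     # all propensities 1
weighted = ips_pairwise_loss(scores, feedback, [0.5, 1.0])  # item 0 under-exposed
```

With all propensities equal to 1 the sketch reduces to an ordinary pairwise loss over (feedback, no-feedback) pairs. Lowering the propensity of the item that received feedback up-weights its pairs. The gap the abstract describes arises because `propensity` must itself be estimated, so any error in it carries over directly into these weights.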