Efficient Learning to Learn a Robust CTR Model for Web-scale Online Sponsored Search Advertising

Xin Wang, Peng Yang, S. Chen, Lin Liu, Liang Zhao, Jiacheng Guo, Mingming Sun, Ping Li
{"title":"Efficient Learning to Learn a Robust CTR Model for Web-scale Online Sponsored Search Advertising","authors":"Xin Wang, Peng Yang, S. Chen, Lin Liu, Liang Zhao, Jiacheng Guo, Mingming Sun, Ping Li","doi":"10.1145/3459637.3481912","DOIUrl":null,"url":null,"abstract":"Click-through rate (CTR) prediction is crucial for online sponsored search advertising. Several successful CTR models have been adopted in the industry, including the regularized logistic regression (LR). Nonetheless, the learning process suffers from two limitations: 1) Feature crosses for high-order information may generate trillions of features, which are sparse for online learning examples; 2) Rapid changing of data distribution brings challenges to the accurate learning since the model has to perform a fast adaptation on the new data. Moreover, existing adaptive optimizers are ineffective in handling the sparsity issue for high-dimensional features. In this paper, we propose to learn an optimizer in a meta-learning scenario, where the optimizer is learned on prior data and can be easily adapted to the new data. We firstly build a low-dimensional feature embedding on prior data to encode the association among features. Then, the gradients on new data can be decomposed into the low-dimensional space, enabling the parameter update smoothed and relieving the sparsity. Note that this technology could be deployed into a distributed system to ensure efficient online learning on the trillions-level parameters. We conduct extensive experiments to evaluate the algorithm in terms of prediction accuracy and actual revenue. Experimental results demonstrate that the proposed framework achieves a promising prediction on the new data. The final online revenue is noticeably improved compared to the baseline. This framework was initially deployed in Baidu Search Ads (a.k.a. Phoenix Nest) in 2014 and is currently still being used in certain modules of Baidu's ads systems.","PeriodicalId":405296,"journal":{"name":"Proceedings of the 30th ACM International Conference on Information & Knowledge Management","volume":"64 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 30th ACM International Conference on Information & Knowledge Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3459637.3481912","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

Click-through rate (CTR) prediction is crucial for online sponsored search advertising. Several successful CTR models have been adopted in industry, including regularized logistic regression (LR). Nonetheless, the learning process suffers from two limitations: 1) feature crosses that capture high-order information may generate trillions of features, which appear only sparsely in online learning examples; 2) the rapidly changing data distribution challenges accurate learning, since the model must adapt quickly to new data. Moreover, existing adaptive optimizers are ineffective at handling the sparsity of high-dimensional features. In this paper, we propose to learn an optimizer in a meta-learning scenario, where the optimizer is learned on prior data and can be easily adapted to new data. We first build a low-dimensional feature embedding on prior data to encode associations among features. The gradients on new data can then be decomposed into this low-dimensional space, which smooths the parameter updates and relieves the sparsity. This technique can be deployed in a distributed system to ensure efficient online learning over trillions of parameters. We conduct extensive experiments to evaluate the algorithm in terms of prediction accuracy and actual revenue. Experimental results demonstrate that the proposed framework achieves promising predictions on new data, and the final online revenue is noticeably improved over the baseline. The framework was initially deployed in Baidu Search Ads (a.k.a. Phoenix Nest) in 2014 and is still being used in certain modules of Baidu's ads systems.
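The abstract only sketches the gradient-decomposition idea at a high level. The snippet below is a minimal, illustrative reading of that step: a feature embedding learned on prior data projects a sparse high-dimensional gradient into a low-dimensional space and back, so the update also reaches features associated with the observed ones. All names (smoothed_update, E, lr) and the dense NumPy implementation are assumptions for illustration; the paper's actual system is distributed and operates on trillions of sparse parameters.

```python
import numpy as np

def smoothed_update(w, sparse_grad, E, lr=0.01):
    """One hypothetical parameter update via low-dimensional gradient decomposition.

    w           : (d,) current model weights (e.g., LR weights)
    sparse_grad : (d,) gradient on new data, mostly zeros (sparse)
    E           : (d, k) feature embedding learned on prior data, k << d
    """
    # Decompose the sparse gradient into the k-dimensional embedding space.
    low_dim_grad = E.T @ sparse_grad      # shape (k,)
    # Map it back to feature space: features correlated with the active ones
    # also receive a (smoothed) signal, relieving the sparsity of the update.
    dense_grad = E @ low_dim_grad         # shape (d,)
    return w - lr * dense_grad

# Toy usage: d = 10 features, k = 3 dimensional embedding.
rng = np.random.default_rng(0)
d, k = 10, 3
E = rng.standard_normal((d, k)) / np.sqrt(k)   # stand-in for a learned embedding
w = np.zeros(d)
g = np.zeros(d)
g[[1, 7]] = [0.5, -0.2]                        # only two features observed online
w = smoothed_update(w, g, E)
print(w.round(3))
```

Note that in this sketch the update touches all coordinates of w even though only two features fired, which is the smoothing effect the abstract describes; how the embedding E is learned on prior data and distributed across workers is outside the scope of this toy example.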