Dual Attentional Higher Order Factorization Machines

Arindam Sarkar, Dipankar Das, Vivek Sembium, Prakash Mandayam Comar
{"title":"双注意力高阶分解机","authors":"Arindam Sarkar, Dipankar Das, Vivek Sembium, Prakash Mandayam Comar","doi":"10.1145/3523227.3546789","DOIUrl":null,"url":null,"abstract":"Numerous problems of practical significance such as clickthrough rate (CTR) prediction, forecasting, tagging and so on, involve complex interaction of various user, item and context features. Manual feature engineering has been used in the past to model these combinatorial features but it requires domain expertise and becomes prohibitively expensive as the number of features increases. Feedforward neural networks alleviate the need for manual feature engineering to a large extent and have shown impressive performance across multiple domains due to their ability to learn arbitrary functions. Despite multiple layers of non-linear projections, neural networks are limited in their ability to efficiently model functions with higher order interaction terms. In recent years, Factorization Machines and its variants have been proposed to explicitly capture higher order combinatorial interactions. However not all feature interactions are equally important, and in sparse data settings, without a suitable suppression mechanism, this might result into noisy terms during inference and hurt model generalization. In this work we present Dual Attentional Higher Order Factorization Machine (DA-HoFM), a unified attentional higher order factorization machine which leverages a compositional architecture to compute higher order terms with complexity linear in terms of maximum interaction degree. Equipped with sparse dual attention mechanism, DA-HoFM summarizes interaction terms at each layer, and is able to efficiently select important higher order terms. We empirically demonstrate effectiveness of our proposed models on the task of CTR prediction, where our model exhibits superior performance compared to the recent state-of-the-art models, outperforming them by up to 6.7% on the logloss metric.","PeriodicalId":443279,"journal":{"name":"Proceedings of the 16th ACM Conference on Recommender Systems","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2022-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Dual Attentional Higher Order Factorization Machines\",\"authors\":\"Arindam Sarkar, Dipankar Das, Vivek Sembium, Prakash Mandayam Comar\",\"doi\":\"10.1145/3523227.3546789\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Numerous problems of practical significance such as clickthrough rate (CTR) prediction, forecasting, tagging and so on, involve complex interaction of various user, item and context features. Manual feature engineering has been used in the past to model these combinatorial features but it requires domain expertise and becomes prohibitively expensive as the number of features increases. Feedforward neural networks alleviate the need for manual feature engineering to a large extent and have shown impressive performance across multiple domains due to their ability to learn arbitrary functions. Despite multiple layers of non-linear projections, neural networks are limited in their ability to efficiently model functions with higher order interaction terms. In recent years, Factorization Machines and its variants have been proposed to explicitly capture higher order combinatorial interactions. 
However not all feature interactions are equally important, and in sparse data settings, without a suitable suppression mechanism, this might result into noisy terms during inference and hurt model generalization. In this work we present Dual Attentional Higher Order Factorization Machine (DA-HoFM), a unified attentional higher order factorization machine which leverages a compositional architecture to compute higher order terms with complexity linear in terms of maximum interaction degree. Equipped with sparse dual attention mechanism, DA-HoFM summarizes interaction terms at each layer, and is able to efficiently select important higher order terms. We empirically demonstrate effectiveness of our proposed models on the task of CTR prediction, where our model exhibits superior performance compared to the recent state-of-the-art models, outperforming them by up to 6.7% on the logloss metric.\",\"PeriodicalId\":443279,\"journal\":{\"name\":\"Proceedings of the 16th ACM Conference on Recommender Systems\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-09-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 16th ACM Conference on Recommender Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3523227.3546789\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 16th ACM Conference on Recommender Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3523227.3546789","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1

Abstract

Numerous problems of practical significance, such as clickthrough rate (CTR) prediction, forecasting, and tagging, involve complex interactions of various user, item, and context features. Manual feature engineering has been used in the past to model these combinatorial features, but it requires domain expertise and becomes prohibitively expensive as the number of features increases. Feedforward neural networks alleviate the need for manual feature engineering to a large extent and have shown impressive performance across multiple domains due to their ability to learn arbitrary functions. Despite multiple layers of non-linear projections, neural networks are limited in their ability to efficiently model functions with higher order interaction terms. In recent years, Factorization Machines and their variants have been proposed to explicitly capture higher order combinatorial interactions. However, not all feature interactions are equally important, and in sparse data settings, without a suitable suppression mechanism, this can produce noisy terms during inference and hurt model generalization. In this work we present the Dual Attentional Higher Order Factorization Machine (DA-HoFM), a unified attentional higher order factorization machine which leverages a compositional architecture to compute higher order terms with complexity linear in the maximum interaction degree. Equipped with a sparse dual attention mechanism, DA-HoFM summarizes interaction terms at each layer and is able to efficiently select important higher order terms. We empirically demonstrate the effectiveness of the proposed models on the task of CTR prediction, where our model exhibits superior performance compared to recent state-of-the-art models, outperforming them by up to 6.7% on the logloss metric.
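The abstract describes two ideas: a compositional stack whose cost grows linearly with the maximum interaction degree, and an attention mechanism that summarizes the interaction terms of each layer. The sketch below illustrates that general pattern only; it is not the authors' implementation. The class name, the single-head softmax attention, the element-wise composition, and all sizes are assumptions, and the paper's sparse dual attention mechanism differs from this simplified pooling.

```python
# Illustrative sketch (not the DA-HoFM reference code): each layer raises the
# interaction order by one via an element-wise product with the order-1
# embeddings, and a per-layer softmax attention pools the terms into a summary.
# Total cost is linear in the maximum interaction degree.
import torch
import torch.nn as nn


class CompositionalHOFMSketch(nn.Module):
    def __init__(self, num_features: int, embed_dim: int, max_degree: int):
        super().__init__()
        self.embedding = nn.Embedding(num_features, embed_dim)
        self.max_degree = max_degree
        # One attention scorer per layer to weight that layer's interaction terms.
        self.attn = nn.ModuleList(
            [nn.Linear(embed_dim, 1) for _ in range(max_degree)]
        )
        self.head = nn.Linear(max_degree * embed_dim, 1)

    def forward(self, feature_ids: torch.Tensor) -> torch.Tensor:
        # feature_ids: (batch, num_fields) integer ids of the active features.
        e = self.embedding(feature_ids)            # (B, F, D) order-1 terms
        terms, summaries = e, []
        for layer in range(self.max_degree):
            scores = torch.softmax(self.attn[layer](terms), dim=1)  # (B, F, 1)
            summaries.append((scores * terms).sum(dim=1))           # (B, D)
            terms = terms * e                      # raise interaction order by 1
        logit = self.head(torch.cat(summaries, dim=-1))
        return torch.sigmoid(logit).squeeze(-1)    # predicted CTR probability


# Example usage with hypothetical sizes: 4 samples, 10 active fields each.
model = CompositionalHOFMSketch(num_features=1000, embed_dim=16, max_degree=3)
ctr = model(torch.randint(0, 1000, (4, 10)))
print(ctr.shape)  # torch.Size([4])
```

In this toy version, layer 0 summarizes first-order terms, layer 1 second-order terms, and so on up to the chosen maximum degree, so only one pass per degree is needed rather than enumerating all combinatorial terms explicitly.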
Latest articles in this journal
Heterogeneous Graph Representation Learning for multi-target Cross-Domain Recommendation
Imbalanced Data Sparsity as a Source of Unfair Bias in Collaborative Filtering
Position Awareness Modeling with Knowledge Distillation for CTR Prediction
Multi-Modal Dialog State Tracking for Interactive Fashion Recommendation
Denoising Self-Attentive Sequential Recommendation