DIRECT: Dual Interpretable Recommendation with Multi-aspect Word Attribution
Xuansheng Wu, Hanqin Wan, Qiaoyu Tan, Wenlin Yao, Ninghao Liu
ACM Transactions on Intelligent Systems and Technology, published 2024-05-06
DOI: 10.1145/3663483 (https://doi.org/10.1145/3663483)
Abstract
Recommending products with intuitive explanations improves a system's transparency, persuasiveness, and user satisfaction. Existing interpretation techniques fall into two categories: post-hoc methods and interpretable modeling. Post-hoc methods can quantify how much each input contributes to a model prediction but offer limited interpretation faithfulness, while interpretable models can explain internal mechanisms but may not directly attribute predictions to input features. In this study, we propose a novel Dual Interpretable Recommendation model called DIRECT, which integrates ideas from both categories to inherit their advantages and avoid their limitations. Specifically, DIRECT uses item descriptions as explainable evidence for recommendation. First, similar to post-hoc interpretation, DIRECT can attribute a predicted user preference score to the words of the item descriptions. The attribution of each word depends on its sentiment polarity and its importance, where a word is important if it corresponds to an item aspect that the user is interested in. Second, to improve the interpretability of the embedding space, we extract high-level concepts from embeddings, where each concept corresponds to an item aspect. To learn discriminative concepts, we employ a concept-bottleneck layer and maximize the coding rate reduction on word-aspect embeddings by leveraging a word-word affinity graph extracted from a pre-trained language model. In this way, DIRECT simultaneously achieves faithful attribution and usable interpretation of the embedding space. We also show that DIRECT's inference time is linear in the length of item reviews. We conduct experiments, including ablation studies, on five real-world datasets. Quantitative analysis, visualizations, and case studies verify the interpretability of DIRECT. Our code is available at: https://github.com/JacksonWuxs/DIRECT.
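To make the "coding rate reduction" objective mentioned in the abstract more concrete, the following is a minimal sketch of the quantity as it is commonly defined in the maximal coding rate reduction (MCR²) literature: the coding rate of all embeddings minus the size-weighted coding rates of each group. It is not the DIRECT implementation. The abstract does not specify how the word-word affinity graph enters the objective, so this sketch assumes hard aspect labels instead, and the function names, the toy vectors, and the `eps` value are all hypothetical.

```python
import math


def gram(rows, dim):
    """Sum of outer products x x^T over the given vectors (a dim x dim matrix)."""
    g = [[0.0] * dim for _ in range(dim)]
    for x in rows:
        for i in range(dim):
            for j in range(dim):
                g[i][j] += x[i] * x[j]
    return g


def logdet_spd(a):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(s)
                total += 2.0 * math.log(low[i][j])
            else:
                low[i][j] = s / low[j][j]
    return total


def coding_rate(rows, dim, eps=0.5):
    """R(Z, eps) = 1/2 * logdet(I + d / (n * eps^2) * Z Z^T)."""
    scale = dim / (len(rows) * eps ** 2)
    g = gram(rows, dim)
    m = [[(1.0 if i == j else 0.0) + scale * g[i][j] for j in range(dim)]
         for i in range(dim)]
    return 0.5 * logdet_spd(m)


def rate_reduction(rows, labels, eps=0.5):
    """Coding rate of the whole set minus the size-weighted rate of each group."""
    dim, n = len(rows[0]), len(rows)
    groups = {}
    for x, y in zip(rows, labels):
        groups.setdefault(y, []).append(x)  # hard labels: an assumption here
    within = sum(len(g) / n * coding_rate(g, dim, eps) for g in groups.values())
    return coding_rate(rows, dim, eps) - within


# Toy word embeddings: two words per direction (hypothetical data).
words = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
separated = rate_reduction(words, [0, 0, 1, 1])  # aspects match the directions
mixed = rate_reduction(words, [0, 1, 0, 1])      # aspects cut across directions
print(separated > mixed)
```

On this toy input, a grouping that matches the geometry of the embeddings scores higher than one that mixes directions within each group, which is the sense in which maximizing the quantity encourages discriminative concepts. In practice a numerical library routine such as `numpy.linalg.slogdet` would normally replace the hand-written Cholesky step; the standard-library version is used here only to keep the sketch dependency-free.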
About the journal:
ACM Transactions on Intelligent Systems and Technology is a scholarly journal that publishes the highest quality papers on intelligent systems, applicable algorithms and technology with a multi-disciplinary perspective. An intelligent system is one that uses artificial intelligence (AI) techniques to offer important services (e.g., as a component of a larger system) to allow integrated systems to perceive, reason, learn, and act intelligently in the real world.
ACM TIST is published bimonthly (six issues a year). Each issue has 8-11 regular papers, with around 20 published journal pages or 10,000 words per paper. Additional references, proofs, graphs, or detailed experimental results can be submitted as a separate appendix, while excessively lengthy papers will be rejected automatically. Authors can include online-only appendices for additional content of their published papers and are encouraged to share their code and/or data with other readers.