利用大型语言模型在产品知识图谱中为电子商务进行关系标注

IF 3.1 3区 计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE International Journal of Machine Learning and Cybernetics Pub Date : 2024-08-15 DOI:10.1007/s13042-024-02274-5
Jiao Chen, Luyi Ma, Xiaohan Li, Jianpeng Xu, Jason H. D. Cho, Kaushiki Nag, Evren Korpeoglu, Sushant Kumar, Kannan Achan
{"title":"利用大型语言模型在产品知识图谱中为电子商务进行关系标注","authors":"Jiao Chen, Luyi Ma, Xiaohan Li, Jianpeng Xu, Jason H. D. Cho, Kaushiki Nag, Evren Korpeoglu, Sushant Kumar, Kannan Achan","doi":"10.1007/s13042-024-02274-5","DOIUrl":null,"url":null,"abstract":"<p>Product Knowledge Graphs (PKGs) play a crucial role in enhancing e-commerce system performance by providing structured information about entities and their relationships, such as complementary or substitutable relations between products or product types, which can be utilized in recommender systems. However, relation labeling in PKGs remains a challenging task due to the dynamic nature of e-commerce domains and the associated cost of human labor. Recently, breakthroughs in Large Language Models (LLMs) have shown surprising results in numerous natural language processing tasks, especially in the in-context learning (ICL). In this paper, we conduct an empirical study of LLMs for relation labeling in e-commerce PKGs, investigating their powerful learning capabilities in natural language and effectiveness in predicting relations between product types with few-shot in-context learning. We evaluate the performance of various LLMs, including PaLM-2, GPT-3.5, and Llama-2, on benchmark datasets for e-commerce relation labeling tasks. We use different prompt engineering techniques to examine their impact on model performance. Our results show that LLMs can achieve competitive performance compared to human labelers using just 1–5 labeled examples per relation. We also illustrate the bias issues in LLMs towards minority ethnic groups. Additionally, we show that LLMs significantly outperform existing KG completion models or classification methods in relation labeling for e-commerce KGs and exhibit performance strong enough to replace human labeling. Beyond empirical investigations, we also carry out a theoretical analysis to explain the superior capability of LLMs in few-shot ICL by comparing it with kernel regression.</p>","PeriodicalId":51327,"journal":{"name":"International Journal of Machine Learning and Cybernetics","volume":null,"pages":null},"PeriodicalIF":3.1000,"publicationDate":"2024-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Relation labeling in product knowledge graphs with large language models for e-commerce\",\"authors\":\"Jiao Chen, Luyi Ma, Xiaohan Li, Jianpeng Xu, Jason H. D. Cho, Kaushiki Nag, Evren Korpeoglu, Sushant Kumar, Kannan Achan\",\"doi\":\"10.1007/s13042-024-02274-5\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Product Knowledge Graphs (PKGs) play a crucial role in enhancing e-commerce system performance by providing structured information about entities and their relationships, such as complementary or substitutable relations between products or product types, which can be utilized in recommender systems. However, relation labeling in PKGs remains a challenging task due to the dynamic nature of e-commerce domains and the associated cost of human labor. Recently, breakthroughs in Large Language Models (LLMs) have shown surprising results in numerous natural language processing tasks, especially in the in-context learning (ICL). In this paper, we conduct an empirical study of LLMs for relation labeling in e-commerce PKGs, investigating their powerful learning capabilities in natural language and effectiveness in predicting relations between product types with few-shot in-context learning. We evaluate the performance of various LLMs, including PaLM-2, GPT-3.5, and Llama-2, on benchmark datasets for e-commerce relation labeling tasks. We use different prompt engineering techniques to examine their impact on model performance. Our results show that LLMs can achieve competitive performance compared to human labelers using just 1–5 labeled examples per relation. We also illustrate the bias issues in LLMs towards minority ethnic groups. Additionally, we show that LLMs significantly outperform existing KG completion models or classification methods in relation labeling for e-commerce KGs and exhibit performance strong enough to replace human labeling. Beyond empirical investigations, we also carry out a theoretical analysis to explain the superior capability of LLMs in few-shot ICL by comparing it with kernel regression.</p>\",\"PeriodicalId\":51327,\"journal\":{\"name\":\"International Journal of Machine Learning and Cybernetics\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":3.1000,\"publicationDate\":\"2024-08-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Machine Learning and Cybernetics\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s13042-024-02274-5\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Machine Learning and Cybernetics","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s13042-024-02274-5","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

摘要

产品知识图谱(PKG)通过提供有关实体及其关系的结构化信息,如产品或产品类型之间的互补或可替代关系,在提高电子商务系统性能方面发挥着至关重要的作用。然而,由于电子商务领域的动态性质和相关的人力成本,在 PKG 中进行关系标注仍然是一项具有挑战性的任务。最近,大语言模型(LLM)在许多自然语言处理任务中取得了突破性进展,尤其是在上下文学习(ICL)方面,显示出令人惊喜的成果。在本文中,我们对 LLMs 在电子商务 PKG 中的关系标注进行了实证研究,调查了它们在自然语言中的强大学习能力,以及通过少量上下文学习预测产品类型之间关系的有效性。我们在电子商务关系标注任务的基准数据集上评估了各种 LLM 的性能,包括 PaLM-2、GPT-3.5 和 Llama-2。我们使用了不同的提示工程技术来检验它们对模型性能的影响。我们的结果表明,与人类标注者相比,LLM 只需使用每个关系的 1-5 个标注示例就能获得具有竞争力的性能。我们还说明了 LLM 对少数民族群体的偏见问题。此外,我们还表明,在电子商务 KG 的关系标注方面,LLM 明显优于现有的 KG 完成模型或分类方法,其表现足以取代人工标注。除了实证研究之外,我们还进行了理论分析,通过与核回归进行比较,解释了 LLMs 在少数族群 ICL 中的卓越能力。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

摘要图片

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Relation labeling in product knowledge graphs with large language models for e-commerce

Product Knowledge Graphs (PKGs) play a crucial role in enhancing e-commerce system performance by providing structured information about entities and their relationships, such as complementary or substitutable relations between products or product types, which can be utilized in recommender systems. However, relation labeling in PKGs remains a challenging task due to the dynamic nature of e-commerce domains and the associated cost of human labor. Recently, breakthroughs in Large Language Models (LLMs) have shown surprising results in numerous natural language processing tasks, especially in the in-context learning (ICL). In this paper, we conduct an empirical study of LLMs for relation labeling in e-commerce PKGs, investigating their powerful learning capabilities in natural language and effectiveness in predicting relations between product types with few-shot in-context learning. We evaluate the performance of various LLMs, including PaLM-2, GPT-3.5, and Llama-2, on benchmark datasets for e-commerce relation labeling tasks. We use different prompt engineering techniques to examine their impact on model performance. Our results show that LLMs can achieve competitive performance compared to human labelers using just 1–5 labeled examples per relation. We also illustrate the bias issues in LLMs towards minority ethnic groups. Additionally, we show that LLMs significantly outperform existing KG completion models or classification methods in relation labeling for e-commerce KGs and exhibit performance strong enough to replace human labeling. Beyond empirical investigations, we also carry out a theoretical analysis to explain the superior capability of LLMs in few-shot ICL by comparing it with kernel regression.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
International Journal of Machine Learning and Cybernetics
International Journal of Machine Learning and Cybernetics COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE-
CiteScore
7.90
自引率
10.70%
发文量
225
期刊介绍: Cybernetics is concerned with describing complex interactions and interrelationships between systems which are omnipresent in our daily life. Machine Learning discovers fundamental functional relationships between variables and ensembles of variables in systems. The merging of the disciplines of Machine Learning and Cybernetics is aimed at the discovery of various forms of interaction between systems through diverse mechanisms of learning from data. The International Journal of Machine Learning and Cybernetics (IJMLC) focuses on the key research problems emerging at the junction of machine learning and cybernetics and serves as a broad forum for rapid dissemination of the latest advancements in the area. The emphasis of IJMLC is on the hybrid development of machine learning and cybernetics schemes inspired by different contributing disciplines such as engineering, mathematics, cognitive sciences, and applications. New ideas, design alternatives, implementations and case studies pertaining to all the aspects of machine learning and cybernetics fall within the scope of the IJMLC. Key research areas to be covered by the journal include: Machine Learning for modeling interactions between systems Pattern Recognition technology to support discovery of system-environment interaction Control of system-environment interactions Biochemical interaction in biological and biologically-inspired systems Learning for improvement of communication schemes between systems
期刊最新文献
LSSMSD: defending against black-box DNN model stealing based on localized stochastic sensitivity CHNSCDA: circRNA-disease association prediction based on strongly correlated heterogeneous neighbor sampling Contextual feature fusion and refinement network for camouflaged object detection Scnet: shape-aware convolution with KFNN for point clouds completion Self-refined variational transformer for image-conditioned layout generation
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1