Relation labeling in product knowledge graphs with large language models for e-commerce

IF 3.1 | Zone 3 (Computer Science) | JCR Q2 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE) | International Journal of Machine Learning and Cybernetics | Pub Date: 2024-08-15 | DOI: 10.1007/s13042-024-02274-5

Jiao Chen, Luyi Ma, Xiaohan Li, Jianpeng Xu, Jason H. D. Cho, Kaushiki Nag, Evren Korpeoglu, Sushant Kumar, Kannan Achan


Citations: 0



Relation labeling in product knowledge graphs with large language models for e-commerce

Product Knowledge Graphs (PKGs) play a crucial role in enhancing e-commerce system performance by providing structured information about entities and their relationships, such as complementary or substitutable relations between products or product types, which can be utilized in recommender systems. However, relation labeling in PKGs remains a challenging task due to the dynamic nature of e-commerce domains and the associated cost of human labor. Recently, breakthroughs in Large Language Models (LLMs) have shown surprising results in numerous natural language processing tasks, especially in in-context learning (ICL). In this paper, we conduct an empirical study of LLMs for relation labeling in e-commerce PKGs, investigating their learning capabilities in natural language and their effectiveness in predicting relations between product types with few-shot in-context learning. We evaluate the performance of various LLMs, including PaLM-2, GPT-3.5, and Llama-2, on benchmark datasets for e-commerce relation labeling tasks. We use different prompt engineering techniques to examine their impact on model performance. Our results show that LLMs can achieve performance competitive with human labelers using just 1–5 labeled examples per relation. We also illustrate the bias issues in LLMs towards minority ethnic groups. Additionally, we show that LLMs significantly outperform existing KG completion models and classification methods in relation labeling for e-commerce KGs and exhibit performance strong enough to replace human labeling. Beyond empirical investigations, we also carry out a theoretical analysis to explain the superior capability of LLMs in few-shot ICL by comparing it with kernel regression.
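As a rough illustration of what few-shot in-context relation labeling between product types can look like, the sketch below assembles a prompt from a handful of labeled examples and parses a model completion back into a label. It is not the authors' prompt or pipeline: the `Example` type, the `build_prompt` and `parse_label` helpers, the prompt wording, the `unrelated` label, and the sample product types are all assumptions made for illustration, and no LLM is actually called.

```python
from dataclasses import dataclass

# Hypothetical label set. The abstract names complementary and substitutable
# relations; "unrelated" is an added assumption for this sketch.
RELATIONS = ("complementary", "substitutable", "unrelated")


@dataclass(frozen=True)
class Example:
    """One labeled demonstration: a pair of product types and their relation."""
    head: str
    tail: str
    relation: str


def build_prompt(examples, head, tail):
    """Build a few-shot prompt: instructions, demonstrations, then the query."""
    lines = [
        "Label the relation between two product types.",
        "Answer with one of: " + ", ".join(RELATIONS) + ".",
        "",
    ]
    for ex in examples:
        lines.append(f"Product type A: {ex.head}")
        lines.append(f"Product type B: {ex.tail}")
        lines.append(f"Relation: {ex.relation}")
        lines.append("")
    # The query pair ends with an open "Relation:" slot for the model to fill.
    lines.append(f"Product type A: {head}")
    lines.append(f"Product type B: {tail}")
    lines.append("Relation:")
    return "\n".join(lines)


def parse_label(completion):
    """Map a raw completion to a known relation, or None if it matches none."""
    text = completion.strip().lower()
    for relation in RELATIONS:
        if text.startswith(relation):
            return relation
    return None


# Made-up demonstrations, one per relation (the abstract reports 1-5 per relation).
demos = [
    Example("tent", "sleeping bag", "complementary"),
    Example("butter", "margarine", "substitutable"),
]
prompt = build_prompt(demos, "printer", "ink cartridge")
```

The resulting `prompt` string would be sent to whichever model is being evaluated, and `parse_label` applied to its reply. Returning `None` for unrecognized replies keeps malformed completions from being silently counted as a valid label.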


Source journal

International Journal of Machine Learning and Cybernetics (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

CiteScore: 7.90

Self-citation rate: 10.70%

Articles published: 225

About the journal: Cybernetics is concerned with describing complex interactions and interrelationships between systems which are omnipresent in our daily life. Machine Learning discovers fundamental functional relationships between variables and ensembles of variables in systems. The merging of the disciplines of Machine Learning and Cybernetics is aimed at the discovery of various forms of interaction between systems through diverse mechanisms of learning from data. The International Journal of Machine Learning and Cybernetics (IJMLC) focuses on the key research problems emerging at the junction of machine learning and cybernetics and serves as a broad forum for rapid dissemination of the latest advancements in the area. The emphasis of IJMLC is on the hybrid development of machine learning and cybernetics schemes inspired by different contributing disciplines such as engineering, mathematics, cognitive sciences, and applications. New ideas, design alternatives, implementations and case studies pertaining to all the aspects of machine learning and cybernetics fall within the scope of the IJMLC. Key research areas to be covered by the journal include: machine learning for modeling interactions between systems; pattern recognition technology to support discovery of system-environment interaction; control of system-environment interactions; biochemical interaction in biological and biologically-inspired systems; and learning for improvement of communication schemes between systems.