Text semantic matching algorithm based on the introduction of external knowledge under contrastive learning

IF 3.1 · CAS Tier 3 (Computer Science) · JCR Q2 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)
International Journal of Machine Learning and Cybernetics · Pub Date: 2024-07-24 · DOI: 10.1007/s13042-024-02285-2
Jie Hu, Yinglian Zhu, Lishan Wu, Qilei Luo, Fei Teng, Tianrui Li
Citations: 0

Abstract

Measuring the semantic similarity between two texts is a fundamental aspect of text semantic matching. Each word in the texts holds a weighted meaning, and it is essential for the model to effectively capture the most crucial knowledge. However, current text matching methods based on BERT have limitations in acquiring professional domain knowledge. BERT requires extensive domain-specific training data to perform well in specialized fields such as medicine, where obtaining labeled data is challenging. In addition, current text matching models that inject domain knowledge often rely on creating new training tasks to fine-tune the model, which is time-consuming. Although existing works have directly injected domain knowledge into BERT through similarity matrices, they struggle to handle the challenge of small sample sizes in professional fields. Contrastive learning trains a representation learning model by generating instances that exhibit either similarity or dissimilarity, so that a more general representation can be learned with a small number of samples. In this paper, we propose to directly integrate the word similarity matrix into BERT’s multi-head attention mechanism under a contrastive learning framework to align similar words during training. Furthermore, in the context of Chinese medical applications, we propose an entity MASK approach to enhance the understanding of medical terms by pre-trained models. The proposed method helps BERT acquire domain knowledge to better learn text representations in professional fields. Extensive experimental results have shown that the algorithm significantly improves the performance of the text matching model, especially when training data is limited.
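The central proposal — integrating a word-similarity matrix into the attention computation so that known-similar words attend to each other more strongly — can be sketched in a single-head form. This is a minimal illustrative sketch in NumPy, not the paper's exact formulation: the additive bias `alpha * sim_matrix` on the attention logits, the weight `alpha`, and the toy similarity matrix are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def similarity_biased_attention(Q, K, V, sim_matrix, alpha=1.0):
    """Single-head scaled dot-product attention whose logits are biased by an
    external word-similarity matrix (additive knowledge injection; assumed form)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)       # (L, L) raw attention logits
    scores = scores + alpha * sim_matrix  # pull attention toward known-similar word pairs
    weights = softmax(scores, axis=-1)
    return weights @ V, weights

# Toy demo: 3 tokens, 4-dim heads; words 0 and 2 are hypothetical domain synonyms.
rng = np.random.default_rng(0)
L, d = 3, 4
Q, K, V = rng.normal(size=(3, L, d))
S = np.zeros((L, L))
S[0, 2] = S[2, 0] = 5.0
out, w = similarity_biased_attention(Q, K, V, S)
```

With the bias in place, the attention weight between the two "similar" tokens rises relative to the unbiased computation, while each row of weights still sums to one.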

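The contrastive-learning component the abstract describes — learning general representations from few samples by contrasting similar and dissimilar instances — is commonly realized as an in-batch InfoNCE objective (SimCSE-style). The sketch below is an assumed, generic formulation in NumPy, not the paper's specific loss; the temperature value and the way positives are generated are illustrative.

```python
import numpy as np

def info_nce_loss(z1, z2, tau=0.05):
    """In-batch contrastive loss: (z1[i], z2[i]) are two views of sentence i
    (positives); every other pairing in the batch serves as a negative."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)   # unit-normalize embeddings
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / tau                              # (B, B) scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -float(np.mean(np.diag(log_prob)))             # cross-entropy on the diagonal

# Demo: near-identical views yield a low loss; unrelated embeddings a higher one.
rng = np.random.default_rng(1)
z = rng.normal(size=(4, 8))
aligned = info_nce_loss(z, z + 0.01 * rng.normal(size=(4, 8)))
shuffled = info_nce_loss(z, rng.normal(size=(4, 8)))
```

Minimizing this loss pulls the two views of each text together and pushes apart unrelated texts, which is what lets a usable representation emerge from small labeled sets.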
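The entity MASK idea for Chinese medical text — masking a whole medical entity rather than independent subword tokens, so the pre-trained model must recover the full term — might look like the following. The function name, the greedy longest-match against an entity list, and the masking probability are hypothetical details for illustration only.

```python
import random

MASK = "[MASK]"

def entity_mask(tokens, entities, p=0.5, rng=None):
    """Whole-entity masking: when a known entity span is encountered, mask
    every token in the span (assumed sketch of the entity MASK approach)."""
    rng = rng or random.Random(0)
    entities = sorted(entities, key=len, reverse=True)  # prefer the longest match
    out = list(tokens)
    i = 0
    while i < len(tokens):
        span = next((len(e) for e in entities if tokens[i:i + len(e)] == e), 0)
        if span and rng.random() < p:
            out[i:i + span] = [MASK] * span             # mask the entire entity
            i += span
        else:
            i += 1
    return out

# Demo with a hypothetical two-token medical entity, masked deterministically (p=1.0).
masked = entity_mask(
    ["patient", "has", "myocardial", "infarction", "today"],
    [["myocardial", "infarction"]],
    p=1.0,
)
```

Compared with token-level random masking, the model can no longer guess half of a term from the other half, which forces it to learn the medical entity as a unit.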
International Journal of Machine Learning and Cybernetics (Computer Science, Artificial Intelligence)
CiteScore: 7.90
Self-citation rate: 10.70%
Articles per year: 225
Journal description: Cybernetics is concerned with describing complex interactions and interrelationships between systems, which are omnipresent in our daily life. Machine learning discovers fundamental functional relationships between variables and ensembles of variables in systems. The merging of the disciplines of machine learning and cybernetics is aimed at the discovery of various forms of interaction between systems through diverse mechanisms of learning from data. The International Journal of Machine Learning and Cybernetics (IJMLC) focuses on the key research problems emerging at the junction of machine learning and cybernetics and serves as a broad forum for rapid dissemination of the latest advancements in the area. The emphasis of IJMLC is on the hybrid development of machine learning and cybernetics schemes inspired by different contributing disciplines such as engineering, mathematics, cognitive sciences, and applications. New ideas, design alternatives, implementations, and case studies pertaining to all aspects of machine learning and cybernetics fall within the scope of IJMLC. Key research areas covered by the journal include:
- Machine learning for modeling interactions between systems
- Pattern recognition technology to support discovery of system-environment interaction
- Control of system-environment interactions
- Biochemical interaction in biological and biologically-inspired systems
- Learning for improvement of communication schemes between systems
Latest articles from this journal:
- LSSMSD: defending against black-box DNN model stealing based on localized stochastic sensitivity
- CHNSCDA: circRNA-disease association prediction based on strongly correlated heterogeneous neighbor sampling
- Contextual feature fusion and refinement network for camouflaged object detection
- Scnet: shape-aware convolution with KFNN for point clouds completion
- Self-refined variational transformer for image-conditioned layout generation