Text semantic matching algorithm based on the introduction of external knowledge under contrastive learning

IF 3.1 · CAS Tier 3 (Computer Science) · JCR Q2 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)
International Journal of Machine Learning and Cybernetics · Pub Date: 2024-07-24 · DOI: 10.1007/s13042-024-02285-2
Jie Hu, Yinglian Zhu, Lishan Wu, Qilei Luo, Fei Teng, Tianrui Li
Citations: 0

Abstract

Measuring the semantic similarity between two texts is a fundamental aspect of text semantic matching. Each word in the texts holds a weighted meaning, and it is essential for the model to effectively capture the most crucial knowledge. However, current text matching methods based on BERT have limitations in acquiring professional domain knowledge. BERT requires extensive domain-specific training data to perform well in specialized fields such as medicine, where obtaining labeled data is challenging. In addition, current text matching models that inject domain knowledge often rely on creating new training tasks to fine-tune the model, which is time-consuming. Although existing works have directly injected domain knowledge into BERT through similarity matrices, they struggle to handle the challenge of small sample sizes in professional fields. Contrastive learning trains a representation learning model by generating instances that exhibit either similarity or dissimilarity, so that a more general representation can be learned with a small number of samples. In this paper, we propose to directly integrate the word similarity matrix into BERT’s multi-head attention mechanism under a contrastive learning framework to align similar words during training. Furthermore, in the context of Chinese medical applications, we propose an entity MASK approach to enhance the understanding of medical terms by pre-trained models. The proposed method helps BERT acquire domain knowledge to better learn text representations in professional fields. Extensive experimental results have shown that the algorithm significantly improves the performance of the text matching model, especially when training data is limited.
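The central proposal — integrating a word-similarity matrix into the attention computation so that known-similar words attend to each other more strongly — can be sketched in a single-head form. This is a minimal illustrative sketch in NumPy, not the paper's exact formulation: the additive bias `alpha * sim_matrix` on the attention logits, the weight `alpha`, and the toy similarity matrix are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def similarity_biased_attention(Q, K, V, sim_matrix, alpha=1.0):
    """Single-head scaled dot-product attention whose logits are biased by an
    external word-similarity matrix (additive knowledge injection; assumed form)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)       # (L, L) raw attention logits
    scores = scores + alpha * sim_matrix  # pull attention toward known-similar word pairs
    weights = softmax(scores, axis=-1)
    return weights @ V, weights

# Toy demo: 3 tokens, 4-dim heads; words 0 and 2 are hypothetical domain synonyms.
rng = np.random.default_rng(0)
L, d = 3, 4
Q, K, V = rng.normal(size=(3, L, d))
S = np.zeros((L, L))
S[0, 2] = S[2, 0] = 5.0
out, w = similarity_biased_attention(Q, K, V, S)
```

With the bias in place, the attention weight between the two "similar" tokens rises relative to the unbiased computation, while each row of weights still sums to one.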

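The contrastive-learning component the abstract describes — learning general representations from few samples by contrasting similar and dissimilar instances — is commonly realized as an in-batch InfoNCE objective (SimCSE-style). The sketch below is an assumed, generic formulation in NumPy, not the paper's specific loss; the temperature value and the way positives are generated are illustrative.

```python
import numpy as np

def info_nce_loss(z1, z2, tau=0.05):
    """In-batch contrastive loss: (z1[i], z2[i]) are two views of sentence i
    (positives); every other pairing in the batch serves as a negative."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)   # unit-normalize embeddings
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / tau                              # (B, B) scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -float(np.mean(np.diag(log_prob)))             # cross-entropy on the diagonal

# Demo: near-identical views yield a low loss; unrelated embeddings a higher one.
rng = np.random.default_rng(1)
z = rng.normal(size=(4, 8))
aligned = info_nce_loss(z, z + 0.01 * rng.normal(size=(4, 8)))
shuffled = info_nce_loss(z, rng.normal(size=(4, 8)))
```

Minimizing this loss pulls the two views of each text together and pushes apart unrelated texts, which is what lets a usable representation emerge from small labeled sets.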
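The entity MASK idea for Chinese medical text — masking a whole medical entity rather than independent subword tokens, so the pre-trained model must recover the full term — might look like the following. The function name, the greedy longest-match against an entity list, and the masking probability are hypothetical details for illustration only.

```python
import random

MASK = "[MASK]"

def entity_mask(tokens, entities, p=0.5, rng=None):
    """Whole-entity masking: when a known entity span is encountered, mask
    every token in the span (assumed sketch of the entity MASK approach)."""
    rng = rng or random.Random(0)
    entities = sorted(entities, key=len, reverse=True)  # prefer the longest match
    out = list(tokens)
    i = 0
    while i < len(tokens):
        span = next((len(e) for e in entities if tokens[i:i + len(e)] == e), 0)
        if span and rng.random() < p:
            out[i:i + span] = [MASK] * span             # mask the entire entity
            i += span
        else:
            i += 1
    return out

# Demo with a hypothetical two-token medical entity, masked deterministically (p=1.0).
masked = entity_mask(
    ["patient", "has", "myocardial", "infarction", "today"],
    [["myocardial", "infarction"]],
    p=1.0,
)
```

Compared with token-level random masking, the model can no longer guess half of a term from the other half, which forces it to learn the medical entity as a unit.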
International Journal of Machine Learning and Cybernetics (Computer Science, Artificial Intelligence)
CiteScore: 7.90
Self-citation rate: 10.70%
Articles per year: 225
Journal description: Cybernetics is concerned with describing complex interactions and interrelationships between systems, which are omnipresent in our daily life. Machine learning discovers fundamental functional relationships between variables and ensembles of variables in systems. The merging of the disciplines of machine learning and cybernetics is aimed at the discovery of various forms of interaction between systems through diverse mechanisms of learning from data. The International Journal of Machine Learning and Cybernetics (IJMLC) focuses on the key research problems emerging at the junction of machine learning and cybernetics and serves as a broad forum for rapid dissemination of the latest advancements in the area. The emphasis of IJMLC is on the hybrid development of machine learning and cybernetics schemes inspired by different contributing disciplines such as engineering, mathematics, cognitive sciences, and applications. New ideas, design alternatives, implementations, and case studies pertaining to all aspects of machine learning and cybernetics fall within the scope of IJMLC. Key research areas covered by the journal include:
- Machine learning for modeling interactions between systems
- Pattern recognition technology to support discovery of system-environment interaction
- Control of system-environment interactions
- Biochemical interaction in biological and biologically-inspired systems
- Learning for improvement of communication schemes between systems
Latest articles from this journal:
- LSSMSD: defending against black-box DNN model stealing based on localized stochastic sensitivity
- CHNSCDA: circRNA-disease association prediction based on strongly correlated heterogeneous neighbor sampling
- Contextual feature fusion and refinement network for camouflaged object detection
- Scnet: shape-aware convolution with KFNN for point clouds completion
- Self-refined variational transformer for image-conditioned layout generation