QSIM: A Quantum-inspired hierarchical semantic interaction model for text classification

Neurocomputing · IF 5.5 · JCR Q1 (Computer Science, Artificial Intelligence) · CAS Tier 2 (Computer Science) · Pub Date: 2024-10-04 · DOI: 10.1016/j.neucom.2024.128658
Hui Gao, Peng Zhang, Jing Zhang, Chang Yang
Citations: 0

Abstract

Semantic interaction modeling is a fundamental technique in natural language understanding that guides models to extract deep semantic information from text. Currently, the attention mechanism is one of the most effective techniques for semantic interaction modeling: it learns word-level attention representations by measuring the relevance between different words. However, the attention mechanism is limited to word-level semantic interaction and cannot meet the need for fine-grained interaction information in some text classification tasks. In recent years, quantum-inspired language modeling methods have successfully constructed quantized representations of language systems in Hilbert spaces, using density matrices to achieve fine-grained semantic interaction modeling.
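The density-matrix representation mentioned above can be illustrated with a minimal NumPy sketch (an assumption drawn from the general quantum-inspired LM literature, not the authors' implementation): a text is modeled as a probability-weighted mixture of rank-1 projectors built from unit-normalized word vectors.

```python
import numpy as np

# Hedged sketch (not the paper's code): represent a text as a density
# matrix rho = sum_i p_i |w_i><w_i|, a mixture of projectors onto
# unit-normalized word vectors |w_i>.
def density_matrix(word_vectors, weights=None):
    vecs = np.asarray(word_vectors, dtype=float)
    vecs = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)  # unit "word states"
    n = len(vecs)
    weights = np.full(n, 1.0 / n) if weights is None else np.asarray(weights)
    return sum(w * np.outer(v, v) for w, v in zip(weights, vecs))

rng = np.random.default_rng(0)
rho = density_matrix(rng.standard_normal((5, 8)))
# Sanity checks: a valid density matrix is Hermitian, PSD, with unit trace.
assert np.allclose(rho, rho.T)
assert np.isclose(np.trace(rho), 1.0)
assert np.linalg.eigvalsh(rho).min() > -1e-12
```

The off-diagonal entries of such a matrix encode pairwise interference between word states, which is what gives density matrices finer-grained interaction structure than a single attention-weighted vector.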
This paper presents a Quantum-inspired hierarchical Semantic Interaction Model (QSIM), which follows the sememe-word-sentence language construction principle and uses quantum entanglement theory to capture hierarchical semantic interaction information in Hilbert space. Our work builds on the idea of the attention mechanism and extends it. Specifically, we examine the original semantic space from a quantum-theoretic perspective and derive the core semantic space using the Schmidt decomposition, where: (1) a sememe is represented as a unit vector in a two-dimensional minimum semantic space; (2) a word is represented as a reduced density matrix in the core semantic space, where the Schmidt coefficients quantify sememe-level semantic interaction. Compared to full density matrices, reduced density matrices capture fine-grained semantic interaction information at lower computational cost; (3) a sentence is represented as a quantum superposition state of words, and the degree of word-level semantic interaction is measured by entanglement entropy.
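The pipeline above — Schmidt decomposition, reduced density matrix, entanglement entropy — can be sketched generically in NumPy (a minimal illustration of the standard linear algebra, assumed from the abstract rather than taken from the authors' code): the SVD of a bipartite state's coefficient matrix yields the Schmidt coefficients, whose squares are the spectrum of the reduced density matrix, and whose Shannon entropy is the entanglement entropy.

```python
import numpy as np

# Hedged sketch: Schmidt decomposition of a bipartite pure state via SVD.
def schmidt_decompose(state, dim_a, dim_b):
    psi = np.asarray(state, dtype=float)
    psi = psi / np.linalg.norm(psi)              # normalize the joint state
    m = psi.reshape(dim_a, dim_b)                # coefficient matrix of |psi>
    coeffs = np.linalg.svd(m, compute_uv=False)  # Schmidt coefficients
    rho_a = m @ m.T                              # reduced density matrix (subsystem A)
    probs = coeffs[coeffs > 1e-12] ** 2          # squared coefficients = spectrum of rho_a
    entropy = -np.sum(probs * np.log2(probs))    # entanglement entropy in bits
    return coeffs, rho_a, entropy

# A maximally entangled 2x2 (Bell-like) state: both Schmidt coefficients
# equal 1/sqrt(2), and the entanglement entropy is 1 bit.
coeffs, rho_a, entropy = schmidt_decompose([1.0, 0.0, 0.0, 1.0], 2, 2)
```

Because the reduced density matrix lives only in the (smaller) subsystem space, working with it rather than the full joint density matrix is what keeps the parameter count down, consistent with the practicability claim below.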
To evaluate the model's performance, we conducted experiments on 15 text classification datasets. The experimental results demonstrate that our model outperforms classical neural network models and traditional quantum-inspired language models. Furthermore, the experiments also confirm two distinct advantages of QSIM: (1) flexibility, as it can be integrated into various mainstream neural network text classification architectures; and (2) practicability, as it alleviates the parameter growth inherent in density matrix computation in quantum-inspired language models.
Source journal: Neurocomputing (Engineering & Technology — Computer Science: Artificial Intelligence)
CiteScore: 13.10
Self-citation rate: 10.00%
Articles per year: 1382
Review time: 70 days
Journal introduction: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.