用于文本分类的图形接收变换器编码器

IF 3 3区 计算机科学 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC IEEE Transactions on Signal and Information Processing over Networks Pub Date : 2024-03-21 DOI:10.1109/TSIPN.2024.3380362
Arda Can Aras;Tuna Alikaşifoğlu;Aykut Koç
{"title":"用于文本分类的图形接收变换器编码器","authors":"Arda Can Aras;Tuna Alikaşifoğlu;Aykut Koç","doi":"10.1109/TSIPN.2024.3380362","DOIUrl":null,"url":null,"abstract":"By employing attention mechanisms, transformers have made great improvements in nearly all NLP tasks, including text classification. However, the context of the transformer's attention mechanism is limited to single sequences, and their fine-tuning stage can utilize only inductive learning. Focusing on broader contexts by representing texts as graphs, previous works have generalized transformer models to graph domains to employ attention mechanisms beyond single sequences. However, these approaches either require exhaustive pre-training stages, learn only transductively, or can learn inductively without utilizing pre-trained models. To address these problems simultaneously, we propose the Graph Receptive Transformer Encoder (GRTE), which combines graph neural networks (GNNs) with large-scale pre-trained models for text classification in both inductive and transductive fashions. By constructing heterogeneous and homogeneous graphs over given corpora and not requiring a pre-training stage, GRTE can utilize information from both large-scale pre-trained models and graph-structured relations. Our proposed method retrieves global and contextual information in documents and generates word embeddings as a by-product of inductive inference. We compared the proposed GRTE with a wide range of baseline models through comprehensive experiments. Compared to the state-of-the-art, we demonstrated that GRTE improves model performances and offers computational savings up to ˜100×.","PeriodicalId":56268,"journal":{"name":"IEEE Transactions on Signal and Information Processing over Networks","volume":"10 ","pages":"347-359"},"PeriodicalIF":3.0000,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Graph Receptive Transformer Encoder for Text Classification\",\"authors\":\"Arda Can Aras;Tuna Alikaşifoğlu;Aykut Koç\",\"doi\":\"10.1109/TSIPN.2024.3380362\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"By employing attention mechanisms, transformers have made great improvements in nearly all NLP tasks, including text classification. However, the context of the transformer's attention mechanism is limited to single sequences, and their fine-tuning stage can utilize only inductive learning. Focusing on broader contexts by representing texts as graphs, previous works have generalized transformer models to graph domains to employ attention mechanisms beyond single sequences. However, these approaches either require exhaustive pre-training stages, learn only transductively, or can learn inductively without utilizing pre-trained models. To address these problems simultaneously, we propose the Graph Receptive Transformer Encoder (GRTE), which combines graph neural networks (GNNs) with large-scale pre-trained models for text classification in both inductive and transductive fashions. By constructing heterogeneous and homogeneous graphs over given corpora and not requiring a pre-training stage, GRTE can utilize information from both large-scale pre-trained models and graph-structured relations. Our proposed method retrieves global and contextual information in documents and generates word embeddings as a by-product of inductive inference. We compared the proposed GRTE with a wide range of baseline models through comprehensive experiments. Compared to the state-of-the-art, we demonstrated that GRTE improves model performances and offers computational savings up to ˜100×.\",\"PeriodicalId\":56268,\"journal\":{\"name\":\"IEEE Transactions on Signal and Information Processing over Networks\",\"volume\":\"10 \",\"pages\":\"347-359\"},\"PeriodicalIF\":3.0000,\"publicationDate\":\"2024-03-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Signal and Information Processing over Networks\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10477516/\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Signal and Information Processing over Networks","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10477516/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0

摘要

通过采用注意力机制,变换器在包括文本分类在内的几乎所有 NLP 任务中都取得了巨大进步。但是,转换器的注意机制的上下文仅限于单一序列,而且其微调阶段只能利用归纳学习。以前的研究通过将文本表示为图来关注更广泛的上下文,并将变换器模型推广到图域,以采用超越单一序列的关注机制。然而,这些方法要么需要详尽的预训练阶段,要么只能进行归纳学习,要么只能进行归纳学习而不能利用预训练模型。为了同时解决这些问题,我们提出了图接收变换器编码器(GRTE),它将图神经网络(GNN)与大规模预训练模型相结合,以归纳和变换的方式进行文本分类。通过在给定的语料库中构建异质和同质图,并且不需要预训练阶段,GRTE 可以利用大规模预训练模型和图结构关系中的信息。我们提出的方法可以检索文档中的全局信息和上下文信息,并生成词嵌入作为归纳推理的副产品。通过综合实验,我们将所提出的 GRTE 与各种基线模型进行了比较。与最先进的模型相比,我们证明 GRTE 提高了模型性能,并节省了高达 ˜100 倍的计算量。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Graph Receptive Transformer Encoder for Text Classification
By employing attention mechanisms, transformers have made great improvements in nearly all NLP tasks, including text classification. However, the context of the transformer's attention mechanism is limited to single sequences, and their fine-tuning stage can utilize only inductive learning. Focusing on broader contexts by representing texts as graphs, previous works have generalized transformer models to graph domains to employ attention mechanisms beyond single sequences. However, these approaches either require exhaustive pre-training stages, learn only transductively, or can learn inductively without utilizing pre-trained models. To address these problems simultaneously, we propose the Graph Receptive Transformer Encoder (GRTE), which combines graph neural networks (GNNs) with large-scale pre-trained models for text classification in both inductive and transductive fashions. By constructing heterogeneous and homogeneous graphs over given corpora and not requiring a pre-training stage, GRTE can utilize information from both large-scale pre-trained models and graph-structured relations. Our proposed method retrieves global and contextual information in documents and generates word embeddings as a by-product of inductive inference. We compared the proposed GRTE with a wide range of baseline models through comprehensive experiments. Compared to the state-of-the-art, we demonstrated that GRTE improves model performances and offers computational savings up to ˜100×.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
IEEE Transactions on Signal and Information Processing over Networks
IEEE Transactions on Signal and Information Processing over Networks Computer Science-Computer Networks and Communications
CiteScore
5.80
自引率
12.50%
发文量
56
期刊介绍: The IEEE Transactions on Signal and Information Processing over Networks publishes high-quality papers that extend the classical notions of processing of signals defined over vector spaces (e.g. time and space) to processing of signals and information (data) defined over networks, potentially dynamically varying. In signal processing over networks, the topology of the network may define structural relationships in the data, or may constrain processing of the data. Topics include distributed algorithms for filtering, detection, estimation, adaptation and learning, model selection, data fusion, and diffusion or evolution of information over such networks, and applications of distributed signal processing.
期刊最新文献
Reinforcement Learning-Based Event-Triggered Constrained Containment Control for Perturbed Multiagent Systems Finite-Time Performance Mask Function-Based Distributed Privacy-Preserving Consensus: Case Study on Optimal Dispatch of Energy System Discrete-Time Controllability of Cartesian Product Networks Generalized Simplicial Attention Neural Networks A Continuous-Time Algorithm for Distributed Optimization With Nonuniform Time-Delay Under Switching and Unbalanced Digraphs
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1