{"title":"GEML:通过相互学习进行文本分类的图增强预训练语言模型框架","authors":"Tao Yu, Rui Song, Sandro Pinto, Tiago Gomes, Adriano Tavares, Hao Xu","doi":"10.1007/s10489-024-05831-1","DOIUrl":null,"url":null,"abstract":"<div><p>Large-scale Pre-trained Language Models (PLMs) have become the backbones of text classification due to their exceptional performance. However, they treat input documents as independent and uniformly distributed, thereby disregarding potential relationships among the documents. This limitation could lead to some miscalculations and inaccuracies in text classification. To address this issue, some recent work explores the integration of Graph Neural Networks (GNNs) with PLMs, as GNNs can effectively model document relationships. Yet, combining graph-based methods with PLMs is challenging due to the structural incompatibility between graphs and sequences. To tackle this challenge, we propose a graph-enhanced text mutual learning framework that integrates graph-based models with PLMs to boost classification performance. Our approach separates graph-based methods and language models into two independent channels and allows them to approximate each other through mutual learning of probability distributions. This probability-distribution-guided approach simplifies the adaptation of graph-based models to PLMs and enables seamless end-to-end training of the entire architecture. Moreover, we introduce Asymmetrical Learning, a strategy that enhances the learning process, and incorporate Uncertainty Weighting loss to achieve smoother probability distribution learning. These enhancements significantly improve the performance of mutual learning. The practical value of our research lies in its potential applications in various industries, such as social network analysis, information retrieval, and recommendation systems, where understanding and leveraging document relationships are crucial. 
Importantly, our method can be easily combined with different PLMs and consistently achieves state-of-the-art results on multiple public datasets.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 23","pages":"12215 - 12229"},"PeriodicalIF":3.4000,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"GEML: a graph-enhanced pre-trained language model framework for text classification via mutual learning\",\"authors\":\"Tao Yu, Rui Song, Sandro Pinto, Tiago Gomes, Adriano Tavares, Hao Xu\",\"doi\":\"10.1007/s10489-024-05831-1\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Large-scale Pre-trained Language Models (PLMs) have become the backbones of text classification due to their exceptional performance. However, they treat input documents as independent and uniformly distributed, thereby disregarding potential relationships among the documents. This limitation could lead to some miscalculations and inaccuracies in text classification. To address this issue, some recent work explores the integration of Graph Neural Networks (GNNs) with PLMs, as GNNs can effectively model document relationships. Yet, combining graph-based methods with PLMs is challenging due to the structural incompatibility between graphs and sequences. To tackle this challenge, we propose a graph-enhanced text mutual learning framework that integrates graph-based models with PLMs to boost classification performance. Our approach separates graph-based methods and language models into two independent channels and allows them to approximate each other through mutual learning of probability distributions. This probability-distribution-guided approach simplifies the adaptation of graph-based models to PLMs and enables seamless end-to-end training of the entire architecture. 
Moreover, we introduce Asymmetrical Learning, a strategy that enhances the learning process, and incorporate Uncertainty Weighting loss to achieve smoother probability distribution learning. These enhancements significantly improve the performance of mutual learning. The practical value of our research lies in its potential applications in various industries, such as social network analysis, information retrieval, and recommendation systems, where understanding and leveraging document relationships are crucial. Importantly, our method can be easily combined with different PLMs and consistently achieves state-of-the-art results on multiple public datasets.</p></div>\",\"PeriodicalId\":8041,\"journal\":{\"name\":\"Applied Intelligence\",\"volume\":\"54 23\",\"pages\":\"12215 - 12229\"},\"PeriodicalIF\":3.4000,\"publicationDate\":\"2024-09-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Applied Intelligence\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://link.springer.com/article/10.1007/s10489-024-05831-1\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied Intelligence","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10489-024-05831-1","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
GEML: a graph-enhanced pre-trained language model framework for text classification via mutual learning
Large-scale Pre-trained Language Models (PLMs) have become the backbones of text classification due to their exceptional performance. However, they treat input documents as independent and identically distributed, thereby disregarding potential relationships among the documents. This limitation can lead to miscalculations and inaccuracies in text classification. To address this issue, some recent work explores the integration of Graph Neural Networks (GNNs) with PLMs, as GNNs can effectively model document relationships. Yet, combining graph-based methods with PLMs is challenging due to the structural incompatibility between graphs and sequences. To tackle this challenge, we propose a graph-enhanced text mutual learning framework that integrates graph-based models with PLMs to boost classification performance. Our approach separates graph-based methods and language models into two independent channels and allows them to approximate each other through mutual learning of probability distributions. This probability-distribution-guided approach simplifies the adaptation of graph-based models to PLMs and enables seamless end-to-end training of the entire architecture. Moreover, we introduce Asymmetrical Learning, a strategy that enhances the learning process, and incorporate an Uncertainty Weighting loss to achieve smoother probability distribution learning. These enhancements significantly improve the performance of mutual learning. The practical value of our research lies in its potential applications in various industries, such as social network analysis, information retrieval, and recommendation systems, where understanding and leveraging document relationships are crucial. Importantly, our method can be easily combined with different PLMs and consistently achieves state-of-the-art results on multiple public datasets.
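The abstract describes mutual learning of probability distributions between a graph channel and a PLM channel. A common way to realize this (as in deep mutual learning more broadly) is to give each channel a supervised cross-entropy term plus a KL-divergence term that pulls its predictive distribution toward the other channel's. The sketch below is a minimal NumPy illustration of that objective, not the paper's actual implementation; the function names, the single weight `alpha`, and the treatment of the peer distribution as fixed are all assumptions for illustration.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q), averaged over the batch; eps guards log(0).
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))

def mutual_learning_losses(gnn_logits, plm_logits, labels, alpha=0.5):
    """Per-channel loss = cross-entropy to the labels
    + alpha * KL pulling this channel toward the other channel's
    (treated-as-fixed) predictive distribution."""
    p_gnn, p_plm = softmax(gnn_logits), softmax(plm_logits)
    rows = np.arange(labels.shape[0])
    ce_gnn = -float(np.mean(np.log(p_gnn[rows, labels] + 1e-12)))
    ce_plm = -float(np.mean(np.log(p_plm[rows, labels] + 1e-12)))
    loss_gnn = ce_gnn + alpha * kl_divergence(p_plm, p_gnn)
    loss_plm = ce_plm + alpha * kl_divergence(p_gnn, p_plm)
    return loss_gnn, loss_plm
```

In a real implementation each channel would backpropagate only through its own loss, and the paper's Asymmetrical Learning and Uncertainty Weighting would replace the fixed `alpha` with a learned or scheduled weighting.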
About the journal:
With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems which are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance.
The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.