On the Value of Head Labels in Multi-Label Text Classification

IF 4.8 | Zone 3, Computer Science | Q1 Computer Science, Information Systems | ACM Transactions on Knowledge Discovery from Data | Pub Date: 2024-02-05 | DOI: 10.1145/3643853

Haobo Wang, Cheng Peng, Hede Dong, Lei Feng, Weiwei Liu, Tianlei Hu, Ke Chen, Gang Chen

Citations: 0

Abstract

A formidable challenge in multi-label text classification (MLTC) is that the labels often exhibit a long-tailed distribution, which typically prevents deep MLTC models from obtaining satisfactory performance. To alleviate this problem, most existing solutions attempt to improve tail performance through sampling or by introducing extra knowledge. Data-rich labels, though more trustworthy, have not received the attention they deserve. In this work, we propose a multi-stage training framework that exploits both model- and feature-level knowledge from the head labels to improve both the representation and generalization ability of MLTC models. Moreover, we theoretically prove the superiority of our framework design over other alternatives. Comprehensive experiments on widely used MLTC datasets demonstrate that the proposed framework substantially outperforms state-of-the-art methods, highlighting the value of head labels in MLTC.
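To make the long-tailed setting concrete, the following is a minimal sketch of how labels in a multi-label corpus might be divided into "head" (data-rich) and "tail" (data-poor) groups by frequency. It is not the authors' framework: the function name `split_head_tail`, the `head_fraction` parameter, the frequency-rank cutoff, and the toy data are all assumptions made purely for illustration.

```python
from collections import Counter


def split_head_tail(label_sets, head_fraction=0.2):
    """Split labels into head and tail groups by frequency.

    Hypothetical heuristic for illustration only: the top `head_fraction`
    of labels (ranked by how many documents carry them) are called "head".
    """
    # Count how many documents each label appears in.
    counts = Counter(label for labels in label_sets for label in labels)
    # Rank by descending frequency; break ties by name so the result is deterministic.
    ranked = sorted(counts, key=lambda label: (-counts[label], label))
    n_head = max(1, round(head_fraction * len(ranked)))
    return ranked[:n_head], ranked[n_head:], counts


# Toy multi-label corpus: each set holds the labels of one document.
toy_labels = [
    {"sports", "news"},
    {"sports"},
    {"sports", "finance"},
    {"news", "sports"},
    {"chess"},
]

head, tail, counts = split_head_tail(toy_labels, head_fraction=0.25)
print("head:", head)
print("tail:", tail)
```

In this toy corpus a single label accounts for half of all label occurrences, which is the kind of imbalance the abstract refers to. How the paper itself defines head labels, and how it transfers knowledge from them, is not specified in the abstract.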

Source journal

ACM Transactions on Knowledge Discovery from Data (Computer Science, Information Systems; Computer Science, Software Engineering)

CiteScore: 6.70

Self-citation rate: 5.60%

Publication volume: 172

Review time: 3 months

Journal description: TKDD welcomes papers on a full range of research in the knowledge discovery and analysis of diverse forms of data. Such subjects include, but are not limited to: scalable and effective algorithms for data mining and big data analysis, mining brain networks, mining data streams, mining multi-media data, mining high-dimensional data, mining text, Web, and semi-structured data, mining spatial and temporal data, data mining for community generation, social network analysis, and graph-structured data, security and privacy issues in data mining, visual, interactive and online data mining, pre-processing and post-processing for data mining, robust and scalable statistical methods, data mining languages, foundations of data mining, KDD framework and process, and novel applications and infrastructures exploiting data mining technology including massively parallel processing and cloud computing platforms. TKDD encourages papers that explore the above subjects in the context of large distributed networks of computers, parallel or multiprocessing computers, or new data devices. TKDD also encourages papers that describe emerging data mining applications that cannot be satisfied by the current data mining technology.