Multi-Head Encoding for Extreme Label Classification

IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 18.6) Pub Date: 2024-12-25 DOI: 10.1109/TPAMI.2024.3522298

Daojun Liang, Haixia Zhang, Dongfeng Yuan, Minggao Zhang

Volume 47, Issue 3, pages 2199-2211. Full text: https://ieeexplore.ieee.org/document/10816186/


Abstract

The number of categories of instances in the real world is normally huge, and each instance may contain multiple labels. To distinguish these massive labels utilizing machine learning, eXtreme Label Classification (XLC) has been established. However, as the number of categories increases, the number of parameters and nonlinear operations in the classifier also rises. This results in a Classifier Computational Overload Problem (CCOP). To address this, we propose a Multi-Head Encoding (MHE) mechanism, which replaces the vanilla classifier with a multi-head classifier. During the training process, MHE decomposes extreme labels into the product of multiple short local labels, with each head trained on these local labels. During testing, the predicted labels can be directly calculated from the local predictions of each head. This reduces the computational load geometrically. Then, according to the characteristics of different XLC tasks, e.g., single-label, multi-label, and model pretraining tasks, three MHE-based implementations, i.e., Multi-Head Product, Multi-Head Cascade, and Multi-Head Sampling, are proposed to more effectively cope with CCOP. Moreover, we theoretically demonstrate that MHE can achieve performance approximately equivalent to that of the vanilla classifier by generalizing the low-rank approximation problem from Frobenius-norm to Cross-Entropy. Experimental results show that the proposed methods achieve state-of-the-art performance while significantly streamlining the training and inference processes of XLC tasks.
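The decomposition described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that "the product of multiple short local labels" can be realized as a mixed-radix split of the global label index, one digit per head, and it covers only the single-label (Multi-Head Product) case. The function names `encode_label`, `decode_label`, `predict`, and `classifier_weights` are hypothetical and chosen for illustration.

```python
from math import prod


def encode_label(label: int, head_sizes: list[int]) -> list[int]:
    """Split a global label index into one local label per head.

    Assumption: labels are mapped to heads as mixed-radix digits,
    with head_sizes[i] being the number of local classes of head i.
    """
    if not 0 <= label < prod(head_sizes):
        raise ValueError("label out of range for the given head sizes")
    digits = []
    for size in reversed(head_sizes):
        label, digit = divmod(label, size)
        digits.append(digit)
    return digits[::-1]


def decode_label(local_labels: list[int], head_sizes: list[int]) -> int:
    """Recombine per-head local labels into the global label index."""
    label = 0
    for digit, size in zip(local_labels, head_sizes):
        label = label * size + digit
    return label


def predict(head_logits: list[list[float]], head_sizes: list[int]) -> int:
    """Take the argmax of each head independently, then decode."""
    local = [max(range(len(logits)), key=logits.__getitem__)
             for logits in head_logits]
    return decode_label(local, head_sizes)


def classifier_weights(feature_dim: int, output_sizes: list[int]) -> int:
    """Count weights (no biases) of linear output layers of the given sizes."""
    return feature_dim * sum(output_sizes)


if __name__ == "__main__":
    heads = [100, 100, 100]                 # 100 * 100 * 100 = 1,000,000 classes
    local = encode_label(123456, heads)
    print(local)                            # [12, 34, 56]
    print(decode_label(local, heads))       # 123456
    print(classifier_weights(512, [prod(heads)]))  # vanilla: 512000000
    print(classifier_weights(512, heads))          # three heads: 153600
```

Under these assumptions, a single 1,000,000-way output layer on a 512-dimensional feature needs 512,000,000 weights, while three 100-way heads need 153,600, which is the kind of reduction the abstract attributes to replacing the vanilla classifier with a multi-head one.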

