{"title":"Efficient Training Acceleration via Sample-Wise Dynamic Probabilistic Pruning","authors":"Feicheng Huang;Wenbo Zhou;Yue Huang;Xinghao Ding","doi":"10.1109/LSP.2024.3484289","DOIUrl":null,"url":null,"abstract":"Data pruning is observed to substantially reduce the computation and memory costs of model training. Previous studies have primarily focused on constructing a series of coresets with representative samples by leveraging predefined rules for evaluating sample importance. Learning dynamics and selection bias, however, are rarely being considered. In this letter, a novel Sample-wise Dynamic Probabilistic Pruning (SwDPP) method is proposed for efficient training. Specifically, instead of hard-pruning the samples that are considered easy or well-learned, we formulate the pruning process as a probabilistic sampling problem. This is achieved by a carefully-designed soft-selection mechanism, which constantly expresses learning dynamics and relaxes selection bias. Moreover, to alleviate the accuracy drop under high pruning rates, we introduce a probabilistic Mixup strategy for information diversity maintenance. Extensive experiments conducted on CIFAR-10, CIFAR-100 and Tiny-ImageNet show that, the proposed SwDPP outperforms current state-of-the-art methods across various pruning settings. Notably, on CIFAR-10 and CIFAR-100, SwDPP achieves lossless training acceleration using only 70% of the data per epoch.","PeriodicalId":13154,"journal":{"name":"IEEE Signal Processing Letters","volume":"31 ","pages":"3034-3038"},"PeriodicalIF":3.2000,"publicationDate":"2024-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Signal Processing Letters","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10723806/","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0
Abstract
Data pruning is observed to substantially reduce the computation and memory costs of model training. Previous studies have primarily focused on constructing a series of coresets with representative samples by leveraging predefined rules for evaluating sample importance. Learning dynamics and selection bias, however, have rarely been considered. In this letter, a novel Sample-wise Dynamic Probabilistic Pruning (SwDPP) method is proposed for efficient training. Specifically, instead of hard-pruning the samples that are considered easy or well-learned, we formulate the pruning process as a probabilistic sampling problem. This is achieved by a carefully designed soft-selection mechanism, which continually reflects learning dynamics and relaxes selection bias. Moreover, to alleviate the accuracy drop under high pruning rates, we introduce a probabilistic Mixup strategy to maintain information diversity. Extensive experiments conducted on CIFAR-10, CIFAR-100 and Tiny-ImageNet show that the proposed SwDPP outperforms current state-of-the-art methods across various pruning settings. Notably, on CIFAR-10 and CIFAR-100, SwDPP achieves lossless training acceleration using only 70% of the data per epoch.
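To make the idea of soft, probability-based pruning concrete, the following minimal sketch illustrates one way such a scheme could look: per-sample losses are mapped to selection probabilities (so hard samples are favored but no sample is permanently discarded), a 70% subset is drawn for each epoch, and mixup is applied within the selected subset. This is an illustrative assumption based only on the abstract, not the authors' implementation; all names (temperature, keep_ratio, alpha) and the loss-to-probability mapping are hypothetical.

```python
# Illustrative sketch of probabilistic, sample-wise data pruning with mixup.
# NOT the SwDPP algorithm from the paper; all design choices are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def selection_probabilities(losses, temperature=1.0):
    """Map per-sample losses to soft selection probabilities.

    Higher-loss (harder) samples get larger probability, but every sample
    keeps a non-zero chance of being drawn, unlike hard pruning.
    """
    scores = np.asarray(losses, dtype=np.float64) / temperature
    scores -= scores.max()                 # numerical stability
    probs = np.exp(scores)
    return probs / probs.sum()

def sample_epoch_subset(losses, keep_ratio=0.7, temperature=1.0):
    """Draw the subset of sample indices used for one training epoch."""
    n = len(losses)
    k = int(keep_ratio * n)
    probs = selection_probabilities(losses, temperature)
    return rng.choice(n, size=k, replace=False, p=probs)

def mixup_pair(x_a, x_b, y_a, y_b, alpha=0.2):
    """Standard mixup of two selected samples with a Beta-distributed weight."""
    lam = rng.beta(alpha, alpha)
    return lam * x_a + (1.0 - lam) * x_b, lam * y_a + (1.0 - lam) * y_b

# Toy usage: 1000 samples with synthetic losses, keep 70% per epoch.
losses = rng.exponential(scale=1.0, size=1000)
subset = sample_epoch_subset(losses, keep_ratio=0.7)
print(len(subset), "samples selected for this epoch")
```

In this kind of scheme, sampling without replacement according to the soft probabilities is what relaxes selection bias: easy samples are down-weighted rather than excluded, so the per-epoch subset still covers the full data distribution over time.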
Journal overview:
The IEEE Signal Processing Letters is a monthly, archival publication designed to provide rapid dissemination of original, cutting-edge ideas and timely, significant contributions in signal, image, speech, language and audio processing. Papers published in the Letters can be presented within one year of their appearance at signal processing conferences such as ICASSP, GlobalSIP and ICIP, as well as at several workshops organized by the Signal Processing Society.