Juan Yang, Guanghong Zhou, Ronggui Wang, Lixia Xue
{"title":"样本自适应分类推理网络","authors":"Juan Yang, Guanghong Zhou, Ronggui Wang, Lixia Xue","doi":"10.1007/s11063-024-11629-6","DOIUrl":null,"url":null,"abstract":"<p>Existing pre-trained models have yielded promising results in terms of computational time reduction. However, these models only focus on pruning simple sentences or less salient words, while neglecting the treatment of relatively complex sentences. It is frequently these sentences that cause the loss of model accuracy. This shows that the adaptation of the existing models is one-sided. To address this issue, in this paper, we propose a sample-adaptive training and inference model. Specifically, complex samples are extracted from the training datasets and a dedicated data augmentation module is trained to extract global and local semantic information of complex samples. During inference, simple samples can exit the model via the Sample Adaptive Exit Mechanism, Normal samples pass through the whole backbone model before inference, while complex samples are processed by the Characteristic Enhancement Module after passing through the backbone model. In this way, all samples are processed adaptively. Our extensive experiments on classification tasks datasets in the field of Natural Language Processing demonstrate that our method enhances model accuracy and reduces model inference time for multiple datasets. Moreover, our method is transferable and can be applied to multiple pre-trained models.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"23 1","pages":""},"PeriodicalIF":2.6000,"publicationDate":"2024-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Sample-Adaptive Classification Inference Network\",\"authors\":\"Juan Yang, Guanghong Zhou, Ronggui Wang, Lixia Xue\",\"doi\":\"10.1007/s11063-024-11629-6\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Existing pre-trained models have yielded promising results in terms of computational time reduction. However, these models only focus on pruning simple sentences or less salient words, while neglecting the treatment of relatively complex sentences. It is frequently these sentences that cause the loss of model accuracy. This shows that the adaptation of the existing models is one-sided. To address this issue, in this paper, we propose a sample-adaptive training and inference model. Specifically, complex samples are extracted from the training datasets and a dedicated data augmentation module is trained to extract global and local semantic information of complex samples. During inference, simple samples can exit the model via the Sample Adaptive Exit Mechanism, Normal samples pass through the whole backbone model before inference, while complex samples are processed by the Characteristic Enhancement Module after passing through the backbone model. In this way, all samples are processed adaptively. Our extensive experiments on classification tasks datasets in the field of Natural Language Processing demonstrate that our method enhances model accuracy and reduces model inference time for multiple datasets. Moreover, our method is transferable and can be applied to multiple pre-trained models.</p>\",\"PeriodicalId\":51144,\"journal\":{\"name\":\"Neural Processing Letters\",\"volume\":\"23 1\",\"pages\":\"\"},\"PeriodicalIF\":2.6000,\"publicationDate\":\"2024-05-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neural Processing Letters\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s11063-024-11629-6\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Processing Letters","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s11063-024-11629-6","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Existing pre-trained models have yielded promising results in terms of computational time reduction. However, these models only focus on pruning simple sentences or less salient words, while neglecting the treatment of relatively complex sentences. It is frequently these sentences that cause the loss of model accuracy. This shows that the adaptation of the existing models is one-sided. To address this issue, in this paper, we propose a sample-adaptive training and inference model. Specifically, complex samples are extracted from the training datasets and a dedicated data augmentation module is trained to extract global and local semantic information of complex samples. During inference, simple samples can exit the model via the Sample Adaptive Exit Mechanism, Normal samples pass through the whole backbone model before inference, while complex samples are processed by the Characteristic Enhancement Module after passing through the backbone model. In this way, all samples are processed adaptively. Our extensive experiments on classification tasks datasets in the field of Natural Language Processing demonstrate that our method enhances model accuracy and reduces model inference time for multiple datasets. Moreover, our method is transferable and can be applied to multiple pre-trained models.
期刊介绍:
Neural Processing Letters is an international journal publishing research results and innovative ideas on all aspects of artificial neural networks. Coverage includes theoretical developments, biological models, new formal modes, learning, applications, software and hardware developments, and prospective researches.
The journal promotes fast exchange of information in the community of neural network researchers and users. The resurgence of interest in the field of artificial neural networks since the beginning of the 1980s is coupled to tremendous research activity in specialized or multidisciplinary groups. Research, however, is not possible without good communication between people and the exchange of information, especially in a field covering such different areas; fast communication is also a key aspect, and this is the reason for Neural Processing Letters