{"title":"Aspect-based sentiment classification with BERT and AI feedback","authors":"Lingling Xu, Weiming Wang","doi":"10.1016/j.nlp.2025.100136","DOIUrl":null,"url":null,"abstract":"<div><div>Data augmentation has been widely employed in low-resource aspect-based sentiment classification (ABSC) tasks to alleviate the issue of data sparsity and enhance the performance of the model. Unlike previous data augmentation approaches that rely on back translation, synonym replacement, or generative language models such as T5, the generation power of large language models is explored rarely. Large language models like GPT-3.5-turbo are trained on extensive datasets and corpus to capture semantic and contextual relationships between words and sentences. To this end, we propose Masked Aspect Term Prediction (MATP), a novel data augmentation method that utilizes the world knowledge and powerful generative capacity of large language models to generate new aspect terms via word masking. By incorporating AI feedback from large language models, MATP increases the diversity and richness of aspect terms. Experimental results on the ABSC datasets with BERT as the backbone model show that the introduction of new augmented datasets leads to significant improvements over baseline models, validating the effectiveness of the proposed data augmentation strategy that combines AI feedback.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"10 ","pages":"Article 100136"},"PeriodicalIF":0.0000,"publicationDate":"2025-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Natural Language Processing Journal","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2949719125000123","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Data augmentation has been widely employed in low-resource aspect-based sentiment classification (ABSC) tasks to alleviate the issue of data sparsity and enhance the performance of the model. Unlike previous data augmentation approaches that rely on back translation, synonym replacement, or generative language models such as T5, the generation power of large language models is explored rarely. Large language models like GPT-3.5-turbo are trained on extensive datasets and corpus to capture semantic and contextual relationships between words and sentences. To this end, we propose Masked Aspect Term Prediction (MATP), a novel data augmentation method that utilizes the world knowledge and powerful generative capacity of large language models to generate new aspect terms via word masking. By incorporating AI feedback from large language models, MATP increases the diversity and richness of aspect terms. Experimental results on the ABSC datasets with BERT as the backbone model show that the introduction of new augmented datasets leads to significant improvements over baseline models, validating the effectiveness of the proposed data augmentation strategy that combines AI feedback.