Claim Stance Classification Optimized by Data Augment
Bai Wei, Zhuang Yan
2021 18th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), 2021-12-17
DOI: 10.1109/ICCWAMTIP53232.2021.9674073
As online fora increasingly become the main medium for argument and debate, automatic processing of such data is rapidly growing in importance. Stance classification, which aims to classify the stance of a claim towards a given topic, can be applied in areas such as analyzing users' opinions of services and products. We propose an ensemble model for stance classification that combines several techniques, each aimed at a specific difficulty: data augmentation for small-sample scenarios, multi-sample dropout for slow training, focal loss for imbalanced samples, pseudo labels for self-supervised training, and adversarial training for low robustness. All of these techniques can also be used in ordinary scenarios. The ensemble is composed of task-specific RoBERTa and MacBERT models, which can make more reasonable predictions. We validated the model on a dataset from NLPCC, where it worked well.
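Two of the techniques named in the abstract, focal loss and model ensembling, can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation. The abstract does not say how the RoBERTa and MacBERT outputs are combined, so simple probability averaging is assumed here. The function names, the `gamma` and `alpha` defaults, the three-class label order, and the logits in the usage lines are all hypothetical.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss for one example: -alpha * (1 - p_t)**gamma * log(p_t).

    p_t is the probability assigned to the true class. With gamma = 0 and
    alpha = 1 this reduces to ordinary cross-entropy; larger gamma shrinks
    the loss of examples the model already classifies confidently.
    """
    p_t = max(probs[target], 1e-12)  # clamp to avoid log(0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


def ensemble_average(prob_lists):
    """Average class probabilities across models (assumed combination rule)."""
    n = len(prob_lists)
    return [sum(col) / n for col in zip(*prob_lists)]


# Hypothetical logits from two models for one claim, over three assumed
# stance classes (index 0 = favor, 1 = against, 2 = neutral).
model_a_probs = softmax([2.0, 0.5, -1.0])
model_b_probs = softmax([1.2, 1.0, -0.5])

combined = ensemble_average([model_a_probs, model_b_probs])
predicted = max(range(len(combined)), key=combined.__getitem__)
loss = focal_loss(combined, target=0)
```

Because the factor `(1 - p_t)**gamma` approaches zero as `p_t` approaches one, well-classified examples from a majority class contribute little to the total loss, which is why focal loss is commonly chosen for imbalanced data.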