Enhanced approach of multilabel learning for the Arabic aspect category detection of the hotel reviews
Asma Ameur, Sana Hamdi, Sadok Ben Yahia
Computational Intelligence, vol. 40, no. 1. Published 2023-11-14. DOI: 10.1111/coin.12609
https://onlinelibrary.wiley.com/doi/10.1111/coin.12609
In many fields, like aspect category detection (ACD) in aspect-based sentiment analysis, it is necessary to label each instance with more than one label at the same time. This study tackles the multilabel classification problem in the ACD task for the Arabic language. For this purpose, we used Arabic hotel reviews from the SemEval-2016 dataset, comprising 13,113 annotated tuples provided for training (10,509) and testing (2,604). To extract valuable information, we first propose specific data preprocessing. Then, we suggest using the dynamic weighted loss function and a data augmentation method to fix the problem with this dataset's imbalance. Using two possible approaches, we develop new ways to find different categories of things in a review sentence. The first is based on classifier chains using machine learning models. The second is based on transfer learning using pretrained AraBERT fine-tuning for contextual representation. Our findings show that both approaches outperformed the related works for ACD on the Arabic SemEval-2016. Moreover, we observed that AraBERT fine-tuning performed much better and achieved a promising $F_1$-score of 68.02%.
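The abstract names a dynamic weighted loss for the imbalanced multilabel setting but does not give its formula. As a rough illustration only, the sketch below uses a common stand-in: a binary cross-entropy whose positive term for each label is scaled by that label's negative-to-positive ratio, recomputed from whatever batch is passed in. The function names `label_weights` and `weighted_bce`, the toy label matrix, and the weighting scheme itself are assumptions, not the authors' method.

```python
import math


def label_weights(label_matrix):
    """Per-label positive weight = negatives / positives (assumed scheme).

    Recomputing this on each batch is one possible reading of "dynamic";
    the abstract does not specify how the weights are updated.
    """
    n = len(label_matrix)
    num_labels = len(label_matrix[0])
    weights = []
    for j in range(num_labels):
        positives = sum(row[j] for row in label_matrix)
        # Fall back to a neutral weight when a label has no positives.
        weights.append((n - positives) / positives if positives else 1.0)
    return weights


def weighted_bce(probs, targets, pos_weights, eps=1e-12):
    """Mean binary cross-entropy over all (sentence, label) pairs,
    with the positive term of each label scaled by its weight."""
    total, count = 0.0, 0
    for p_row, t_row in zip(probs, targets):
        for p, t, w in zip(p_row, t_row, pos_weights):
            p = min(max(p, eps), 1.0 - eps)  # keep log() finite
            total += -(w * t * math.log(p) + (1 - t) * math.log(1.0 - p))
            count += 1
    return total / count


# Toy batch: 4 review sentences, 3 hypothetical aspect categories.
# A sentence may carry several categories at once (multilabel).
targets = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
]
weights = label_weights(targets)

# Predicted probabilities that miss the single positive of the rare label 1.
probs = [
    [0.9, 0.1, 0.1],
    [0.9, 0.1, 0.1],
    [0.9, 0.1, 0.1],
    [0.1, 0.1, 0.9],
]
loss = weighted_bce(probs, targets, weights)
```

With this weighting, a missed positive on a rare category costs more than a missed positive on a frequent one, which is the usual motivation for reweighting under label imbalance.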
About the journal:
This leading international journal promotes and stimulates research in the field of artificial intelligence (AI). Covering a wide range of issues - from the tools and languages of AI to its philosophical implications - Computational Intelligence provides a vigorous forum for the publication of both experimental and theoretical research, as well as surveys and impact studies. The journal is designed to meet the needs of a wide range of AI workers in academic and industrial research.