Authors: Soran S. Badawi
Journal: ARO-THE SCIENTIFIC JOURNAL OF KOYA UNIVERSITY
Published: 2023-01-15
DOI: https://doi.org/10.14500/aro.11088
Using Multilingual Bidirectional Encoder Representations from Transformers on Medical Corpus for Kurdish Text Classification
Technology now dominates a large part of human life, and its users continuously use language to express feelings and sentiments about things. Sentiment analysis, the science of identifying human attitudes toward a particular product, service, or topic, is one of the most active fields of research. While sentiment analysis for English makes steady progress, less-resourced languages such as Kurdish still face fundamental issues and challenges in Natural Language Processing (NLP). This paper experiments with a recently published medical corpus using classical machine learning methods and the latest deep learning tool in NLP, Bidirectional Encoder Representations from Transformers (BERT). We evaluated the findings of both approaches. The outcome indicates that BERT outperforms all the machine learning classifiers, scoring 92% accuracy, two percentage points higher than the machine learning classifiers.
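The abstract does not name the classical classifiers or describe the corpus format, so the following is only a minimal sketch of what one classical baseline and its accuracy measurement might look like. It assumes a multinomial Naive Bayes classifier with add-one smoothing and whitespace tokenization; the toy English sentences, the labels, and every function name are hypothetical placeholders rather than data or code from the paper.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Assumption: plain whitespace tokenization; the paper's preprocessing is not stated.
    return text.lower().split()

def train_naive_bayes(samples):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab

def predict(model, text):
    """Return the label with the highest log-posterior under add-one smoothing."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(model, samples):
    """Fraction of samples whose predicted label matches the gold label."""
    correct = sum(predict(model, text) == label for text, label in samples)
    return correct / len(samples)

# Hypothetical toy data standing in for the medical corpus.
TRAIN = [
    ("the medicine helped my pain", "positive"),
    ("great doctor and fast recovery", "positive"),
    ("treatment worked well", "positive"),
    ("the medicine made my pain worse", "negative"),
    ("bad side effects and slow recovery", "negative"),
    ("treatment failed badly", "negative"),
]
TEST = [
    ("the doctor helped and treatment worked", "positive"),
    ("bad effects and pain worse", "negative"),
]

model = train_naive_bayes(TRAIN)
print(f"accuracy: {accuracy(model, TEST):.0%}")
```

The same accuracy function could be applied to the predictions of a fine-tuned BERT model, so that both approaches are scored on an identical held-out split, which is the kind of comparison the abstract reports.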