{"title":"Data Generation, Testing and Evaluation of Chinese Natural Language Processing in the Cloud","authors":"Minjie Ding, Mingang Chen, Wenjie Chen, Lizhi Cai, Yuanhao Chai","doi":"10.1109/SmartCloud55982.2022.00020","DOIUrl":null,"url":null,"abstract":"With the rapid development of artificial intelligence, natural language processing, as an important branch, has also become a hot research field. A series of super large-scale pre-trained models represented by BERT and GPT have made great progress in natural language understanding and natural language generation, even some of the experimental accuracy exceed the human benchmark. However, these models will also make some mistakes and even fairness problems when they have the language ability equivalent to human beings. In order to verify whether the models can truly understand natural language, the evaluation of these models is particularly important. More methods are needed to evaluate the model. The language model-based evaluation tools often require a lot of computing resources. In this paper, we propose a method for testing and evaluation of Chinese natural language processing in cloud, generate testing data and design tests for Chinese data and test two pre-trained models. The experimental results show that our method can find defects of the model, though it has high performance on specific dataset.","PeriodicalId":104366,"journal":{"name":"2022 IEEE 7th International Conference on Smart Cloud (SmartCloud)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE 7th International Conference on Smart Cloud (SmartCloud)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SmartCloud55982.2022.00020","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
With the rapid development of artificial intelligence, natural language processing, as an important branch, has become a hot research field. A series of very large pre-trained models, represented by BERT and GPT, have made great progress in natural language understanding and natural language generation, with some experimental accuracies even exceeding human benchmarks. However, even when these models reach language ability comparable to that of humans, they still make mistakes and can raise fairness problems. To verify whether such models truly understand natural language, evaluating them is particularly important, and more evaluation methods are needed. Evaluation tools based on language models often require substantial computing resources. In this paper, we propose a method for testing and evaluating Chinese natural language processing in the cloud: we generate testing data, design tests for Chinese data, and test two pre-trained models. The experimental results show that our method can find defects in a model even though the model performs well on specific datasets.
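The abstract does not detail how the Chinese test data are generated or how defects are detected. As a rough illustration only, the sketch below shows one common approach to this kind of testing, CheckList-style invariance checks via substitution, and is not the authors' implementation: the `classify` function, the substitution tables, and the seed sentences are all invented stand-ins for whichever pre-trained model and test data the paper actually uses.

```python
# Minimal sketch (assumed, not from the paper) of perturbation-based test
# generation for a Chinese text classifier. `classify` is a hypothetical
# stand-in for a deployed pre-trained model, e.g. one served behind a
# cloud API; the substitution lists are invented examples.

from typing import Callable, List, Tuple

# Swapping a city name or a near-synonym should not change the predicted
# label, so each substitution yields an invariance test case.
CITY_SWAPS = {"北京": ["上海", "广州", "深圳"]}
SYNONYM_SWAPS = {"很好": ["不错", "挺好"]}


def generate_invariance_cases(seed: str) -> List[str]:
    """Produce perturbed variants of a seed sentence by substitution."""
    cases: List[str] = []
    for table in (CITY_SWAPS, SYNONYM_SWAPS):
        for original, replacements in table.items():
            if original in seed:
                cases.extend(seed.replace(original, r) for r in replacements)
    return cases


def run_invariance_test(
    classify: Callable[[str], str], seeds: List[str]
) -> List[Tuple[str, str, str, str]]:
    """Return (seed, variant, expected_label, got_label) for every failure."""
    failures = []
    for seed in seeds:
        expected = classify(seed)
        for variant in generate_invariance_cases(seed):
            got = classify(variant)
            if got != expected:
                failures.append((seed, variant, expected, got))
    return failures


if __name__ == "__main__":
    # Toy stand-in model: a deliberately brittle keyword rule, used only so
    # the sketch runs end to end and surfaces a defect.
    def classify(text: str) -> str:
        return "positive" if "很好" in text else "negative"

    seeds = ["北京的服务很好", "这家店在北京"]
    for seed, variant, expected, got in run_invariance_test(classify, seeds):
        print(f"defect: '{variant}' -> {got}, expected {expected} (seed: '{seed}')")
```

In this assumed setup, a test case counts as a defect whenever a label-preserving perturbation flips the model's prediction, which matches the abstract's claim that such tests can expose weaknesses even in models that score well on a fixed benchmark dataset.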