Authors: Ping Cai, Xingyuan Chen, Hongjun Wang, Peng Jin
Venue: 2020 16th International Conference on Computational Intelligence and Security (CIS)
Published: 2020-11-01
DOI: 10.1109/CIS52066.2020.00027 (https://doi.org/10.1109/CIS52066.2020.00027)
The errors analysis of natural language generation — A case study of Topic-to-Essay generation
Although natural language generation (NLG) has achieved great success, the generated text still shows many problems when humans examine it carefully. To analyze these problems, we manually annotate and analyze text produced by an NLG model. The analysis results give an in-depth, comprehensive, and accurate picture of the defects of NLG and provide cues for future improvement. In this paper, we first use a state-of-the-art Topic-to-Essay generation model to generate texts conditioned on a set of topic words. Then, by analyzing the generated text, we propose an annotation framework and quantify the main drawbacks of current NLG, including poor semantic coherence, content duplication, logic errors, and repetition. The results show that text generated by the current sequence-to-sequence model is still far from human expectations.
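As a rough illustration of what quantifying annotated drawbacks could look like, the sketch below computes, for each error category, the fraction of generated texts that annotators marked with it. It is a minimal sketch under stated assumptions, not the paper's annotation framework: the label strings, the `error_rates` function, and the toy annotations are all hypothetical, and the numbers are not results from the paper.

```python
from collections import Counter

# Error categories named in the abstract; the label strings are assumptions.
CATEGORIES = (
    "poor_semantic_coherence",
    "content_duplication",
    "logic_error",
    "repetition",
)


def error_rates(annotations):
    """Return the fraction of texts annotated with each error category.

    `annotations` holds one collection of category labels per generated text.
    """
    if not annotations:
        return {c: 0.0 for c in CATEGORIES}
    counts = Counter()
    for labels in annotations:
        unknown = set(labels) - set(CATEGORIES)
        if unknown:
            raise ValueError(f"unknown labels: {sorted(unknown)}")
        # Count each category at most once per text.
        counts.update(set(labels))
    total = len(annotations)
    return {c: counts[c] / total for c in CATEGORIES}


# Toy, made-up annotations for illustration only (not data from the paper).
toy = [
    {"repetition"},
    {"logic_error", "repetition"},
    set(),
    {"poor_semantic_coherence"},
]
rates = error_rates(toy)
```

Counting per text rather than per occurrence keeps the rates comparable across essays of different lengths; a per-occurrence count would be an equally plausible design if the annotation scheme marks individual spans.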