Rating Ease of Readability using Transformers

Varun Sai Alaparthi, Ajay Abhaysing Pawar, C. M. Suneera, J. Prakash

2022 14th International Conference on Computer and Automation Engineering (ICCAE), 25 March 2022. DOI: 10.1109/ICCAE55086.2022.9762413
Understanding and rating text complexity accurately can have a considerable impact on learning and education. Over the past few decades, educators have used traditional readability formulas to match texts to students' reading levels, an approach that tends to oversimplify the many dimensions of text difficulty. More recently, transformer-based language models have brought the field of Natural Language Processing into a new era, understanding text more effectively and achieving great success on many tasks. In this study, we assess the effectiveness of different pre-trained transformers at rating ease of readability. We propose a model built on top of the pre-trained RoBERTa transformer with weighted pooling, which uses information from multiple hidden states to perform this task more accurately. Our experiments are conducted on a dataset of English excerpts, annotated by language experts, extracted from Kaggle. On this dataset, our proposed model achieves a 71% improvement over the traditional Flesch formula and a significant boost over other transformer models and Long Short-Term Memory (LSTM) baselines.
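The abstract describes weighted pooling over the hidden states of a pre-trained RoBERTa encoder but gives no implementation details. The sketch below is a minimal illustration of that idea, assuming softmax-normalised learnable layer weights, mean pooling over tokens, and a single linear regression head on top of roberta-base; the class name, hyperparameters, and pooling choices are assumptions for illustration, not the authors' released code.

```python
# Minimal sketch: weighted pooling of transformer hidden states for readability regression.
# Layer weighting, token mean-pooling, and the regression head are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class WeightedPoolingReadability(nn.Module):
    def __init__(self, model_name: str = "roberta-base"):
        super().__init__()
        # output_hidden_states=True exposes the embedding layer plus every encoder layer
        self.backbone = AutoModel.from_pretrained(model_name, output_hidden_states=True)
        num_layers = self.backbone.config.num_hidden_layers + 1  # +1 for the embedding layer
        # One learnable weight per hidden state, normalised with softmax in forward()
        self.layer_weights = nn.Parameter(torch.zeros(num_layers))
        self.regressor = nn.Linear(self.backbone.config.hidden_size, 1)

    def forward(self, input_ids, attention_mask):
        outputs = self.backbone(input_ids=input_ids, attention_mask=attention_mask)
        # hidden_states: tuple of (batch, seq_len, hidden) tensors, one per layer
        hidden = torch.stack(outputs.hidden_states, dim=0)                    # (L, B, T, H)
        weights = torch.softmax(self.layer_weights, dim=0)                    # (L,)
        pooled_layers = (weights[:, None, None, None] * hidden).sum(dim=0)    # (B, T, H)
        # Mean-pool over tokens, ignoring padding positions
        mask = attention_mask.unsqueeze(-1).float()
        sentence = (pooled_layers * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
        return self.regressor(sentence).squeeze(-1)                           # readability score

if __name__ == "__main__":
    tokenizer = AutoTokenizer.from_pretrained("roberta-base")
    model = WeightedPoolingReadability()
    batch = tokenizer(["The cat sat on the mat."], return_tensors="pt", padding=True)
    print(model(batch["input_ids"], batch["attention_mask"]))
```

Softmax-normalised layer weights are one common way to combine multiple hidden states (in the spirit of scalar-mix pooling); the paper's exact scheme may differ. For reference, the traditional Flesch Reading Ease baseline mentioned in the abstract scores a text as 206.835 - 1.015 * (words per sentence) - 84.6 * (syllables per word).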