{"title":"基于BERT的电影评论自动分类系统","authors":"Shruti Jain, Shivani Rana, Rakesh Kanji","doi":"10.2174/2666255816666230507182018","DOIUrl":null,"url":null,"abstract":"\n\nText classification emerged as an important approach to advancing Natural Language Processing (NLP) applications concerning the available text on the web. To analyze the text, many applications are proposed in the literature.\n\n\n\nThe NLP, with the help of deep learning, has achieved great success in automatically sorting text data in predefined classes, but this process is expensive & time-consuming.\n\n\n\nTo overcome this problem, in this paper, various Machine Learning techniques are studied & implemented to generate an automated system for movie review classification.\n\n\n\nThe proposed methodology uses the Bidirectional Encoder Representations of the Transformer (BERT) model for data preparation and predictions using various machine learning algorithms like XG boost, support vector machine, logistic regression, naïve Bayes, and neural network. The algorithms are analyzed based on various performance metrics like accuracy, precision, recall & F1 score.\n\n\n\nThe results reveal that the 2-hidden layer neural network outperforms the other models by achieving more than 0.90 F1 score in the first 15 epochs and 0.99 in just 40 epochs on the IMDB dataset, thus reducing the time to a great extent.\n\n\n\n100% accuracy is attained using a neural network, resulting in a 15% accuracy improvement and 14.6% F1 score improvement over logistic regression.\n","PeriodicalId":36514,"journal":{"name":"Recent Advances in Computer Science and Communications","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Automated System for Movie Review Classification using BERT\",\"authors\":\"Shruti Jain, Shivani Rana, Rakesh Kanji\",\"doi\":\"10.2174/2666255816666230507182018\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"\\n\\nText classification emerged as an important approach to advancing Natural Language Processing (NLP) applications concerning the available text on the web. To analyze the text, many applications are proposed in the literature.\\n\\n\\n\\nThe NLP, with the help of deep learning, has achieved great success in automatically sorting text data in predefined classes, but this process is expensive & time-consuming.\\n\\n\\n\\nTo overcome this problem, in this paper, various Machine Learning techniques are studied & implemented to generate an automated system for movie review classification.\\n\\n\\n\\nThe proposed methodology uses the Bidirectional Encoder Representations of the Transformer (BERT) model for data preparation and predictions using various machine learning algorithms like XG boost, support vector machine, logistic regression, naïve Bayes, and neural network. The algorithms are analyzed based on various performance metrics like accuracy, precision, recall & F1 score.\\n\\n\\n\\nThe results reveal that the 2-hidden layer neural network outperforms the other models by achieving more than 0.90 F1 score in the first 15 epochs and 0.99 in just 40 epochs on the IMDB dataset, thus reducing the time to a great extent.\\n\\n\\n\\n100% accuracy is attained using a neural network, resulting in a 15% accuracy improvement and 14.6% F1 score improvement over logistic regression.\\n\",\"PeriodicalId\":36514,\"journal\":{\"name\":\"Recent Advances in Computer Science and Communications\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-05-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Recent Advances in Computer Science and Communications\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.2174/2666255816666230507182018\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"Computer Science\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Recent Advances in Computer Science and Communications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2174/2666255816666230507182018","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Computer Science","Score":null,"Total":0}
Automated System for Movie Review Classification using BERT
Text classification emerged as an important approach to advancing Natural Language Processing (NLP) applications concerning the available text on the web. To analyze the text, many applications are proposed in the literature.
The NLP, with the help of deep learning, has achieved great success in automatically sorting text data in predefined classes, but this process is expensive & time-consuming.
To overcome this problem, in this paper, various Machine Learning techniques are studied & implemented to generate an automated system for movie review classification.
The proposed methodology uses the Bidirectional Encoder Representations of the Transformer (BERT) model for data preparation and predictions using various machine learning algorithms like XG boost, support vector machine, logistic regression, naïve Bayes, and neural network. The algorithms are analyzed based on various performance metrics like accuracy, precision, recall & F1 score.
The results reveal that the 2-hidden layer neural network outperforms the other models by achieving more than 0.90 F1 score in the first 15 epochs and 0.99 in just 40 epochs on the IMDB dataset, thus reducing the time to a great extent.
100% accuracy is attained using a neural network, resulting in a 15% accuracy improvement and 14.6% F1 score improvement over logistic regression.