Title: Deep Learning Model for Identification and Classification of Web based Toxic Comments
Authors: Anubhav Shukla, D. Arora
DOI: 10.1109/APSIT58554.2023.10201794 (https://doi.org/10.1109/APSIT58554.2023.10201794)
Published in: 2023 International Conference in Advances in Power, Signal, and Information Technology (APSIT)
Publication date: 2023-06-09
Citations: 0
Abstract
Every day, many individuals face online trolling and hate speech on social media platforms such as Twitter and Instagram. These comments, often involving racial abuse or hatred based on religion or caste, are frequently posted anonymously, and keeping them under control is a difficult task. The objective of this work was therefore to develop a machine learning model to help identify such comments. A deep learning model (a sequential model) was built and trained to classify a comment as appropriate or not.

LSTM (Long Short-Term Memory) is a type of recurrent neural network (RNN) that is particularly well suited to modeling sequential data such as text. LSTMs can capture long-term dependencies in sequential data; for text classification, this means they can take into account the context of a word or phrase within a sentence, paragraph, or even an entire document. LSTMs can learn to selectively forget or remember information from the past, which is useful for filtering out noise or irrelevant information in text. LSTMs are well established in natural language processing (NLP) and have been shown to be effective on a range of NLP tasks, including sentiment analysis and text classification.

Binary cross-entropy is a commonly used loss function in deep learning models for binary classification problems, such as predicting whether a comment is toxic. It is designed around the binary nature of the task: it penalizes the model for assigning a low probability to the correct class and rewards it for assigning a high probability to the correct class. The loss function is differentiable, which allows gradient-based optimization methods to be used during training to minimize the loss and improve the model's performance.
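The selective forget/remember behaviour described above comes from the LSTM's gating equations. As a minimal sketch (not the paper's actual model; the weights here are random placeholders, and the gate ordering is one common convention), a single LSTM cell step can be written in NumPy as:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W: (4h, d) input weights, U: (4h, h) recurrent weights, b: (4h,) bias.
    The four gates are stacked in the order [input, forget, cell, output].
    """
    h_size = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:h_size])             # input gate: how much new info to write
    f = sigmoid(z[h_size:2 * h_size])    # forget gate: how much past memory to keep
    g = np.tanh(z[2 * h_size:3 * h_size])  # candidate cell update
    o = sigmoid(z[3 * h_size:4 * h_size])  # output gate
    c = f * c_prev + i * g               # blend old memory with the new candidate
    h = o * np.tanh(c)                   # hidden state exposed to the next layer
    return h, c

# Toy dimensions: 8-dim word embeddings, 4-dim hidden state.
rng = np.random.default_rng(0)
d, hs = 8, 4
W = rng.standard_normal((4 * hs, d)) * 0.1
U = rng.standard_normal((4 * hs, hs)) * 0.1
b = np.zeros(4 * hs)

h, c = np.zeros(hs), np.zeros(hs)
for t in range(5):                       # run over a 5-token "sentence"
    h, c = lstm_step(rng.standard_normal(d), h, c, W, U, b)
print(h.shape)
```

The forget gate `f` multiplies the old cell state element-wise, so values near 0 discard past information and values near 1 preserve it; this is the mechanism behind "filtering out noise or irrelevant information".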
Binary cross-entropy is well established in deep learning, and many tools and frameworks support it, making it easy to implement in practice. It also has a probabilistic interpretation, which can be useful in some applications; for example, it can be used to estimate the probability that a given comment is toxic. Hence, binary cross-entropy was chosen as the loss function for the deep learning model.
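To make the penalty/reward behaviour concrete (the numbers below are illustrative, not from the paper), binary cross-entropy for a true label y ∈ {0, 1} and predicted toxicity probability p is −(y·log p + (1 − y)·log(1 − p)), averaged over the batch:

```python
import math

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean binary cross-entropy over a batch; eps guards against log(0)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)    # clip predictions away from 0 and 1
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# A confident correct prediction yields a low loss; a confident wrong one is
# penalized heavily, which is the asymmetry described above.
print(binary_cross_entropy([1], [0.95]))  # ~0.051
print(binary_cross_entropy([1], [0.05]))  # ~2.996
```

Because the loss is differentiable in p, gradient descent can push the predicted probability toward the true label during training, which is the property the abstract relies on.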