Pub Date: 2024-06-10 DOI: 10.2174/0126662558279046240126051302
Satendra Kumar, Raj Kumar, A. Saini
Spam is one of the most challenging problems facing the modern Internet: it annoys individual users and can cause serious financial damage to businesses. Spam messages target recipients without their permission and clog their mailboxes, and checking for and deleting them consumes time and organizational resources. Although most web users openly dislike spam, enough of them still respond to commercial offers that spammers continue to profit, so spam remains a real problem. Most users know in principle what to do, but they still need clear instructions on avoiding and deleting spam. No single measure eliminates spam completely; filtering is the most straightforward and practical of the spam-blocking strategies. We present procedures for classifying emails as spam or ham based on text classification. Several email preprocessing steps are combined: stop-word removal, stemming, and feature reduction and feature selection to extract keywords from each attribute, followed by classifiers that quarantine messages as spam or ham. The Naive Bayes classifier is a good choice, and some classifiers, such as Simple Logistic and AdaBoost, also perform well. However, the Support Vector Machine Classifier (SVC) outperforms them. The SVC makes decisions based on each case's comparisons and perspectives. Many spam-filtering studies have focused on recent classifier-related challenges, and spam detection using Machine Learning (ML) is an important area of current research. We examine the adequacy of the proposed work and the application of multiple learning algorithms to filter spam from emails, and we also compare the algorithms themselves.
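The pipeline described in this abstract (stop-word removal, stemming, then a classifier that labels each message as spam or ham) can be illustrated with a minimal sketch. This is not the authors' implementation: the stop-word list, the suffix-stripping `crude_stem` helper, the `NaiveBayesSpamFilter` class, and the six training messages are all hypothetical, and a plain multinomial Naive Bayes is used here instead of the SVC that the abstract reports as strongest, simply because it fits in a few lines of standard-library Python.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and",
              "for", "you", "your", "at", "in", "on", "this"}


def crude_stem(word):
    """Strip a few common English suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, tokenize on letters, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


class NaiveBayesSpamFilter:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(preprocess(text))
        self.vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        return self

    def predict(self, text):
        tokens = preprocess(text)
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / (total_words + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


# Made-up training messages, for illustration only.
TRAIN = [
    ("Win a free prize now, click here", "spam"),
    ("Cheap loans, claim your free offer today", "spam"),
    ("Limited offer: winning tickets, click to claim", "spam"),
    ("Meeting moved to Monday at noon", "ham"),
    ("Please review the attached project report", "ham"),
    ("Lunch on Friday? Let me know", "ham"),
]

model = NaiveBayesSpamFilter().fit(*zip(*TRAIN))
print(model.predict("Claim your free prize, click now"))
print(model.predict("Project meeting on Friday"))
```

On a realistic corpus the same structure would typically use a full stemmer, a feature-selection step, and an SVM or similar classifier in place of the toy components above.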
{"title":"Supervised Learning based E-mail/SMS Spam Classifier","authors":"Satendra Kumar, Raj Kumar, A. Saini","doi":"10.2174/0126662558279046240126051302","DOIUrl":"https://doi.org/10.2174/0126662558279046240126051302","PeriodicalId":506582,"journal":{"name":"Recent Advances in Computer Science and Communications","volume":" 57"},"publicationDate":"2024-06-10","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141366173"}
Pub Date: 2024-06-06 DOI: 10.2174/0126662558304595240528111535
Sandeep Kumar, Arun Solanki, NZ Jhanjhi
Prior research on abstractive text summarization has predominantly relied on the ROUGE evaluation metric, which, while effective, has limitations in capturing semantic meaning because it focuses on exact word or phrase matching. This deficiency is particularly pronounced in abstractive summarization, where the goal is to generate novel summaries by rephrasing and paraphrasing the source text, and it highlights the need for an evaluation metric capable of capturing semantic similarity. In this study, the limitations of existing ROUGE metrics are addressed by proposing a novel variant called ROUGE-SS. Unlike traditional ROUGE metrics, ROUGE-SS extends beyond exact word matching to consider synonyms and semantic similarity. Leveraging resources such as the WordNet online dictionary, ROUGE-SS identifies matches between source text and summaries based on both exact word overlaps and semantic context. The algorithm for the synonym features of ROUGE-SS is also proposed. Experiments are conducted to compare ROUGE-SS with other ROUGE variants in assessing abstractive summarization models, using datasets including CNN/Daily Mail, DUC-2004, Gigaword, and Inshorts News. The experiments demonstrate the superior performance of ROUGE-SS in evaluating abstractive text summarization models: ROUGE-SS yields higher F1 scores and better overall performance than existing ROUGE variants, achieving a significant reduction in training loss and impressive accuracy. The F1-score of the proposed ROUGE-SS metric is improved by an average of 8.8%. These findings underscore the effectiveness of ROUGE-SS in capturing semantic similarity and providing a more comprehensive evaluation metric for abstractive summarization.
In conclusion, the introduction of ROUGE-SS represents a significant advancement in the field of abstractive text summarization evaluation. By extending beyond exact word matching to incorporate synonyms and semantic context, ROUGE-SS offers researchers a more effective tool for assessing summarization quality. This study highlights the importance of considering semantic meaning in evaluation metrics and provides a promising direction for future research on abstractive text summarization.
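The idea of extending exact unigram matching with synonyms can be sketched as follows. This is only an illustration of the general principle, not the ROUGE-SS algorithm from the paper, whose details the abstract does not give. The hand-written `SYNONYM_GROUPS` table is a hypothetical stand-in for WordNet lookups, the function name `rouge1_f1` and the two example sentences are made up, and the score is the standard ROUGE-1 F1 (harmonic mean of unigram precision and recall) computed between a candidate summary and a reference text.

```python
from collections import Counter

# Hypothetical hand-written synonym groups standing in for WordNet lookups.
SYNONYM_GROUPS = [
    {"car", "automobile"},
    {"quick", "fast", "rapid"},
    {"big", "large"},
]
# Map every word in a group to one canonical representative of that group.
CANONICAL = {word: sorted(group)[0] for group in SYNONYM_GROUPS for word in group}


def tokenize(text):
    """Lowercase and split on whitespace (no punctuation handling)."""
    return text.lower().split()


def normalize(tokens, use_synonyms):
    """Optionally replace each token by its synonym group's representative."""
    if not use_synonyms:
        return list(tokens)
    return [CANONICAL.get(t, t) for t in tokens]


def rouge1_f1(candidate, reference, use_synonyms=False):
    """Unigram-overlap F1; with use_synonyms=True, synonyms count as matches."""
    cand = Counter(normalize(tokenize(candidate), use_synonyms))
    ref = Counter(normalize(tokenize(reference), use_synonyms))
    overlap = sum((cand & ref).values())  # clipped count of shared unigrams
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


REFERENCE = "the fast car crossed the large bridge"
CANDIDATE = "the quick automobile crossed the big bridge"

print(round(rouge1_f1(CANDIDATE, REFERENCE), 3))                     # 0.571
print(round(rouge1_f1(CANDIDATE, REFERENCE, use_synonyms=True), 3))  # 1.0
```

In this toy case the paraphrased candidate shares only 4 of 7 unigrams with the reference under exact matching, but all 7 once synonyms are merged, which is the kind of gap between exact and semantic matching that the abstract describes.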
{"title":"ROUGE-SS: A New ROUGE Variant for the Evaluation of Text Summarization","authors":"Sandeep Kumar, Arun Solanki, NZ Jhanjhi","doi":"10.2174/0126662558304595240528111535","DOIUrl":"https://doi.org/10.2174/0126662558304595240528111535","PeriodicalId":506582,"journal":{"name":"Recent Advances in Computer Science and Communications","volume":"207 4"},"publicationDate":"2024-06-06","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141375871"}