
Journal of Trends in Computer Science and Smart Technology: Latest Articles

BERT for Twitter Sentiment Analysis: Achieving High Accuracy and Balanced Performance
Pub Date : 2024-03-01 DOI: 10.36548/jtcsst.2024.1.003
Oladri Renuka, Niranchana Radhakrishnan
This work uses the Bidirectional Encoder Representations from Transformers (BERT) model to analyse sentiment in Twitter data. The model was fine-tuned on a Kaggle dataset of manually annotated, anonymized COVID-19-related tweets; each record includes the location, tweet date, original tweet text, and a sentiment label. BERT was evaluated against a Multinomial Naive Bayes (MNB) baseline and achieved an overall accuracy of 87% on the test set. For negative sentiment, the precision was 0.93, the recall 0.84, and the F1-score 0.88; for neutral sentiment, the precision was 0.86, the recall 0.78, and the F1-score 0.82; and for positive sentiment, the precision was 0.82, the recall 0.94, and the F1-score 0.88. The results demonstrate the model's ability to handle the linguistic nuances of Twitter, including slang and sarcasm. The study also identifies BERT's limitations and recommends directions for future research, such as integrating external knowledge and exploring alternative model designs.
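The per-class figures above can be cross-checked, since the F1-score is the harmonic mean of precision and recall. The short sketch below recomputes F1 from the three reported pairs, reading the first value of each triple as precision. The function and variable names are illustrative and not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported in the abstract.
reported = {
    "negative": (0.93, 0.84),
    "neutral": (0.86, 0.78),
    "positive": (0.82, 0.94),
}

# Round to two decimals, the precision used in the abstract.
f1_by_class = {
    label: round(f1_score(p, r), 2) for label, (p, r) in reported.items()
}
print(f1_by_class)
```

The recomputed values agree with the reported F1-scores of 0.88, 0.82, and 0.88.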
Citations: 0
Interactive Guide Assignment System with Destination Recommendation and Built-in Chatbox
Pub Date : 2023-09-01 DOI: 10.36548/jtcsst.2023.3.003
Babina Banjara, Jinish Shrestha, Jinu Nyachhyon, Rijan Timilsina, S. Shakya
The proposed system is a website called 'Safari Nepal', where users can search for destinations and view their locations on a map. When registering, users fill in their details and choose to be either a tour guide or a tourist. Based on the user's searches and preferences, a content-based recommendation system suggests similar destinations, working on data obtained from the user either explicitly or implicitly. The recommendation combines K-Nearest Neighbours (KNN) and cosine similarity to improve accuracy. KNN uses a distance measure to sort destinations from most liked to least liked according to the user's preferences. This sorted list is then filtered by cosine similarity, a measure of how similar two vectors in an inner product space are: it is the cosine of the angle between the two vectors and indicates whether they point in roughly the same direction. Combining KNN with cosine similarity thus gives the user better recommendations. The map is integrated using the Mapbox API. The system also connects users with tour guides through a chatbox called 'Travel Buddy', where they can discuss the destination, the guide's fee, and so on. The chat feature lets multiple users connect and discuss destinations in separate chatrooms. Users can also publish blogs describing their experiences and share their thoughts on particular destinations.
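The two-stage idea described above (a nearest-neighbour ranking followed by a cosine-similarity filter) might look roughly like the sketch below. It is only an illustration under stated assumptions: the three-component feature vectors, the destination names, `k`, and the `min_similarity` threshold are all hypothetical, and the abstract does not specify how the authors encode preferences.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user_profile, destinations, k=3, min_similarity=0.8):
    """Rank by Euclidean distance, keep the k nearest, then filter by cosine."""
    # Stage 1 (KNN): the k destinations closest to the user's preferences.
    nearest = sorted(
        destinations.items(),
        key=lambda item: math.dist(user_profile, item[1]),
    )[:k]
    # Stage 2: keep only neighbours pointing in a similar direction.
    return [
        name
        for name, vector in nearest
        if cosine_similarity(user_profile, vector) >= min_similarity
    ]


# Hypothetical features: [mountains, culture, wildlife], each in [0, 1].
user = [1.0, 0.2, 0.0]
catalogue = {
    "trek_a": [0.9, 0.1, 0.0],
    "trek_b": [0.8, 0.4, 0.1],
    "temple_c": [0.1, 1.0, 0.0],
    "safari_d": [0.0, 0.1, 1.0],
    "mixed_e": [0.5, 0.5, 0.5],
}
print(recommend(user, catalogue))
```

Here `mixed_e` is among the three nearest destinations by distance but is dropped by the cosine filter, which illustrates why the second stage can tighten the list.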
Citations: 0
Brain Tumor Classification using Transfer Learning
Pub Date : 2023-09-01 DOI: 10.36548/jtcsst.2023.3.002
Dr. Vaibhav Eknath Narawade, Chaitali Shetty, Purva Kharsambale, Samruddhi Bhosale, S. Rout
Brain tumors are among the more severe medical conditions and can affect both children and adults. They account for 85 to 90 percent of all primary Central Nervous System (CNS) malignancies. Each year, brain tumors are found in about 11,700 persons. The 5-year survival rate for patients with malignant brain or CNS tumors is around 34% for males and 36% for females. Brain tumors can be classified as benign, malignant, pituitary, and other forms. Appropriate treatment, meticulous planning, and exact diagnostics are needed to prolong patients' lives. Magnetic Resonance Imaging (MRI) is the most reliable way to detect brain cancer, and the images are examined by a radiologist. Because brain tumors are complex, MRI serves as a guide for diagnosing the seriousness of the disease. Because the placement and size of a brain tumor can be highly abnormal in affected persons, it is difficult to properly understand the nature of the tumor. For MRI analysis, a qualified neurosurgeon is also necessary. Compiling MRI results can be extremely difficult and time-consuming because poor countries typically lack enough qualified medical professionals and people knowledgeable about malignancy. An automated cloud-based solution can address this issue. In the proposed model, a Convolutional Neural Network (CNN) classifies the brain tumor dataset with an accuracy of 99%.
Citations: 0
Winnowing vs Extended-Winnowing: A Comparative Analysis of Plagiarism Detection Algorithms
Pub Date : 2023-09-01 DOI: 10.36548/jtcsst.2023.3.001
Shiva Shrestha, Sushan Shakya, Sandeep Gautam
Plagiarism is a major problem in the digital world, as people use others' content without crediting the creator. Efficient algorithms are therefore needed to find plagiarized content on the Internet. This research compares two algorithms: the winnowing algorithm and the extended winnowing algorithm. The winnowing algorithm can only calculate the similarity rate between documents, whereas the extended algorithm can also mark the plagiarized text segments in the compared documents. Both algorithms compute the similarity rate using the Jaccard coefficient. Although the extended algorithm's text-marking feature is useful, it consumes more computation power, as discussed in this study. Previous research has used this approach, but none has compared the algorithms' performance on short texts. This research therefore tests the algorithms on Twitter data, since a tweet contains at most 280 characters. The application proposed to detect plagiarism in tweets was developed with Python as the backend and React as the front-end technology.
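As a rough illustration of the basic (non-extended) approach described above, the sketch below fingerprints two texts with a simplified winnowing pass and compares the fingerprint sets with the Jaccard coefficient. The k-gram length, window size, MD5-based hash, and normalisation step are assumptions chosen for the example and are not taken from the paper. Fingerprint positions are discarded, so this version cannot mark matching segments the way the extended algorithm does.

```python
import hashlib


def kgram_hashes(text, k=5):
    """Hash every k-character substring of the normalised text."""
    # Normalise: lower-case and drop everything except letters and digits.
    normalized = "".join(ch.lower() for ch in text if ch.isalnum())
    return [
        int(hashlib.md5(normalized[i:i + k].encode("utf-8")).hexdigest()[:8], 16)
        for i in range(len(normalized) - k + 1)
    ]


def winnow(hashes, window=4):
    """Keep the minimum hash of each sliding window as a fingerprint."""
    if not hashes:
        return set()
    if len(hashes) <= window:
        return {min(hashes)}
    return {
        min(hashes[i:i + window]) for i in range(len(hashes) - window + 1)
    }


def jaccard(a, b):
    """|A ∩ B| / |A ∪ B|, defined here as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity(text_a, text_b, k=5, window=4):
    """Jaccard similarity of the two texts' winnowed fingerprints."""
    return jaccard(
        winnow(kgram_hashes(text_a, k), window),
        winnow(kgram_hashes(text_b, k), window),
    )


tweet_a = "Plagiarism detection on short texts is hard."
tweet_b = "Detecting plagiarism in short texts is difficult."
print(f"similarity: {similarity(tweet_a, tweet_b):.2f}")
```

Keeping only one hash per window shrinks the fingerprint set, which is what makes the comparison cheap even though every k-gram is hashed once.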
Citations: 0
Investigating the Scope of Chaos Theory for Cyber Threat Detection
Pub Date : 2023-09-01 DOI: 10.36548/jtcsst.2023.3.004
Manas Kumar Yogi
The role of chaos theory in the development of cyber threat detection systems has so far been primarily exploratory and theoretical, with limited practical adoption. Chaos theory offers concepts with the potential to enhance cyber threat detection, but its application in the cybersecurity industry faces challenges and limitations. Even so, its principles could complement existing methodologies and inspire new approaches to the complex and dynamic nature of cybersecurity threats. As the field progresses, following the latest research and developments can help gauge the future scope and impact of chaos theory in cyber threat detection. This paper investigates the roles and principles of chaos theory and finds indications of ample scope for chaos theory in the design and development of robust cyber threat detection frameworks.
Citations: 0
Strengthening Smart Grid Cybersecurity: An In-Depth Investigation into the Fusion of Machine Learning and Natural Language Processing
Pub Date : 2023-09-01 DOI: 10.36548/jtcsst.2023.3.005
Rahul Kumar Jha
Smart grid technology has transformed electricity distribution and management, but it also exposes critical infrastructures to cybersecurity threats. To mitigate these risks, the integration of machine learning (ML) and natural language processing (NLP) techniques has emerged as a promising approach. This survey paper analyses current research and applications related to ML and NLP integration, exploring methods for risk assessment, log analysis, threat analysis, intrusion detection, and anomaly detection. It also explores challenges, potential opportunities, and future research directions for enhancing smart grid cybersecurity through the synergy of ML and NLP. The study's key contributions include providing a thorough understanding of state-of-the-art techniques and paving the way for more robust and resilient smart grid defences against cyber threats.
Citations: 2
Text based Tweet Classification using Ensemble Classifier
Pub Date : 2023-06-01 DOI: 10.36548/jtcsst.2023.2.003
Ismankhan Y M
Many social networking sites are available, and tweets have evolved into a crucial tool for gathering people's thoughts, ideas, behaviours and sentiments about particular entities. One of the most intriguing subjects in this context is analyzing the sentiment of tweets using natural language processing (NLP). Although several methods have been developed, their accuracy and effectiveness for sentiment analysis can still be improved. This paper proposes a strategy that takes advantage of machine learning and lexical dictionaries. Tweets are classified using a stacked ensemble model with Naive Bayes as the base classifier and Logistic Regression as the meta-classifier. Experiments on the Sentiment140 dataset compare the proposed method with common machine learning models such as Naive Bayes and Logistic Regression and determine their accuracy. The experimental results support the proposed methodology, which attains an accuracy score of 86%.
Citations: 0
SMART CITY TRAFFIC CONTROL SYSTEM
Pub Date : 2023-06-01 DOI: 10.36548/jtcsst.2023.2.004
Satheeshkumar A, A. M, Hari Thirunavukkarsu A, S. S, G. Manavaalan
Downtown areas of cities around the world face a major problem from heavy traffic, especially during peak hours. Traditional traffic signals allot a fixed time to each approach at a four-way or two-way junction and cannot adjust to changes in traffic. The proposed system provides a scheduled crossing time that is automatically adjusted based on the traffic: a longer green light is assigned to the side of the crossroad that faces heavy traffic. To detect objects, the suggested model uses IR sensors installed every 5 meters along the road.
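One way such a scheme could work is sketched below: each approach has a row of IR sensors spaced 5 meters apart, the queue length is estimated from the run of triggered sensors nearest the stop line, and the green time grows with that length. Only the 5-meter spacing comes from the abstract. The timing constants, the queue-estimation rule, and all names are hypothetical choices made for illustration, not the authors' design.

```python
SENSOR_SPACING_M = 5  # sensor spacing stated in the abstract


def queue_length_m(sensor_readings):
    """Estimate queue length from sensors ordered outward from the stop line.

    Counts consecutive triggered sensors starting at the stop line and
    stops at the first gap (an assumed, simplified rule).
    """
    triggered = 0
    for is_triggered in sensor_readings:
        if not is_triggered:
            break
        triggered += 1
    return triggered * SENSOR_SPACING_M


def green_times(readings_by_approach, min_green_s=10, max_green_s=60,
                seconds_per_meter=1.0):
    """Assign each approach a green time that grows with its queue length."""
    # min_green_s, max_green_s and seconds_per_meter are hypothetical values.
    return {
        approach: min(
            max_green_s,
            min_green_s + seconds_per_meter * queue_length_m(readings),
        )
        for approach, readings in readings_by_approach.items()
    }


readings = {
    "north": [True, True, True, True, False, True],  # 20 m queue
    "south": [False, False, False],                  # empty approach
    "east": [True] * 12,                             # 60 m queue, capped
    "west": [True, False],                           # 5 m queue
}
print(green_times(readings))
```

The cap on green time keeps a single congested approach from starving the others, which a purely proportional rule would not guarantee.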
Citations: 0
Type 2 Diabetes Prediction using K-Nearest Neighbor Algorithm
Pub Date : 2023-06-01 DOI: 10.36548/jtcsst.2023.2.007
S. Suriya, J. Joanish Muthu
Type 2 diabetes is a chronic disorder that affects millions of individuals globally. It is characterised by excessive levels of glucose in the blood due to insulin resistance or the inability to produce insulin. Early detection and prediction of type 2 diabetes can improve patient outcomes. The present model uses K-Nearest Neighbor (KNN) to predict type 2 diabetes. KNN is a simple but powerful machine learning algorithm used for classification and regression. It is a non-parametric approach that makes predictions based on the k nearest neighbours in a dataset, and it is widely used in healthcare and scientific studies to predict and classify diseases from patient data. The aim of this work is to predict the risk of developing type 2 diabetes using the KNN algorithm. Data were collected from the electronic medical records of patients diagnosed with type 2 diabetes and of healthy individuals. The dataset contains patient attributes such as age, gender, body mass index, blood pressure, cholesterol levels, and glucose levels, along with information about lifestyle habits such as physical activity, smoking status, and alcohol consumption. The data were pre-processed by removing missing values and outliers and were normalized so that all features have the same scale. The dataset was split into training and test sets, with 80% of the data used for training and 20% for testing. The KNN algorithm was used to classify patients into two groups: those at high risk of developing type 2 diabetes and those at low risk. The model's performance was assessed using a variety of metrics, including accuracy, precision, recall, and F1-score.
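The pipeline described above (normalise, split 80/20, classify with KNN, score) can be sketched in plain Python as follows. The records are synthetic and generated purely for illustration; the three features, their distributions, the random seed, and `k=5` are assumptions, and nothing here reproduces the paper's data or results. For brevity the normalisation uses the whole dataset, whereas in practice the scaling parameters would normally be fitted on the training set only.

```python
import math
import random
from collections import Counter


def min_max_normalize(rows):
    """Rescale every feature column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def knn_predict(train_x, train_y, query, k=5):
    """Majority vote among the k training points closest to `query`."""
    by_distance = sorted(zip(train_x, train_y),
                         key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Synthetic, made-up records: [age, BMI, glucose] -> risk label.
rng = random.Random(0)
records = []
for _ in range(50):
    records.append(([rng.gauss(35, 8), rng.gauss(23, 2), rng.gauss(90, 8)], "low"))
    records.append(([rng.gauss(55, 8), rng.gauss(31, 3), rng.gauss(150, 15)], "high"))
rng.shuffle(records)

features = min_max_normalize([x for x, _ in records])
labels = [y for _, y in records]

# 80% of the records for training, 20% for testing.
split = int(0.8 * len(records))
train_x, test_x = features[:split], features[split:]
train_y, test_y = labels[:split], labels[split:]

predicted = [knn_predict(train_x, train_y, x) for x in test_x]

# Metrics for the "high" risk class, with guards against division by zero.
tp = sum(p == "high" and t == "high" for p, t in zip(predicted, test_y))
fp = sum(p == "high" and t == "low" for p, t in zip(predicted, test_y))
fn = sum(p == "low" and t == "high" for p, t in zip(predicted, test_y))
accuracy = sum(p == t for p, t in zip(predicted, test_y)) / len(test_y)
precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

Normalisation matters for KNN in particular because the distance would otherwise be dominated by whichever feature has the largest numeric range, here glucose.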
Citations: 0
Vehicular Safety System using Deep Learning and Computer Vision
Pub Date : 2023-06-01 DOI: 10.36548/jtcsst.2023.2.001
S. Rajkumaran, S. V, Sridevi Sridhar
While many technological solutions have been implemented for accident detection, few have focused on accident prevention. Accidents are a persistent concern because they cause severe injuries and deaths on a large scale. Accident rates and traffic-law violations continue to rise, and offenders often escape the legal consequences, predominantly in hit-and-run cases. This calls for a system that reduces the occurrence of accidents and the resulting deaths. To that end, a solution is proposed that prevents such circumstances by detecting accident-causing behaviour and, if an accident does take place, helps ensure that the victim receives rightful compensation. The research comprises two modules, Prevention and Recovery. The prevention module uses deep learning and computer vision, employing a CNN, to detect whether the driver is drowsy and to issue an alert. The recovery module detects accidents and acquires information about the parties involved. The prototype detects drowsiness, and detects and saves accident footage in real time to enable information acquisition.
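The abstract does not describe how the CNN output is turned into an alert. One plausible arrangement, sketched below purely as an assumption, is to treat the CNN as a per-frame classifier that emits an eyes-closed probability and to raise the alert only after several consecutive frames exceed a threshold. The class name, the threshold, and the frame count are hypothetical, and the CNN itself is not implemented here.

```python
from dataclasses import dataclass

@dataclass
class DrowsinessMonitor:
    """Turns per-frame eyes-closed probabilities into a debounced alert signal."""
    closed_threshold: float = 0.5  # assumed cut-off for counting a frame as "eyes closed"
    min_consecutive: int = 15      # assumed streak length, about 0.5 s at 30 frames/s
    _streak: int = 0               # number of consecutive "eyes closed" frames so far

    def update(self, p_closed: float) -> bool:
        """Feed one frame's probability; return True while the alert should sound."""
        if p_closed >= self.closed_threshold:
            self._streak += 1
        else:
            self._streak = 0  # any open-eye frame resets the streak
        return self._streak >= self.min_consecutive

# Hypothetical classifier scores for eight consecutive frames.
monitor = DrowsinessMonitor(min_consecutive=3)
scores = [0.1, 0.8, 0.9, 0.2, 0.7, 0.9, 0.95, 0.6]
alerts = [monitor.update(p) for p in scores]
print(alerts)
```

Requiring a streak rather than a single frame keeps ordinary blinks from triggering the alert, at the cost of a short detection delay.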
Citations: 0