A Factual Sentiment Analysis on Instagram Data – A Comparative Study Using Machine Learning Algorithms
Authors: A. Ramachandran, Swetha Ashok, Remya Nair T
Published in: 2022 Algorithms, Computing and Mathematics Conference (ACM), 2022-08-01
DOI: 10.1109/ACM57404.2022.00009 (https://doi.org/10.1109/ACM57404.2022.00009)
Citations: 0
Abstract
Social media is one of the most significant parts of our daily lives, and our social media profiles reflect our emotions. Instagram is the world's most popular photo-based social networking platform, with a large user base ranging from regular people to artists, public figures, and top authorities. Instagram users may add captions to their images to make them more interesting. In this study, we conduct sentiment analysis on Instagram captions by applying three different algorithms. We conclude that Logistic Regression, used together with SMOTE and VADER, outperforms the XGBoost and Random Forest algorithms. We start by acquiring the data and splitting it into tokens, then remove connecting words through stop-word removal to obtain clean data. The cleaned data is then passed through the NLTK (Natural Language Toolkit) parser, which uses the VADER sentiment module to assign a sentiment to each caption. We then apply XGBoost, Logistic Regression, and Random Forest to the resulting sentiment data. The accuracy of the three algorithms on the sentiment data was analyzed and tested, and we conclude that Logistic Regression performs well on this kind of data, with higher accuracy than the other two algorithms. This work thereby improves accuracy and gives a more truthful picture of the sentiment expressed in Instagram captions.
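The preprocessing and labeling steps described in the abstract (tokenization, stop-word removal, lexicon-based sentiment scoring) can be illustrated with a minimal sketch. It is not the authors' implementation. The stop-word list, the lexicon and its scores, the neutral threshold, and all function names are hypothetical stand-ins chosen for illustration. The study itself relies on NLTK and the VADER lexicon, which are deliberately not reproduced here so that the example runs with the Python standard library alone.

```python
import re

# Hypothetical stop-word list and sentiment lexicon, for illustration only.
# The study uses NLTK's stop-word removal and the VADER lexicon instead.
STOP_WORDS = {"a", "an", "and", "the", "is", "of", "to", "in", "my", "this", "with", "at"}
LEXICON = {
    "love": 3.0, "happy": 2.5, "great": 3.0, "beautiful": 2.9,
    "sad": -2.1, "hate": -2.7, "terrible": -2.5, "awful": -2.0,
}


def tokenize(caption):
    """Lowercase the caption and split it into word tokens."""
    return re.findall(r"[a-z']+", caption.lower())


def remove_stop_words(tokens):
    """Drop connecting words that carry no sentiment of their own."""
    return [t for t in tokens if t not in STOP_WORDS]


def sentiment_label(tokens, threshold=0.5):
    """Sum the lexicon scores of the tokens and map the total to a label.

    The threshold defines an assumed neutral band around zero.
    """
    score = sum(LEXICON.get(t, 0.0) for t in tokens)
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def label_caption(caption):
    """Run the whole sketch pipeline on one caption."""
    return sentiment_label(remove_stop_words(tokenize(caption)))


if __name__ == "__main__":
    for caption in ["I love this beautiful sunset",
                    "Stuck in terrible traffic",
                    "Monday at the office"]:
        print(caption, "->", label_caption(caption))
```

In the setting the abstract describes, labels of this kind would serve as the target for the three classifiers, with SMOTE applied to balance the classes before training. The abstract does not state which features the classifiers use, so that step is left out of the sketch.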