
Online Social Networks and Media: Latest Publications

DisTGranD: Granular event/sub-event classification for disaster response
Q1 Social Sciences Pub Date : 2025-01-01 DOI: 10.1016/j.osnem.2024.100297
Ademola Adesokan , Sanjay Madria , Long Nguyen
Efficient crisis management relies on prompt and precise analysis of disaster data from various sources, including social media. The advantage of fine-grained, annotated, class-labeled data is that it provides a more diversified range of information than high-level label datasets. In this study, we introduce a dataset richly annotated at a low level to more accurately classify crisis-related communication. To this end, we first present DisTGranD, an extensively annotated dataset of over 47,600 tweets related to earthquakes and hurricanes. The dataset uses the Automatic Content Extraction (ACE) standard to provide detailed classification into dual-layer annotation for events and sub-events and to identify critical triggers and supporting arguments. The inter-annotator evaluation of DisTGranD demonstrated high agreement among annotators, with Fleiss' Kappa scores of 0.90 and 0.93 for event and sub-event types, respectively. Moreover, a transformer-based embedded phrase extraction method showed XLNet achieving an impressive 96% intra-label similarity score for event type and 97% for sub-event type. We further proposed a novel deep learning classification model, RoBiCCus, which achieved ≥90% accuracy and F1-score in the event and sub-event type classification tasks on our DisTGranD dataset and outperformed other models on publicly available disaster datasets. The DisTGranD dataset represents a nuanced class-labeled framework for detecting and classifying disaster-related social media content, which can significantly aid decision-making in disaster response. This robust dataset enables deep-learning models to provide insightful, actionable data during crises. Our annotated dataset and code are publicly available on GitHub.
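The Fleiss' Kappa scores reported above (0.90 and 0.93) measure chance-corrected agreement among multiple annotators. As a minimal illustration of how such a score is computed (a generic textbook formula, not the authors' evaluation code), the input is a matrix where each row is an item and each cell counts the annotators who assigned that item to a category:

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a rating matrix: ratings[i][k] is the number of
    annotators who assigned item i to category k. Every row must sum to
    the same number of raters n."""
    N = len(ratings)                      # number of items
    n = sum(ratings[0])                   # raters per item
    k = len(ratings[0])                   # number of categories
    # overall proportion of assignments falling in each category
    p = [sum(row[j] for row in ratings) / (N * n) for j in range(k)]
    # per-item observed agreement
    P_i = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in ratings]
    P_bar = sum(P_i) / N                  # mean observed agreement
    P_e = sum(pj * pj for pj in p)        # expected chance agreement
    return (P_bar - P_e) / (1 - P_e)
```

Perfect agreement yields a kappa of 1.0, while values of 0.90+ (as in DisTGranD) are conventionally read as almost perfect agreement.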
Citations: 0
BD2TSumm: A Benchmark Dataset for Abstractive Disaster Tweet Summarization
Q1 Social Sciences Pub Date : 2025-01-01 DOI: 10.1016/j.osnem.2024.100299
Piyush Kumar Garg , Roshni Chakraborty , Sourav Kumar Dandapat
Online social media platforms, such as Twitter, are mediums for valuable updates during disasters. However, the sheer scale of available information makes it difficult for humans to identify what is relevant. An automatic summary of these tweets makes identification of relevant information easy and ensures a holistic overview of a disaster event to aid disaster response. In the literature, there are two types of abstractive disaster tweet summarization approaches, based on the format of the output summary: key-phrase-based (where the summary is a set of key-phrases) and sentence-based (where the summary is a paragraph consisting of sentences). Existing sentence-based abstractive approaches are either unsupervised or supervised. However, both types of approaches require a sizable amount of ground-truth summaries for training and/or evaluation so that they work on disaster events irrespective of type and location. The lack of abstractive disaster ground-truth summaries and guidelines for annotation motivates us to come up with a systematic procedure to create abstractive sentence ground-truth summaries of disaster events. Therefore, this paper presents a two-step systematic annotation procedure for sentence-based abstractive summary creation. Additionally, we release BD2TSumm, a benchmark ground-truth dataset for evaluating sentence-based abstractive summarization approaches for disaster events. BD2TSumm consists of 15 ground-truth summaries spanning 5 different continents and both natural and man-made disaster types. Furthermore, to ensure the high quality of the generated ground-truth summaries, we evaluate them qualitatively (using five metrics) and quantitatively (using two metrics). Finally, we compare 12 existing State-Of-The-Art (SOTA) abstractive summarization approaches on these ground-truth summaries using ROUGE-N F1-score.
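The ROUGE-N F1-score used for the comparison above measures n-gram overlap between a system summary and a ground-truth summary. A minimal sketch of the metric (the standard definition, not the authors' evaluation harness, which in practice would use an established ROUGE package):

```python
from collections import Counter

def rouge_n_f1(candidate, reference, n=1):
    """ROUGE-N F1 between two whitespace-tokenized summaries:
    harmonic mean of n-gram precision and recall."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand = ngrams(candidate.lower().split())
    ref = ngrams(reference.lower().split())
    overlap = sum((cand & ref).values())   # clipped n-gram matches
    if not cand or not ref or overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

Identical summaries score 1.0; summaries with no shared n-grams score 0.0, so the metric rewards content overlap with the ground truth rather than fluency.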
Citations: 0
Influencer self-disclosure practices on Instagram: A multi-country longitudinal study
Q1 Social Sciences Pub Date : 2025-01-01 DOI: 10.1016/j.osnem.2024.100298
Thales Bertaglia , Catalina Goanta , Gerasimos Spanakis , Adriana Iamnitchi
This paper presents a longitudinal study of more than ten years of activity on Instagram, covering over a million posts by 400 content creators from four countries: the US, Brazil, the Netherlands, and Germany. Our study shows differences in the professionalisation of content monetisation between countries, yet consistent patterns; significant differences in the frequency of posts yet similar user engagement trends; and significant differences in the disclosure of sponsored content in some countries, with a direct connection to national legislation. We analyse shifts in marketing strategies due to legislative and platform feature changes, focusing on how content creators adapt disclosure methods to different legal environments. We also analyse the impact of disclosures and sponsored posts on engagement and conclude that, although sponsored posts have lower engagement on average, properly disclosing ads does not reduce engagement further. Our observations stress the importance of disclosure compliance and can guide authorities in developing and monitoring disclosure regulations more effectively.
Citations: 0
How political symbols spread in online social networks: Using agent-based models to replicate the complex contagion of the yellow ribbon in Twitter
Q1 Social Sciences Pub Date : 2025-01-01 DOI: 10.1016/j.osnem.2025.100300
Francisco J. León-Medina
This paper analyzes the diffusion on Twitter of the yellow ribbon, a political symbol representing the demand for the release of Catalan prisoners. We gathered data on potential users of the symbol on Twitter (users who publicly backed the cause), including their social network of friendships, and built an agent-based simulation to replicate the diffusion of the symbol in a digital twin version of the observed network. Our hypothesis was that complex contagion best explains the observed statistical relation between the proportion of adopting neighbors and the probability of adoption. Results show that the complex contagion model outperforms the simple contagion model and generates a better fit between the observed and the simulated pattern when the typical conditions of a complex contagion process are added to the baseline model, that is, when agents are affected by their reference group's behavior rather than by the most influential nodes of the network, and when we identify a peripheral and densely connected network community and trigger the process from there. These results widen the set of behaviors whose diffusion can be explained as complex contagion to include adoption of low-risk/low-cost behaviors among people who would usually not resist adoption.
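The core difference the study tests is that under complex contagion an agent adopts only once a sufficient *fraction* of its reference group has adopted, rather than after any single exposure. A minimal threshold-model sketch of this dynamic (an illustrative toy, not the paper's calibrated agent-based simulation):

```python
def simulate_complex_contagion(neighbors, seeds, threshold=0.25, steps=50):
    """Threshold-based complex contagion on a friendship network.
    `neighbors` maps each node to its list of neighbors; a node adopts
    once the fraction of its adopting neighbors reaches `threshold`
    (reference-group influence, unlike single-exposure simple contagion)."""
    adopted = set(seeds)
    for _ in range(steps):
        new = {v for v, nbrs in neighbors.items()
               if v not in adopted and nbrs
               and sum(u in adopted for u in nbrs) / len(nbrs) >= threshold}
        if not new:          # cascade has stalled
            break
        adopted |= new
    return adopted
```

Seeding the process inside a densely connected community lets local adoption fractions cross the threshold early, which is why the paper's best-fitting runs trigger the cascade from such a community.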
Citations: 0
Why are you traveling? Inferring trip profiles from online reviews and domain-knowledge
Q1 Social Sciences Pub Date : 2025-01-01 DOI: 10.1016/j.osnem.2024.100296
Lucas G.S. Félix, Washington Cunha, Claudio M.V. de Andrade, Marcos André Gonçalves, Jussara M. Almeida
This paper addresses the task of inferring trip profiles (TPs), which consists of determining the profile of travelers engaged in a particular trip given a set of possible categories. TPs may include working trips, leisure journeys with friends, or family vacations. Travelers with different TPs typically have varied plans regarding destinations and timing. TP inference may provide significant insights for numerous tourism-related services, such as geo-recommender systems and tour planning. We focus on TP inference using TripAdvisor, a prominent tourism-centric social media platform, as our data source. Our goal is to evaluate how effectively we can automatically discern the TP from a user review on this platform. A user review encompasses both textual feedback and domain-specific data (such as a user’s previous visits to the location), which are crucial for accurately characterizing the trip. To achieve this, we assess various feature sets (including text and domain-specific) and implement advanced machine learning models, such as neural Transformers and open-source Large Language Models (Llama 2, Bloom). We examine two variants of the TP inference task—binary and multi-class. Surprisingly, our findings reveal that combining domain-specific features with TF-IDF-based representation in an LGBM model performs as well as more complex Transformer and LLM models, while being much more efficient and interpretable.
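The winning configuration above concatenates a TF-IDF text representation with dense domain-specific features before feeding an LGBM classifier. A minimal sketch of that feature construction (generic TF-IDF weighting plus a hypothetical `prev_visits` domain feature; not the authors' pipeline, which would use library vectorizers and LightGBM):

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Minimal TF-IDF: one {term: weight} dict per document."""
    df = Counter(t for doc in docs for t in set(doc.split()))
    n = len(docs)
    idf = {t: math.log(n / df[t]) + 1 for t in df}   # smoothed idf
    out = []
    for doc in docs:
        tf = Counter(doc.split())
        total = sum(tf.values())
        out.append({t: (c / total) * idf[t] for t, c in tf.items()})
    return out

def combine(text_vec, domain_feats):
    """Concatenate sparse text weights with dense domain-specific features
    (e.g. a user's previous visits), as joint input to a GBM classifier."""
    vec = dict(text_vec)
    vec.update({f"domain::{k}": v for k, v in domain_feats.items()})
    return vec
```

Terms that appear in fewer reviews get higher IDF weight, so discriminative words ("family", "business") outweigh ubiquitous ones ("trip"), which is part of why this simple representation remains competitive and interpretable.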
Citations: 0
Harnessing prompt-based large language models for disaster monitoring and automated reporting from social media feedback
Q1 Social Sciences Pub Date : 2024-11-25 DOI: 10.1016/j.osnem.2024.100295
Riccardo Cantini, Cristian Cosentino, Fabrizio Marozzo, Domenico Talia, Paolo Trunfio
In recent years, social media has emerged as one of the main platforms for real-time reporting of issues during disasters and catastrophic events. While great strides have been made in collecting such information, there remains an urgent need to improve user reports’ automation, aggregation, and organization to streamline various tasks, including rescue operations, resource allocation, and communication with the press. This paper introduces an innovative methodology that leverages the power of prompt-based Large Language Models (LLMs) to strengthen disaster response and management. By analyzing large volumes of user-generated content, our methodology identifies issues reported by citizens who have experienced a disastrous event, such as damaged buildings, broken gas pipelines, and flooding. It also localizes all posts containing references to geographic information in the text, allowing for aggregation of posts that occurred nearby. By leveraging these localized citizen-reported issues, the methodology generates insightful reports full of essential information for emergency services, news agencies, and other interested parties. Extensive experimentation on large datasets validates the accuracy and efficiency of our methodology in classifying posts, detecting sub-events, and producing real-time reports. These findings highlight the practical value of prompt-based LLMs in disaster response, emphasizing their flexibility and adaptability in delivering timely insights that support more effective interventions.
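At the core of a prompt-based classification step like the one described is a prompt template that constrains the LLM to a fixed label set. The sketch below is purely illustrative (the label names and wording are assumptions, not the paper's actual prompts):

```python
def build_classification_prompt(post, labels):
    """Zero-shot prompt asking an LLM to assign a citizen-reported post
    to exactly one issue category. Labels and phrasing are illustrative."""
    return (
        "You are assisting disaster-response monitoring.\n"
        f"Classify the social media post into exactly one of: {', '.join(labels)}.\n"
        f'Post: "{post}"\n'
        "Answer with the label only."
    )

prompt = build_classification_prompt(
    "Bridge collapsed on Main St, water rising fast",
    ["flooding", "building damage", "gas leak", "other"],
)
```

Restricting the answer to the label set makes the LLM's output machine-parseable, so classified posts can then be aggregated by location into the reports the methodology produces.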
Citations: 0
HaRNaT - A dynamic hashtag recommendation system using news
Q1 Social Sciences Pub Date : 2024-11-23 DOI: 10.1016/j.osnem.2024.100294
Divya Gupta, Shampa Chakraverty
Microblogging platforms such as X and Mastodon have evolved into significant data sources, for which Hashtag Recommendation Systems (HRS) are being devised to automate the recommendation of hashtags for user queries. We propose a context-sensitive, machine-learning-based HRS named HaRNaT that strategically leverages news articles to identify pertinent keywords and subjects related to a query. It interprets the fresh context of a query and tracks the evolving dynamics of hashtags to evaluate their relevance in the present context. In contrast to prior methods that rely primarily on microblog content for hashtag recommendation, HaRNaT mines contextually related microblogs and assesses the relevance of co-occurring hashtags against news information. To accomplish this, it evaluates hashtag features, including pertinence, popularity among users, and association with other hashtags. A performance evaluation of HaRNaT trained on these features demonstrates a macro-averaged precision of 84% with Naive Bayes and 80% with Logistic Regression. Compared to Hashtagify, a hashtag search engine, HaRNaT offers a dynamically evolving set of hashtags.
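The macro-averaged precision reported above averages per-class precision with every class weighted equally, so rare hashtag classes count as much as frequent ones. A minimal sketch of the metric (the standard definition, not the authors' evaluation code):

```python
def macro_precision(y_true, y_pred):
    """Macro-averaged precision: unweighted mean of per-class precision.
    A class never predicted contributes 0, following the common convention."""
    labels = set(y_true) | set(y_pred)
    per_class = []
    for c in labels:
        # true labels of the items predicted as class c
        hits = [t for t, p in zip(y_true, y_pred) if p == c]
        per_class.append(sum(t == c for t in hits) / len(hits) if hits else 0.0)
    return sum(per_class) / len(labels)
```

Macro averaging is a sensible choice here because a recommender that is precise only on a handful of very popular hashtags would otherwise look deceptively strong.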
Citations: 0
How does user-generated content on Social Media affect stock predictions? A case study on GameStop
Q1 Social Sciences Pub Date : 2024-11-01 DOI: 10.1016/j.osnem.2024.100293
Antonino Ferraro , Giancarlo Sperlì
One of the main challenges in the financial market concerns the forecasting of stock behavior, which plays a key role in supporting the financial decisions of investors. In recent years, the large amount of available financial data and heterogeneous contextual information has led researchers to investigate data-driven models using Artificial Intelligence (AI)-based approaches for forecasting stock prices. Recent methodologies focus mainly on analyzing participants from Reddit without considering other social media and how their combination affects the stock market, which remains an open challenge. In this paper, we combine financial data and textual user-generated information, which are provided as input to various deep learning models, to develop a stock forecasting system. The main novelties of the proposal concern the design of a multi-modal approach combining historical stock prices and sentiment scores extracted from different Online Social Networks (OSNs), also unveiling possible correlations among the heterogeneous information evaluated during the GameStop squeeze. In particular, we have examined several AI-based models and investigated the impact of textual data inferred from well-known Online Social Networks (i.e., Reddit and Twitter) on stock market behavior by conducting a case study on GameStop. Although users' dynamic opinions on social networks may have a detrimental impact on the stock prediction task, our investigation has demonstrated the usefulness of assessing user-generated content from various OSNs for the market forecasting problem.
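The multi-modal input described above pairs each trading day's price data with per-OSN sentiment scores. A minimal sketch of that alignment step (an assumed data layout, not the paper's actual preprocessing):

```python
def fuse_features(prices, sentiment):
    """Join daily closing prices with per-OSN sentiment scores into one
    feature row per day, keeping only days present in both sources.
    `prices` maps date -> close; `sentiment` maps date -> {osn: score}."""
    rows = []
    for day in sorted(set(prices) & set(sentiment)):
        row = {"date": day, "close": prices[day]}
        row.update({f"sent_{osn}": s for osn, s in sentiment[day].items()})
        rows.append(row)
    return rows
```

Rows like these can then be windowed into sequences for a deep learning forecaster, so the model sees price history and crowd sentiment from each OSN side by side.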
{"title":"How does user-generated content on Social Media affect stock predictions? A case study on GameStop","authors":"Antonino Ferraro ,&nbsp;Giancarlo Sperlì","doi":"10.1016/j.osnem.2024.100293","DOIUrl":"10.1016/j.osnem.2024.100293","url":null,"abstract":"<div><div>One of the main challenges in the financial market concerns the forecasting of stock behavior, which plays a key role in supporting the financial decisions of investors. In recent years, the large amount of available financial data and the heterogeneous contextual information led researchers to investigate data-driven models using Artificial Intelligence (AI)-based approaches for forecasting stock prices. Recent methodologies focus mainly on analyzing participants from Reddit without considering other social media and how their combination affects the stock market, which remains an open challenge. In this paper, we combine financial data and textual user-generated information, which are provided as input to various deep learning models, to develop a stock forecasting system. The main novelties of the proposal concern the design of a multi-modal approach combining historical stock prices and sentiment scores extracted by different Online Social Networks (OSNs), also unveiling possible correlations about heterogeneous information evaluated during the GameStop squeeze. In particular, we have examined several AI-based models and investigated the impact of textual data inferred from well-known Online Social Networks (<em>i.e.</em>, Reddit and Twitter) on stock market behavior by conducting a case study on GameStop. 
Although users’ dynamic opinions on social networks may have a detrimental impact on the stock prediction task, our investigation has demonstrated the usefulness of assessing user-generated content inferred from various OSNs on the market forecasting problem.</div></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":"43 ","pages":"Article 100293"},"PeriodicalIF":0.0,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142653576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
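The multi-modal fusion this abstract describes — aligning historical prices with per-day sentiment scores from several OSNs — can be illustrated with a minimal, stdlib-only Python sketch. All data, the alignment scheme, and the naive "up/down" signal below are illustrative assumptions, not the paper's actual model or dataset:

```python
from statistics import fmean

# Hypothetical daily closing prices for one ticker, plus mean sentiment
# scores (in [-1, 1]) aggregated from Reddit and Twitter posts per day.
prices = [10.0, 10.5, 11.0, 18.0, 30.0, 45.0]
reddit_sentiment = [0.1, 0.3, 0.5, 0.8, 0.9, 0.7]
twitter_sentiment = [0.0, 0.2, 0.4, 0.6, 0.8, 0.6]

def daily_returns(series):
    """Percentage returns between consecutive closes."""
    return [(b - a) / a for a, b in zip(series, series[1:])]

def fuse_features(prices, *sentiment_streams):
    """One multi-modal feature per day: (price return, mean sentiment
    across all social streams on that day)."""
    fused = []
    for t, r in enumerate(daily_returns(prices), start=1):
        s = fmean(stream[t] for stream in sentiment_streams)
        fused.append((r, s))
    return fused

features = fuse_features(prices, reddit_sentiment, twitter_sentiment)
# Naive signal standing in for the paper's deep learning models:
# predict "up" whenever the fused cross-platform sentiment is positive.
predictions = ["up" if s > 0 else "down" for _, s in features]
```

In the paper's setting the fused features would feed deep learning models rather than this threshold rule; the sketch only shows the alignment-and-fusion step that makes the approach multi-modal.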
Citations: 0
Measuring centralization of online platforms through size and interconnection of communities
Q1 Social Sciences Pub Date : 2024-10-25 DOI: 10.1016/j.osnem.2024.100292
Milo Z. Trujillo, Laurent Hébert-Dufresne, James Bagrow
Decentralization of online social platforms offers a variety of potential benefits, including distributing moderator and administrator authority across a wider population, allowing a variety of communities with differing social standards to coexist, and making the platform more resilient to technical or social attack. However, a platform offering a decentralized architecture does not guarantee that users will use it in a decentralized way, and measuring the centralization of socio-technical networks is not an easy task. In this paper we introduce a method of characterizing inter-community influence, to measure the impact that removing a community would have on the remainder of a platform. Our approach provides a careful definition of “centralization” appropriate in bipartite user-community socio-technical networks, and demonstrates the inadequacy of more trivial methods for interrogating centralization, such as examining the distribution of community sizes. We use this method to compare the structure of five socio-technical platforms, and find that even decentralized platforms like Mastodon are far more centralized than any synthetic networks used for comparison. We discuss how this method can be used to identify when a platform is more centralized than it initially appears, either through inherent social pressure like assortative preferential attachment, or through astroturfing by platform administrators, and how this knowledge can inform platform governance and user trust.
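One way to make "removal impact" on a bipartite user–community network concrete is the toy sketch below. The metric used here — the fraction of users left with no community at all once a given community is deleted — is a deliberately simplified stand-in chosen for illustration, not the paper's actual influence measure, and all names are hypothetical:

```python
# Toy bipartite user-community memberships.
memberships = {
    "alice": {"news", "memes"},
    "bob": {"news"},
    "carol": {"memes", "art"},
    "dave": {"news", "art"},
    "erin": {"art"},
}

def removal_impact(memberships, community):
    """Fraction of users stranded (left with zero communities) if
    `community` were removed from the platform."""
    stranded = sum(1 for joined in memberships.values() if joined == {community})
    return stranded / len(memberships)

communities = set().union(*memberships.values())
impacts = {c: removal_impact(memberships, c) for c in sorted(communities)}
# Removing "news" strands bob and removing "art" strands erin, while
# removing "memes" strands nobody -- a crude signal of how much the
# platform depends on each community.
```

A platform where one community's removal strands a large fraction of users is, in this crude sense, more centralized than one where every removal leaves users with alternatives — which is the intuition behind measuring centralization through interconnection rather than community size alone.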
Online Social Networks and Media, Volume 43, Article 100292 (2024)
Citations: 0
Crowdsourcing the Mitigation of disinformation and misinformation: The case of spontaneous community-based moderation on Reddit
Q1 Social Sciences Pub Date : 2024-10-19 DOI: 10.1016/j.osnem.2024.100291
Giulio Corsi , Elizabeth Seger , Sean Ó hÉigeartaigh
Community-based content moderation, an approach that utilises user-generated knowledge to shape the ranking and display of online content, is recognised as a potential tool in combating disinformation and misinformation. This study examines this phenomenon on Reddit, which employs a platform-wide content ranking system based on user upvotes and downvotes. By empowering users to influence content visibility, Reddit's system serves as a naturally occurring community moderation mechanism, providing an opportunity to analyse how users engage with this system. Focusing on discussions related to climate change, we observe that in this domain, low-credibility content is spontaneously moderated by Reddit users, although the magnitude of this effect varies across Subreddits. We also identify temporal fluctuations in content removal rates, indicating dynamic and context-dependent patterns influenced by platform policies and socio-political factors. These findings highlight the potential of community-based moderation in mitigating online false information, offering valuable insights for the development of robust social media moderation frameworks.
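Reddit's vote-driven ranking, which the abstract treats as a naturally occurring moderation mechanism, can be sketched with the widely cited "hot" formula from the formerly open-sourced Reddit codebase. The constant and scaling below follow that public version; current production behavior may differ, and the example posts are invented:

```python
from math import log10

EPOCH = 1134028003  # Reddit's reference timestamp (seconds)

def hot(ups: int, downs: int, created_utc: int) -> float:
    """Score a post so that net upvotes raise visibility logarithmically
    while newer posts receive a steady time bonus."""
    score = ups - downs
    order = log10(max(abs(score), 1))
    sign = 1 if score > 0 else -1 if score < 0 else 0
    return round(sign * order + (created_utc - EPOCH) / 45000, 7)

# Two posts submitted at the same moment: the heavily downvoted
# (low-credibility) one is pushed below the well-received one purely
# by community votes -- moderation without any moderator acting.
t = EPOCH + 45000
ranking = {"credible_post": hot(200, 10, t), "low_credibility_post": hot(5, 80, t)}
ordered = sorted(ranking, key=ranking.get, reverse=True)
```

The logarithm means the first few dozen net votes matter far more than the next few hundred, so even modest spontaneous downvoting by a community can sharply reduce a low-credibility post's visibility.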
Online Social Networks and Media, Volume 43, Article 100291 (2024)
Citations: 0
Journal
Online Social Networks and Media
Copyright © 2023 Book学术 All rights reserved.
京公网安备 11010802042870号 京ICP备2023020795号-1