评估 Twitter 对低可信度内容的算法放大：一项观察研究

IF 3 2区计算机科学 Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS EPJ Data Science Pub Date : 2024-03-07 DOI:10.1140/epjds/s13688-024-00456-3

Giulio Corsi

{"title":"评估 Twitter 对低可信度内容的算法放大：一项观察研究","authors":"Giulio Corsi","doi":"10.1140/epjds/s13688-024-00456-3","DOIUrl":null,"url":null,"abstract":"<p>Artificial intelligence (AI)-powered recommender systems play a crucial role in determining the content that users are exposed to on social media platforms. However, the behavioural patterns of these systems are often opaque, complicating the evaluation of their impact on the dissemination and consumption of disinformation and misinformation. To begin addressing this evidence gap, this study presents a measurement approach that uses observed digital traces to infer the status of algorithmic amplification of low-credibility content on Twitter over a 14-day period in January 2023. Using an original dataset of ≈ 2.7 million posts on COVID-19 and climate change published on the platform, this study identifies tweets sharing information from low-credibility domains, and uses a bootstrapping model with two stratifications, a tweet’s engagement level and a user’s followers level, to compare any differences in impressions generated between low-credibility and high-credibility samples. Additional stratification variables of toxicity, political bias, and verified status are also examined. This analysis provides valuable observational evidence on whether the Twitter algorithm favours the visibility of low-credibility content, with results indicating that, on aggregate, tweets containing low-credibility URL domains perform better than tweets that do not across both datasets. However, this effect is largely attributable to a difference in high-engagement, high-followers tweets, which are very impactful in terms of impressions generation, and are more likely receive amplified visibility when containing low-credibility content. Furthermore, high toxicity tweets and those with right-leaning bias see heightened amplification, as do low-credibility tweets from verified accounts. Ultimately, this suggests that Twitter’s recommender system may have facilitated the diffusion of false content by amplifying the visibility of low-credibility content with high-engagement generated by very influential users.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"27 1","pages":""},"PeriodicalIF":3.0000,"publicationDate":"2024-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Evaluating Twitter’s algorithmic amplification of low-credibility content: an observational study\",\"authors\":\"Giulio Corsi\",\"doi\":\"10.1140/epjds/s13688-024-00456-3\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Artificial intelligence (AI)-powered recommender systems play a crucial role in determining the content that users are exposed to on social media platforms. However, the behavioural patterns of these systems are often opaque, complicating the evaluation of their impact on the dissemination and consumption of disinformation and misinformation. To begin addressing this evidence gap, this study presents a measurement approach that uses observed digital traces to infer the status of algorithmic amplification of low-credibility content on Twitter over a 14-day period in January 2023. Using an original dataset of ≈ 2.7 million posts on COVID-19 and climate change published on the platform, this study identifies tweets sharing information from low-credibility domains, and uses a bootstrapping model with two stratifications, a tweet’s engagement level and a user’s followers level, to compare any differences in impressions generated between low-credibility and high-credibility samples. Additional stratification variables of toxicity, political bias, and verified status are also examined. This analysis provides valuable observational evidence on whether the Twitter algorithm favours the visibility of low-credibility content, with results indicating that, on aggregate, tweets containing low-credibility URL domains perform better than tweets that do not across both datasets. However, this effect is largely attributable to a difference in high-engagement, high-followers tweets, which are very impactful in terms of impressions generation, and are more likely receive amplified visibility when containing low-credibility content. Furthermore, high toxicity tweets and those with right-leaning bias see heightened amplification, as do low-credibility tweets from verified accounts. Ultimately, this suggests that Twitter’s recommender system may have facilitated the diffusion of false content by amplifying the visibility of low-credibility content with high-engagement generated by very influential users.</p>\",\"PeriodicalId\":11887,\"journal\":{\"name\":\"EPJ Data Science\",\"volume\":\"27 1\",\"pages\":\"\"},\"PeriodicalIF\":3.0000,\"publicationDate\":\"2024-03-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"EPJ Data Science\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1140/epjds/s13688-024-00456-3\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"MATHEMATICS, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"EPJ Data Science","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1140/epjds/s13688-024-00456-3","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MATHEMATICS, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}

引用次数: 0

摘要

人工智能（AI）驱动的推荐系统在决定用户在社交媒体平台上接触的内容方面发挥着至关重要的作用。然而，这些系统的行为模式往往是不透明的，这使得评估它们对虚假信息和错误信息的传播和消费的影响变得更加复杂。为了着手解决这一证据缺口，本研究提出了一种测量方法，利用观察到的数字痕迹来推断 2023 年 1 月 14 天内 Twitter 上低可信度内容的算法放大状况。本研究利用平台上发布的有关 COVID-19 和气候变化的≈270 万条帖子的原始数据集，识别出分享低可信度领域信息的推文，并使用具有两个分层（推文参与度和用户关注度）的引导模型，比较低可信度样本和高可信度样本之间产生的印象差异。此外，还考察了毒性、政治偏见和验证状态等其他分层变量。这项分析为推特算法是否有利于低可信度内容的可见性提供了宝贵的观察证据，结果表明，在两个数据集中，包含低可信度 URL 域的推文的总体表现要好于不包含低可信度 URL 域的推文。然而，这种效果主要归因于高参与度、高关注度推文的差异，这些推文在产生印象方面非常有影响力，当包含低可信度内容时，更有可能获得更高的可见度。此外，毒性高的推文和带有右倾偏见的推文，以及来自已验证账户的低可信度推文也会被放大。这最终表明，Twitter 的推荐系统可能通过放大由非常有影响力的用户产生的高参与度的低可信度内容的可见度，促进了虚假内容的传播。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

摘要图片

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Evaluating Twitter’s algorithmic amplification of low-credibility content: an observational study

Artificial intelligence (AI)-powered recommender systems play a crucial role in determining the content that users are exposed to on social media platforms. However, the behavioural patterns of these systems are often opaque, complicating the evaluation of their impact on the dissemination and consumption of disinformation and misinformation. To begin addressing this evidence gap, this study presents a measurement approach that uses observed digital traces to infer the status of algorithmic amplification of low-credibility content on Twitter over a 14-day period in January 2023. Using an original dataset of ≈ 2.7 million posts on COVID-19 and climate change published on the platform, this study identifies tweets sharing information from low-credibility domains, and uses a bootstrapping model with two stratifications, a tweet’s engagement level and a user’s followers level, to compare any differences in impressions generated between low-credibility and high-credibility samples. Additional stratification variables of toxicity, political bias, and verified status are also examined. This analysis provides valuable observational evidence on whether the Twitter algorithm favours the visibility of low-credibility content, with results indicating that, on aggregate, tweets containing low-credibility URL domains perform better than tweets that do not across both datasets. However, this effect is largely attributable to a difference in high-engagement, high-followers tweets, which are very impactful in terms of impressions generation, and are more likely receive amplified visibility when containing low-credibility content. Furthermore, high toxicity tweets and those with right-leaning bias see heightened amplification, as do low-credibility tweets from verified accounts. Ultimately, this suggests that Twitter’s recommender system may have facilitated the diffusion of false content by amplifying the visibility of low-credibility content with high-engagement generated by very influential users.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

EPJ Data Science MATHEMATICS, INTERDISCIPLINARY APPLICATIONS -

CiteScore

6.10

自引率

5.60%

发文量

审稿时长

13 weeks

期刊介绍： EPJ Data Science covers a broad range of research areas and applications and particularly encourages contributions from techno-socio-economic systems, where it comprises those research lines that now regard the digital “tracks” of human beings as first-order objects for scientific investigation. Topics include, but are not limited to, human behavior, social interaction (including animal societies), economic and financial systems, management and business networks, socio-technical infrastructure, health and environmental systems, the science of science, as well as general risk and crisis scenario forecasting up to and including policy advice.