
Journal of Computational Social Science: Latest Articles

What motivated mitigation policies? A network-based longitudinal analysis of state-level mitigation strategies
IF 3.2 Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date : 2023-03-02 DOI: 10.1007/s42001-023-00214-x
W. Fries
Citations: 1
Tracking moral divergence with DDR in presidential debates over 60 years
IF 3.2 Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date : 2023-02-13 DOI: 10.1007/s42001-023-00198-8
Mengyao Xu, Lingshu Hu, G. Cameron
Citations: 2
Enhanced sentiment analysis regarding COVID-19 news from global channels.
IF 2 Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date : 2023-01-01 Epub Date: 2022-11-27 DOI: 10.1007/s42001-022-00189-1
Waseem Ahmad, Bang Wang, Philecia Martin, Minghua Xu, Han Xu

For a healthy society to exist, it is crucial for the media to focus on disease-related issues so that more people are widely aware of them and reduce health risks. Recently, deep neural networks have become a popular tool for textual sentiment analysis, which can provide valuable insights and real-time monitoring and analysis regarding health issues. In this paper, as part of an effort to develop an effective model that can elicit public sentiment on COVID-19 news, we propose a novel approach Cov-Att-BiLSTM for sentiment analysis of COVID-19 news headlines using deep neural networks. We integrate attention mechanisms, embedding techniques, and semantic level data labeling into the prediction process to enhance the accuracy. To evaluate the proposed approach, we compared it to several deep and machine learning classifiers using various metrics of categorization efficiency and prediction quality, and the experimental results demonstrate its superiority with 0.931 testing accuracy. Furthermore, 73,138 pandemic-related tweets posted on six global channels were analyzed by the proposed approach, which accurately reflects global coverage of COVID-19 news and vaccination.

Citations: 0
Varieties of corona news: a cross-national study on the foundations of online misinformation production during the COVID-19 pandemic.
IF 2 Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date : 2023-01-01 Epub Date: 2022-12-13 DOI: 10.1007/s42001-022-00193-5
Cantay Caliskan, Alaz Kilicaslan

Misinformation in the media is produced by hard-to-gauge thought mechanisms employed by individuals or collectivities. In this paper, we shed light on what the country-specific factors of falsehood production in the context of COVID-19 Pandemic might be. Collecting our evidence from the largest misinformation dataset used in the COVID-19 misinformation literature with close to 11,000 pieces of falsehood, we explore patterns of misinformation production by employing a variety of methodological tools including algorithms for text similarity, clustering, network distances, and other statistical tools. Covering news produced in a span of more than 14 months, our paper also differentiates itself by its use of carefully controlled hand-labeling of topics of falsehood. Findings suggest that country-level factors do not provide the strongest support for predicting outcomes of falsehood, except for one phenomenon: in countries with serious press freedom problems and low human development, the mostly unknown authors of misinformation tend to focus on similar content. In addition, the intensity of discussion on animals, predictions and symptoms as part of fake news is the biggest differentiator between nations; whereas news on conspiracies, medical equipment and risk factors offer the least explanation to differentiate. Based on those findings, we discuss some distinct public health and communication strategies to dispel misinformation in countries with particular characteristics. We also emphasize that a global action plan against misinformation is needed given the highly globalized nature of the online media environment.

Supplementary information: The online version contains supplementary material available at 10.1007/s42001-022-00193-5.

Citations: 0
An AI-based framework for studying visual diversity of urban neighborhoods and its relationship with socio-demographic variables.
IF 3.2 Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date : 2023-01-01 DOI: 10.1007/s42001-022-00197-1
Md Amiruzzaman, Ye Zhao, Stefanie Amiruzzaman, Aryn C Karpinski, Tsung Heng Wu

This study presents a framework to study quantitatively geographical visual diversities of urban neighborhood from a large collection of street-view images using an Artificial Intelligence (AI)-based image segmentation technique. A variety of diversity indices are computed from the extracted visual semantics. They are utilized to discover the relationships between urban visual appearance and socio-demographic variables. This study also validates the reliability of the method with human evaluators. The methodology and results obtained from this study can potentially be used to study urban features, locate houses, establish services, and better operate municipalities.
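The abstract above does not say which diversity indices were computed. As a rough illustration only, the sketch below assumes two common choices, the Shannon index and the Gini-Simpson index, applied to the share of pixels assigned to each segmentation class in one street-view image. The function names and the example class labels are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Return the share of each class among per-pixel (or per-segment) labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


def shannon_index(props):
    """Shannon diversity H = -sum(p * ln p); 0 when a single class is present."""
    return -sum(p * math.log(p) for p in props.values() if p > 0)


def simpson_index(props):
    """Gini-Simpson diversity 1 - sum(p^2); chance that two random pixels differ."""
    return 1.0 - sum(p * p for p in props.values())


if __name__ == "__main__":
    # Hypothetical labels for one image: an assumption made for illustration.
    labels = ["road"] * 40 + ["building"] * 30 + ["tree"] * 20 + ["sky"] * 10
    props = class_proportions(labels)
    print(round(shannon_index(props), 3), round(simpson_index(props), 3))
```

Per-image index values of this kind could then be aggregated by neighborhood and compared with socio-demographic variables, which is the type of relationship the abstract describes.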

Citations: 0
Estimating time-series changes in social sentiment @Twitter in U.S. metropolises during the COVID-19 pandemic.
IF 3.2 Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date : 2023-01-01 DOI: 10.1007/s42001-022-00186-4
Ryuichi Saito, Shinichiro Haruyama

Since early 2020, the global coronavirus pandemic has strained economic activities and traditional lifestyles. For such emergencies, our paper proposes a social sentiment estimation model that changes in response to infection conditions and state government orders. By designing mediation keywords that do not directly evoke coronavirus, it is possible to observe sentiment waveforms that vary as confirmed cases increase or decrease and as behavioral restrictions are ordered or lifted over a long period. The model demonstrates guaranteed performance with transformer-based neural network models and has been validated in New York City, Los Angeles, and Chicago, given that coronavirus infections explode in overcrowded cities. The time-series of the extracted social sentiment reflected the infection conditions of each city during the 2-year period from pre-pandemic to the new normal and shows a concurrency of waveforms common to the three cities. The methods of this paper could be applied not only to analysis of the COVID-19 pandemic but also to analyses of a wide range of emergencies and they could be a policy support tool that complements traditional surveys in the future.

Citations: 1
A comparison of approaches for imbalanced classification problems in the context of retrieving relevant documents for an analysis.
IF 3.2 Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date : 2023-01-01 DOI: 10.1007/s42001-022-00191-7
Sandra Wankmüller

One of the first steps in many text-based social science studies is to retrieve documents that are relevant for an analysis from large corpora of otherwise irrelevant documents. The conventional approach in social science to address this retrieval task is to apply a set of keywords and to consider those documents to be relevant that contain at least one of the keywords. But the application of incomplete keyword lists has a high risk of drawing biased inferences. More complex and costly methods such as query expansion techniques, topic model-based classification rules, and active as well as passive supervised learning could have the potential to more accurately separate relevant from irrelevant documents and thereby reduce the potential size of bias. Yet, whether applying these more expensive approaches increases retrieval performance compared to keyword lists at all, and if so, by how much, is unclear as a comparison of these approaches is lacking. This study closes this gap by comparing these methods across three retrieval tasks associated with a data set of German tweets (Linder in SSRN, 2017. 10.2139/ssrn.3026393), the Social Bias Inference Corpus (SBIC) (Sap et al. in Social bias frames: reasoning about social and power implications of language. In: Jurafsky et al. (eds) Proceedings of the 58th annual meeting of the association for computational linguistics. Association for Computational Linguistics, p 5477-5490, 2020. 10.18653/v1/2020.acl-main.486), and the Reuters-21578 corpus (Lewis in Reuters-21578 (Distribution 1.0). [Data set], 1997. http://www.daviddlewis.com/resources/testcollections/reuters21578/). Results show that query expansion techniques and topic model-based classification rules in most studied settings tend to decrease rather than increase retrieval performance. Active supervised learning, however, if applied on a not too small set of labeled training instances (e.g. 1000 documents), reaches a substantially higher retrieval performance than keyword lists.
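The keyword-list baseline described in the abstract above (a document counts as relevant if it contains at least one keyword) can be sketched in a few lines. This is a minimal illustration, not the study's implementation: the substring matching rule, the function names, and the toy documents and labels are all assumptions, and the abstract does not state which retrieval metrics were used; precision, recall, and F1 are shown here as common choices.

```python
def keyword_retrieve(docs, keywords):
    """Flag a document as relevant if it contains at least one keyword.

    Matching is a case-insensitive substring test (an assumption for this sketch).
    """
    lowered = [k.lower() for k in keywords]
    return [any(k in doc.lower() for k in lowered) for doc in docs]


def precision_recall_f1(predicted, relevant):
    """Compare predicted relevance flags against gold-standard labels."""
    tp = sum(1 for p, r in zip(predicted, relevant) if p and r)
    fp = sum(1 for p, r in zip(predicted, relevant) if p and not r)
    fn = sum(1 for p, r in zip(predicted, relevant) if not p and r)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy corpus and gold labels, invented purely for illustration.
    docs = [
        "Refugees arrive in Berlin",
        "Football results from Saturday",
        "Asylum policy debate continues",
        "Migration numbers rise again",
    ]
    gold = [True, False, True, True]
    flags = keyword_retrieve(docs, ["refugee", "asylum"])
    print(precision_recall_f1(flags, gold))
```

In this toy case the incomplete keyword list misses the fourth document, which lowers recall while precision stays perfect; that is the kind of selection bias the abstract warns about.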

Citations: 1
A scoping review on the use of natural language processing in research on political polarization: trends and research prospects.
IF 2 Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date : 2023-01-01 Epub Date: 2022-12-19 DOI: 10.1007/s42001-022-00196-2
Renáta Németh

As part of the "text-as-data" movement, Natural Language Processing (NLP) provides a computational way to examine political polarization. We conducted a methodological scoping review of studies published since 2010 (n = 154) to clarify how NLP research has conceptualized and measured political polarization, and to characterize the degree of integration of the two different research paradigms that meet in this research area. We identified biases toward US context (59%), Twitter data (43%) and machine learning approach (33%). Research covers different layers of the political public sphere (politicians, experts, media, or the lay public), however, very few studies involved more than one layer. Results indicate that only a few studies made use of domain knowledge and a high proportion of the studies were not interdisciplinary. Those studies that made efforts to interpret the results demonstrated that the characteristics of political texts depend not only on the political position of their authors, but also on other often-overlooked factors. Ignoring these factors may lead to overly optimistic performance measures. Also, spurious results may be obtained when causal relations are inferred from textual data. Our paper provides arguments for the integration of explanatory and predictive modeling paradigms, and for a more interdisciplinary approach to polarization research.

Supplementary information: The online version contains supplementary material available at 10.1007/s42001-022-00196-2.

Citations: 0
School dropout prediction and feature importance exploration in Malawi using household panel data: machine learning approach
IF 3.2 Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date : 2022-12-13 DOI: 10.1007/s42001-022-00195-3
Hazal Colak Oz, Çiçek Güven, Gonzalo Nápoles
Citations: 2
Evaluating algorithmic homeless service allocation
IF 3.2 Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date : 2022-12-10 DOI: 10.1007/s42001-022-00190-8
Wenting Qi, C. Chelmis
Citations: 0