
Sociological Methods & Research: Latest Articles

The Target Study: A Conceptual Model and Framework for Measuring Disparity
IF 6.3 Zone 2 (Sociology) Q1 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date: 2025-04-22 DOI: 10.1177/00491241251314037
John W. Jackson, Yea-Jen Hsu, Raquel C. Greer, Romsai T. Boonyasai, Chanelle J. Howe
We present a conceptual model to measure disparity—the target study—where social groups may be similarly situated (i.e., balanced) on allowable covariates. Our model, based on a sampling design, does not intervene to assign social group membership or alter allowable covariates. To address nonrandom sample selection, we extend our model to generalize or transport disparity or to assess disparity after an intervention on eligibility-related variables that eliminates forms of collider-stratification. To avoid bias from differential timing of enrollment, we aggregate time-specific study results by balancing calendar time of enrollment across social groups. To provide a framework for emulating our model, we discuss study designs, data structures, and G-computation and weighting estimators. We compare our sampling-based model to prominent decomposition-based models used in healthcare and algorithmic fairness. We provide R code for all estimators and apply our methods to measure health system disparities in hypertension control using electronic medical records.
Citations: 0
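As a rough illustration of what "similarly situated on allowable covariates" can mean in practice for the Target Study entry above, the sketch below standardizes each group's mean outcome to the reference group's distribution of a single discrete allowable covariate and then takes the difference. It is a textbook direct-standardization calculation on made-up records, not the authors' R code or estimators; the function name, the record layout, and the data are all hypothetical.

```python
from collections import defaultdict


def standardized_disparity(records, reference_group, comparison_group):
    """Difference in mean outcome (comparison minus reference) after both
    groups are standardized to the reference group's stratum distribution.

    records: iterable of (group, allowable_stratum, outcome) tuples.
    """
    # Stratum counts in the reference group define the standard population.
    ref_counts = defaultdict(int)
    for group, stratum, _ in records:
        if group == reference_group:
            ref_counts[stratum] += 1
    n_ref = sum(ref_counts.values())

    def standardized_mean(target_group):
        sums = defaultdict(float)
        counts = defaultdict(int)
        for group, stratum, outcome in records:
            if group == target_group:
                sums[stratum] += outcome
                counts[stratum] += 1
        total = 0.0
        for stratum, ref_count in ref_counts.items():
            if counts[stratum] == 0:
                # No overlap in this stratum: the standardized mean is undefined.
                raise ValueError(f"no {target_group} records in stratum {stratum!r}")
            total += (ref_count / n_ref) * (sums[stratum] / counts[stratum])
        return total

    return standardized_mean(comparison_group) - standardized_mean(reference_group)


# Hypothetical data: outcome 1 = condition controlled, 0 = not controlled.
records = (
    [("A", "low", y) for y in (1, 1, 0, 1)]
    + [("A", "high", y) for y in (1, 0)]
    + [("B", "low", y) for y in (1, 0)]
    + [("B", "high", y) for y in (0, 0, 1, 0)]
)
gap = standardized_disparity(records, reference_group="A", comparison_group="B")
print(round(gap, 3))
```

In this toy data the crude gap (2/6 minus 4/6) is larger in magnitude than the standardized gap, because group B is concentrated in the stratum with lower outcomes.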
Networks Beyond Categories: A Computational Approach to Examining Gender Homophily
IF 6.3 Zone 2 (Sociology) Q1 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date: 2025-04-22 DOI: 10.1177/00491241251321152
Chen-Shuo Hong
Social networks literature has explored homophily, the tendency to associate with similar others, as a critical boundary-making process contributing to segregated networks along the lines of identities. Yet, social network research generally conceptualizes identities as sociodemographic categories and seldom considers the inherently continuous and heterogeneous nature of differences. Drawing upon the infracategorical model of inequality, this study demonstrates that a computational approach – combining machine learning and exponential random graph models (ERGMs) – can capture the role of categorical conformity in network structures. Through a case study of gender segregation in friendships, this study presents a workflow for developing a machine-learning-based gender conformity measure and applying it to guide the social network analysis of cultural matching. Results show that adolescents with similar gender conformity are more likely to form friendships, net of homophily based on categorical gender and other controls, and homophily by gender conformity mediates homophily by categorical gender. The study concludes by discussing the limitations of this computational approach and its unique strengths in enhancing theories on categories, boundaries, and stratification.
Citations: 0
The Mixed Subjects Design: Treating Large Language Models as Potentially Informative Observations
IF 6.3 Zone 2 (Sociology) Q1 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date: 2025-04-22 DOI: 10.1177/00491241251326865
David Broska, Michael Howes, Austin van Loon
Large language models (LLMs) provide cost-effective but possibly inaccurate predictions of human behavior. Despite growing evidence that predicted and observed behavior are often not interchangeable, there is limited guidance on using LLMs to obtain valid estimates of causal effects and other parameters. We argue that LLM predictions should be treated as potentially informative observations, while human subjects serve as a gold standard in a mixed subjects design. This paradigm preserves validity and offers more precise estimates at a lower cost than experiments relying exclusively on human subjects. We demonstrate and extend prediction-powered inference (PPI), a method that combines predictions and observations. We define the PPI correlation as a measure of interchangeability and derive the effective sample size for PPI. We also introduce a power analysis to optimally choose between informative but costly human subjects and less informative but cheap predictions of human behavior. Mixed subjects designs could enhance scientific productivity and reduce inequality in access to costly evidence.
Citations: 0
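To make the idea of combining predictions with gold-standard observations in the Mixed Subjects Design entry above more concrete, here is a minimal sketch of the basic prediction-powered estimator of a mean: average the predictions on the large prediction-only sample, then add the average human-minus-prediction gap (the "rectifier") measured on the small human sample. This is the simple textbook form with a normal-approximation interval, not the extended version or the power analysis the abstract describes; the function name and all numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def ppi_mean(y_human, pred_on_human, pred_only, z=1.96):
    """Prediction-powered point estimate and normal-approximation CI for a mean.

    y_human:       outcomes observed from human subjects
    pred_on_human: model predictions for those same subjects
    pred_only:     model predictions for cases with no human outcome
    """
    # The rectifier measures how far predictions are off on the human sample.
    rectifier = [y - f for y, f in zip(y_human, pred_on_human)]
    estimate = mean(pred_only) + mean(rectifier)
    # The two samples are treated as independent, so their variances add.
    se = sqrt(variance(pred_only) / len(pred_only)
              + variance(rectifier) / len(rectifier))
    return estimate, (estimate - z * se, estimate + z * se)


# Hypothetical toy data.
y_human = [1, 0, 1, 1]
pred_on_human = [0.8, 0.2, 0.6, 0.9]
pred_only = [0.5, 0.7, 0.6, 0.4, 0.8, 0.6]

estimate, (lower, upper) = ppi_mean(y_human, pred_on_human, pred_only)
print(round(estimate, 3), round(lower, 3), round(upper, 3))
```

The closer the predictions track the human outcomes, the smaller the rectifier's variance, and the more the cheap predictions tighten the interval.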
Social Mobility as Causal Intervention
IF 6.3 Zone 2 (Sociology) Q1 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date: 2025-04-21 DOI: 10.1177/00491241251320963
Lai Wei, Yu Xie
The study of mobility effects is an important subject of study in sociology. Empirical investigations of individual mobility effects, however, have been hindered by one fundamental limitation, the unidentifiability of mobility effects when origin and destination are held constant. Given this fundamental limitation, we propose to reconceptualize mobility effects from the micro- to macro-level. Instead of micro-level mobility effects, the primary focus of the past literature, we ask alternative research questions about macro-level mobility effects: What happens to the population distribution of an outcome if we manipulate the mobility regime, that is, if we alter the observed association between social origin and social destination? We relate individual-level mobility experience to macro-level mobility effects under special interventions. The proposed method bridges the macro and micro agendas in social stratification research, and has wider applications in social stratification beyond the study of mobility effects. We illustrate the method with two analyses that evaluate the impact of social mobility on average fertility and income inequality in the United States. We provide an open-source software, the R package socmob, that implements the method.
Citations: 0
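The macro-level question in the Social Mobility entry above (what happens to a population outcome if the origin-destination association is altered) can be illustrated with a deliberately simplified calculation. The sketch below recomputes a population mean after replacing the observed transition table with one in which destination is independent of origin, holding the origin shares, the destination marginal, and the origin-by-destination cell means fixed. Holding cell means fixed is a strong assumption made here only for illustration; this is not the authors' estimator and not the socmob API, and every name and number is hypothetical.

```python
def mean_under_regime(origin_share, transition, cell_mean):
    """Population mean of the outcome implied by a mobility regime.

    origin_share: {origin: P(origin)}
    transition:   {origin: {destination: P(destination | origin)}}
    cell_mean:    {(origin, destination): mean outcome in that cell}
    """
    return sum(
        origin_share[o] * p * cell_mean[(o, d)]
        for o, row in transition.items()
        for d, p in row.items()
    )


def independence_regime(origin_share, transition):
    """Transition table where destination does not depend on origin,
    keeping the observed destination marginal."""
    destination_share = {}
    for o, row in transition.items():
        for d, p in row.items():
            destination_share[d] = destination_share.get(d, 0.0) + origin_share[o] * p
    return {o: dict(destination_share) for o in transition}


# Hypothetical two-class example; the outcome could be, say, number of children.
origin_share = {"low": 0.6, "high": 0.4}
observed = {
    "low": {"low": 0.7, "high": 0.3},
    "high": {"low": 0.2, "high": 0.8},
}
cell_mean = {
    ("low", "low"): 2.0, ("low", "high"): 1.5,
    ("high", "low"): 1.8, ("high", "high"): 1.2,
}

observed_mean = mean_under_regime(origin_share, observed, cell_mean)
counterfactual_mean = mean_under_regime(
    origin_share, independence_regime(origin_share, observed), cell_mean
)
print(round(observed_mean, 3), round(counterfactual_mean, 3))
```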
Correcting the Measurement Errors of AI-Assisted Labeling in Image Analysis Using Design-Based Supervised Learning
IF 6.3 Zone 2 (Sociology) Q1 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date: 2025-04-21 DOI: 10.1177/00491241251333372
Alessandra Rister Portinari Maranca, Jihoon Chung, Musashi Hinck, Adam D. Wolsky, Naoki Egami, Brandon M. Stewart
Generative artificial intelligence (AI) has shown incredible leaps in performance across data of a variety of modalities including texts, images, audio, and videos. This affords social scientists the ability to annotate variables of interest from unstructured media. While rapidly improving, these methods are far from perfect and, as we show, even ignoring the small amounts of error in high accuracy systems can lead to substantial bias and invalid confidence intervals in downstream analysis. We review how using design-based supervised learning (DSL) guarantees asymptotic unbiasedness and proper confidence interval coverage by making use of a small number of expert annotations. While originally developed for use with large language models in text, we present a series of applications in the context of image analysis, including an investigation of visual predictors of the perceived level of violence in protest images, an analysis of the images shared in the Black Lives Matter movement on Twitter, and a study of U.S. outlets reporting of immigrant caravans. These applications are representative of the type of analysis performed in the visual social science landscape today, and our analyses will exemplify how DSL helps us attain statistical guarantees while using automated methods to reduce human labor.
Citations: 0
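The entry above says that a small number of expert annotations can correct errors in automated labels. One way to see why is a design-adjusted pseudo-outcome: keep the automated label everywhere, and for the items that were randomly sent to experts, subtract the label's error scaled up by the inverse of the known sampling probability. The sketch below applies this to a simple mean. The exact form used here is an assumption for illustration and should not be read as the authors' implementation; the function name and data are hypothetical.

```python
def design_adjusted_outcomes(predicted, expert, sampled, sampling_prob):
    """Pseudo-outcomes that correct automated labels with expert labels.

    predicted:     automated labels for every item
    expert:        expert labels (None where no expert label was collected)
    sampled:       True where the item was randomly selected for expert coding
    sampling_prob: known probability that each item was selected
    """
    adjusted = []
    for p, y, r, pi in zip(predicted, expert, sampled, sampling_prob):
        if r:
            # Inverse-probability-weighted correction for the automated label's error.
            adjusted.append(p - (p - y) / pi)
        else:
            adjusted.append(float(p))
    return adjusted


# Hypothetical data: four images, two of them expert-coded with probability 0.5.
predicted = [1, 0, 1, 1]
expert = [1, None, 0, None]
sampled = [True, False, True, False]
sampling_prob = [0.5, 0.5, 0.5, 0.5]

adjusted = design_adjusted_outcomes(predicted, expert, sampled, sampling_prob)
naive_mean = sum(predicted) / len(predicted)
corrected_mean = sum(adjusted) / len(adjusted)
print(naive_mean, corrected_mean)
```

Individual pseudo-outcomes can fall outside the original label range (here one equals -1.0); only their average is meant to be interpreted.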
Machine Bias. How Do Generative Language Models Answer Opinion Polls?
IF 6.3 Zone 2 (Sociology) Q1 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date: 2025-04-21 DOI: 10.1177/00491241251330582
Julien Boelaert, Samuel Coavoux, Étienne Ollion, Ivaylo Petev, Patrick Präg
Generative artificial intelligence (AI) is increasingly presented as a potential substitute for humans, including as research subjects. However, there is no scientific consensus on how closely these in silico clones can emulate survey respondents. While some defend the use of these “synthetic users,” others point toward social biases in the responses provided by large language models (LLMs). In this article, we demonstrate that these critics are right to be wary of using generative AI to emulate respondents, but probably not for the right reasons. Our results show (i) that to date, models cannot replace research subjects for opinion or attitudinal research; (ii) that they display a strong bias and a low variance on each topic; and (iii) that this bias randomly varies from one topic to the next. We label this pattern “machine bias,” a concept we define, and whose consequences for LLM-based research we further explore.
Citations: 0
The Insight-Inference Loop: Efficient Text Classification via Natural Language Inference and Threshold-Tuning
IF 6.3 Zone 2 (Sociology) Q1 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date: 2025-04-19 DOI: 10.1177/00491241251326819
Sandrine Chausson, Marion Fourcade, David J. Harding, Björn Ross, Grégory Renard
Modern computational text classification methods have brought social scientists tantalizingly close to the goal of unlocking vast insights buried in text data—from centuries of historical documents to streams of social media posts. Yet three barriers still stand in the way: the tedious labor of manual text annotation, the technical complexity that keeps these tools out of reach for many researchers, and, perhaps most critically, the challenge of bridging the gap between sophisticated algorithms and the deep theoretical understanding social scientists have already developed about human interactions, social structures, and institutions. To counter these limitations, we propose an approach to large-scale text analysis that requires substantially less human-labeled data, and no machine learning expertise, and efficiently integrates the social scientist into critical steps in the workflow. This approach, which allows the detection of statements in text, relies on large language models pre-trained for natural language inference, and a “few-shot” threshold-tuning algorithm rooted in active learning principles. We describe and showcase our approach by analyzing tweets collected during the 2020 U.S. presidential election campaign, and benchmark it against various computational approaches across three datasets.
Citations: 0
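The threshold-tuning step mentioned in the entry above can be pictured with a generic example: given entailment scores from some natural language inference model and a handful of human labels, pick the score cutoff that maximizes F1 on those labels. This is a plain grid search over the observed scores, not the authors' active-learning algorithm; the scores, labels, and function names are hypothetical.

```python
def f1_at(threshold, scores, labels):
    """F1 of the rule 'predict positive when score >= threshold'."""
    tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l)
    fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and not l)
    fn = sum(1 for s, l in zip(scores, labels) if s < threshold and l)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def tune_threshold(scores, labels):
    """Return the observed score that maximizes F1 (lowest such score on ties)."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f1_at(t, scores, labels))


# Hypothetical entailment scores for five texts, with human labels
# indicating whether the target statement is actually present.
scores = [0.95, 0.80, 0.60, 0.40, 0.20]
labels = [True, True, False, True, False]

best = tune_threshold(scores, labels)
print(best, round(f1_at(best, scores, labels), 3))
```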
Locating Cultural Holes Brokers in Diffusion Dynamics Across Bright Symbolic Boundaries
IF 6.3 Zone 2 (Sociology) Q1 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date: 2025-03-19 DOI: 10.1177/00491241251322517
Diego F. Leal
Although the literature on cultural holes has expanded considerably in recent years, there is no concrete measure in that literature to locate cultural holes brokers. This article develops a conceptual framework grounded in social network theory and cultural sociology to propose a specific solution to fill this measurement gap. Agent-based computational experiments are leveraged to develop a theoretical test of the analytic purchase and distinctiveness of the proposed measure, termed potential for intercultural brokerage (PIB). Results demonstrate the effectiveness of PIB in locating early adopters that can achieve widespread levels of diffusion in societies segregated along bright symbolic boundaries. Findings also show the superiority of PIB when compared to classic alternative measures in the network literature that focus on locating early adopters based on structural holes (e.g., network constraint, effective size), geodesics (e.g., betweenness centrality), and degree (e.g., degree centrality), among other classic network measures. Broader implications of these findings for brokerage theory are discussed herein.
Citations: 0
Deep Learning With DAGs
IF 6.3 Zone 2 (Sociology) Q1 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date: 2025-03-12 DOI: 10.1177/00491241251319291
Sourabh Balgi, Adel Daoud, Jose M. Peña, Geoffrey T. Wodtke, Jesse Zhou
Social science theories often postulate systems of causal relationships among variables, which are commonly represented using directed acyclic graphs (DAGs). As non-parametric causal models, DAGs require no assumptions about the functional form of the hypothesized relationships. Nevertheless, to simplify empirical evaluation, researchers typically invoke such assumptions anyway, even though they are often arbitrary and do not reflect any theoretical content or prior knowledge. Moreover, functional form assumptions can engender bias, whenever they fail to accurately capture the true complexity of the system. In this article, we introduce causal-graphical normalizing flows (cGNFs), a novel approach to causal inference that leverages deep neural networks to empirically evaluate theories represented as DAGs. Unlike conventional methods, cGNFs model the full joint distribution of the data using a DAG specified by the analyst, without relying on stringent assumptions about functional form. This enables flexible, non-parametric estimation of any causal estimand identified from the DAG, including total effects, direct and indirect effects, and path-specific effects. We illustrate the method with a reanalysis of Blau and Duncan’s (1967) model of status attainment and Zhou’s (2019) model of controlled mobility. The article concludes with a discussion of current limitations and directions for future development.
Citations: 0
When to Use Counterfactuals in Causal Historiography: Methods for Semantics and Inference
IF 6.3 Zone 2 Sociology Q1 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date: 2025-02-03 DOI: 10.1177/00491241251314039
Tay Jeong
According to the interventionist framework of actual causality, causal claims in history are ultimately claims about special types of functional dependencies between variables, which consist not only of actual events but also of corresponding counterfactual states of affairs. Instead of advocating the methodological use of counterfactuals tout court, we propose specific circumstances in historical writing where counterfactual reasoning comes in most handy. At the level of semantics, that is, the specification of the variables and their possible values, an explicit specification of the latent contrast classes becomes particularly useful in situations where one may be prompted to take an event that is pre-empted by the antecedent of interest as its proper causal contrast. At the level of inference, we argue that cases in which two or more antecedents appear to be playing a similar role tend to fumble our pretheoretical intuition about cause and propose a sequence of counterfactual tests based on actual examples from causal historiography.
Citations: 0