
Political Analysis: Latest Articles

Relatio: Text Semantics Capture Political and Economic Narratives – ERRATUM
Zone 2 (Sociology) Q1 POLITICAL SCIENCE Pub Date: 2023-05-19 DOI: 10.1017/pan.2023.15
Elliott Ash, Germain Gauthier, Philine Widmer
An abstract is not available for this content.
Citations: 0
The Role of Majority Status in Close Election Studies
Zone 2 (Sociology) Q1 POLITICAL SCIENCE Pub Date: 2023-04-18 DOI: 10.1017/pan.2023.14
Matteo Alpino, Marta Crispino
Many studies exploit close elections in a regression discontinuity framework to identify partisan effects, that is, the effect of having a given party in office on some outcome. We argue that, when conducted on single-member districts, such a design may identify a compound effect: the partisan effect, plus the majority status effect, that is, the effect of being represented by a member of the legislative majority. We provide a simple strategy to disentangle the two, and test it with simulations. Finally, we show the empirical relevance of this issue using real data.
Citations: 0
Simulating Party Shares
IF 5.4 Zone 2 (Sociology) Q1 POLITICAL SCIENCE Pub Date: 2023-04-11 DOI: 10.1017/pan.2023.13
D. Cohen, Chris Hanretty
We tackle the problem of simulating seat- and vote-shares for a party system of a given size. We show how these shares can be generated using unordered and ordered Dirichlet distributions. We show that a distribution with a mean vector given by the rule described in Taagepera and Allik (2006, Electoral Studies 25, 696–713) fits real-world data almost as well as a saturated model where there is a parameter for each rank/system size combination.
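As a rough illustration of generating shares from a Dirichlet distribution, the sketch below draws one share vector for a party system of a given size. It is not the authors' code: the function name `simulate_shares`, the symmetric concentration parameter `alpha`, and the use of sorting as a stand-in for an ordered draw are all assumptions made here for illustration. The mean vector from Taagepera and Allik (2006) is not reproduced.

```python
import random

def simulate_shares(n_parties, alpha=1.0, ordered=False, rng=None):
    """Draw one vector of party shares from a symmetric Dirichlet.

    alpha is a hypothetical concentration parameter chosen for illustration.
    """
    if rng is None:
        rng = random.Random()
    # Normalizing independent Gamma(alpha, 1) draws yields a
    # Dirichlet(alpha, ..., alpha) sample.
    gammas = [rng.gammavariate(alpha, 1.0) for _ in range(n_parties)]
    total = sum(gammas)
    shares = [g / total for g in gammas]
    # Sorting (largest party first) is a simple stand-in for an ordered draw.
    return sorted(shares, reverse=True) if ordered else shares

shares = simulate_shares(5, ordered=True, rng=random.Random(42))
print([round(s, 3) for s in shares])
```

Larger values of `alpha` pull the simulated shares toward equality, while values below 1 produce more lopsided party systems.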
Citations: 0
PAN volume 31 issue 2 Cover and Front matter
IF 5.4 Zone 2 (Sociology) Q1 POLITICAL SCIENCE Pub Date: 2023-04-01 DOI: 10.1017/pan.2023.10
Citations: 0
PAN volume 31 issue 2 Cover and Back matter
IF 5.4 Zone 2 (Sociology) Q1 POLITICAL SCIENCE Pub Date: 2023-04-01 DOI: 10.1017/pan.2023.11
Citations: 0
It’s All in the Name: A Character-Based Approach to Infer Religion
Zone 2 (Sociology) Q1 POLITICAL SCIENCE Pub Date: 2023-03-23 DOI: 10.1017/pan.2023.6
Rochana Chaturvedi, Sugat Chaturvedi
Large-scale microdata on group identity are critical for studies on identity politics and violence but remain largely unavailable for developing countries. We use personal names to infer religion in South Asia, where religion is a salient social division and yet disaggregated data on it are scarce. Existing work predicts religion using a dictionary-based method and, therefore, cannot classify unseen names. We provide character-based machine-learning models that can also classify unseen names with high accuracy. Our models are also much faster and, hence, scalable to large datasets. We explain the classification decisions of one of our models using the layer-wise relevance propagation technique. The character patterns learned by the classifier are rooted in the linguistic origins of names. We apply these to infer the religion of electoral candidates using historical data on Indian elections and observe a trend of declining Muslim representation. Our approach can be used to detect identity groups across the world for whom the underlying names might have different linguistic roots.
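To see why character-level features can handle names that a dictionary lookup has never seen, the sketch below extracts character n-grams and lists those shared by two names. It is only a generic illustration, not the models from the article: the helper names `char_ngrams` and `shared_ngrams`, the `^` and `$` boundary markers, and the example names are assumptions made here.

```python
def char_ngrams(name, n=3):
    """Return the character n-grams of a name, with boundary markers."""
    padded = "^" + name.lower() + "$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def shared_ngrams(a, b, n=3):
    """Return the sorted n-grams that two names have in common."""
    return sorted(set(char_ngrams(a, n)) & set(char_ngrams(b, n)))

# A name absent from a lookup table still shares sub-word features
# with names that a classifier has already seen.
print(char_ngrams("Ali"))
print(shared_ngrams("Ramesh", "Rameshwar"))
```

A classifier trained on such features sees overlapping inputs for related names, whereas an exact-match dictionary treats every unseen spelling as unknown.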
Citations: 1
Integrating Data Across Misaligned Spatial Units
IF 5.4 Zone 2 (Sociology) Q1 POLITICAL SCIENCE Pub Date: 2023-03-23 DOI: 10.1017/pan.2023.5
Y. Zhukov, Jason S. Byers, Marty Davidson, Ken Kollman
Theoretical units of interest often do not align with the spatial units at which data are available. This problem is pervasive in political science, particularly in subnational empirical research that requires integrating data across incompatible geographic units (e.g., administrative areas, electoral constituencies, and grid cells). Overcoming this challenge requires researchers not only to align the scale of empirical and theoretical units, but also to understand the consequences of this change of support for measurement error and statistical inference. We show how the accuracy of transformed values and the estimation of regression coefficients depend on the degree of nesting (i.e., whether units fall completely and neatly inside each other) and on the relative scale of source and destination units (i.e., aggregation, disaggregation, and hybrid). We introduce simple, nonparametric measures of relative nesting and scale, as ex ante indicators of spatial transformation complexity and error susceptibility. Using election data and Monte Carlo simulations, we show that these measures are strongly predictive of transformation quality across multiple change-of-support methods. We propose several validation procedures and provide open-source software to make transformation options more accessible, customizable, and intuitive.
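One common change-of-support method is area-weighted interpolation, sketched below for a count variable such as votes. This is a generic illustration and not the authors' software. The function name `area_weighted_transfer`, the input layout, and the assumption that each source unit is fully covered by the destination units (so its area equals the sum of its overlaps) are all choices made here.

```python
def area_weighted_transfer(source_values, overlaps):
    """Reallocate counts from source units to destination units.

    source_values: {source_id: count}
    overlaps: {(source_id, dest_id): intersection_area}
    Assumes every source unit is fully covered by destination units.
    """
    # Total area of each source unit, recovered from its overlaps.
    source_area = {}
    for (src, _), area in overlaps.items():
        source_area[src] = source_area.get(src, 0.0) + area

    # Each destination receives a share of the source count
    # proportional to the overlapping area.
    dest_values = {}
    for (src, dst), area in overlaps.items():
        share = area / source_area[src]
        dest_values[dst] = dest_values.get(dst, 0.0) + source_values[src] * share
    return dest_values

# Hypothetical example: unit A is split 3:1 between X and Y; unit B lies inside Y.
votes = {"A": 100, "B": 40}
overlaps = {("A", "X"): 3.0, ("A", "Y"): 1.0, ("B", "Y"): 2.0}
print(area_weighted_transfer(votes, overlaps))
```

In this example unit B is perfectly nested in Y and transfers without error, while unit A's split relies on the assumption that votes are spread evenly over its area. That is the kind of error susceptibility the abstract ties to the degree of nesting.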
Citations: 1
Introducing an Interpretable Deep Learning Approach to Domain-Specific Dictionary Creation: A Use Case for Conflict Prediction
IF 5.4 Zone 2 (Sociology) Q1 POLITICAL SCIENCE Pub Date: 2023-03-22 DOI: 10.1017/pan.2023.7
Sonja Häffner, Martin Hofer, Maximilian Nagl, Julian Walterskirchen
Recent advancements in natural language processing (NLP) methods have significantly improved their performance. However, more complex NLP models are more difficult to interpret and computationally expensive. Therefore, we propose an approach to dictionary creation that carefully balances the trade-off between complexity and interpretability. This approach combines a deep neural network architecture with techniques to improve model explainability to automatically build a domain-specific dictionary. As an illustrative use case of our approach, we create an objective dictionary that can infer conflict intensity from text data. We train the neural networks on a corpus of conflict reports and match them with conflict event data. This corpus consists of over 14,000 expert-written International Crisis Group (ICG) CrisisWatch reports between 2003 and 2021. Sensitivity analysis is used to extract the weighted words from the neural network to build the dictionary. In order to evaluate our approach, we compare our results to state-of-the-art deep learning language models, text-scaling methods, as well as standard, nonspecialized, and conflict event dictionary approaches. We are able to show that our approach outperforms other approaches while retaining interpretability.
Citations: 4
Mapping Literature with Networks: An Application to Redistricting
IF 5.4 Zone 2 (Sociology) Q1 POLITICAL SCIENCE Pub Date: 2023-03-21 DOI: 10.1017/pan.2023.4
Adeline Lo, Devin Judge-Lord, Kyler Hudson, Kenneth R. Mayer
Understanding the gaps and connections across existing theories and findings is a perennial challenge in scientific research. Systematically reviewing scholarship is especially challenging for researchers who may lack domain expertise, including junior scholars or those exploring new substantive territory. Conversely, senior scholars may rely on long-standing assumptions and social networks that exclude new research. In both cases, ad hoc literature reviews hinder accumulation of knowledge. Scholars are rarely systematic in selecting relevant prior work or then identifying patterns across their sample. To encourage systematic, replicable, and transparent methods for assessing literature, we propose an accessible network-based framework for reviewing scholarship. In our method, we consider a literature as a network of recurring concepts (nodes) and theorized relationships among them (edges). Network statistics and visualization allow researchers to see patterns and offer reproducible characterizations of assertions about the major themes in existing literature. Critically, our approach is systematic and powerful but also low cost; it requires researchers to enter relationships they observe in prior studies into a simple spreadsheet, a task accessible to new and experienced researchers alike. Our open-source R package enables researchers to leverage powerful network analysis while minimizing software-specific knowledge. We demonstrate this approach by reviewing redistricting literature.
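The idea of treating a literature as a network can be sketched in a few lines of Python (the article's own tooling is an R package, which is not reproduced here). The rows below stand in for spreadsheet entries of the form (concept, related concept, study). The concept labels, study labels, and the function name `build_network` are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical spreadsheet rows: (source concept, target concept, study)
rows = [
    ("redistricting commission", "district compactness", "Study A"),
    ("district compactness", "electoral competition", "Study B"),
    ("redistricting commission", "electoral competition", "Study C"),
    ("redistricting commission", "electoral competition", "Study D"),
    ("partisan gerrymandering", "electoral competition", "Study E"),
]

def build_network(rows):
    """Collapse rows into edges (with supporting studies) and node degrees."""
    edges = defaultdict(set)
    for source, target, study in rows:
        edges[(source, target)].add(study)
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return dict(edges), degree

edges, degree = build_network(rows)
for concept, d in degree.most_common():
    print(d, concept)
```

Node degree is only one of many network statistics. The count of studies behind each edge shows which theorized relationships are heavily examined and which rest on a single paper.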
Citations: 0
Topic Classification for Political Texts with Pretrained Language Models
IF 5.4 Zone 2 (Sociology) Q1 POLITICAL SCIENCE Pub Date: 2023-03-08 DOI: 10.1017/pan.2023.3
Yu Wang
Supervised topic classification requires labeled data. This often becomes a bottleneck as high-quality labeled data are expensive to acquire. To overcome the data scarcity problem, scholars have recently proposed to use cross-domain topic classification to take advantage of preexisting labeled datasets. Cross-domain topic classification only requires limited annotation in the target domain to verify its cross-domain accuracy. In this letter, we propose supervised topic classification with pretrained language models as an alternative. We show that language models fine-tuned with 70% of the small annotated dataset in the target corpus could outperform models trained using large cross-domain datasets by 27% and that models fine-tuned with 10% of the annotated dataset could already outperform the cross-domain classifiers. Our models are competitive in terms of training time and inference time. Researchers interested in supervised learning with limited labeled data should find our results useful. Our code and data are publicly available.
Citations: 0