{"title":"<scp>Relatio</scp>: Text Semantics Capture Political and Economic Narratives – ERRATUM","authors":"Elliott Ash, Germain Gauthier, Philine Widmer","doi":"10.1017/pan.2023.15","DOIUrl":"https://doi.org/10.1017/pan.2023.15","url":null,"abstract":"An abstract is not available for this content. As you have access to this content, full HTML content is provided on this page. A PDF of this content is also available in through the ‘Save PDF’ action button.","PeriodicalId":48270,"journal":{"name":"Political Analysis","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135626082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Many studies exploit close elections in a regression discontinuity framework to identify partisan effects, that is, the effect of having a given party in office on some outcome. We argue that, when conducted on single-member districts, such a design may identify a compound effect: the partisan effect plus the majority status effect, that is, the effect of being represented by a member of the legislative majority. We provide a simple strategy to disentangle the two and test it with simulations. Finally, we show the empirical relevance of this issue using real data.
{"title":"The Role of Majority Status in Close Election Studies","authors":"Matteo Alpino, Marta Crispino","doi":"10.1017/pan.2023.14","DOIUrl":"https://doi.org/10.1017/pan.2023.14","url":null,"abstract":"Abstract Many studies exploit close elections in a regression discontinuity framework to identify partisan effects, that is, the effect of having a given party in office on some outcome. We argue that, when conducted on single-member districts, such design may identify a compound effect: the partisan effect, plus the majority status effect, that is, the effect of being represented by a member of the legislative majority. We provide a simple strategy to disentangle the two, and test it with simulations. Finally, we show the empirical relevance of this issue using real data.","PeriodicalId":48270,"journal":{"name":"Political Analysis","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135927278","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We tackle the problem of simulating seat- and vote-shares for a party system of a given size. We show how these shares can be generated using unordered and ordered Dirichlet distributions. We show that a distribution with a mean vector given by the rule described in Taagepera and Allik (2006, Electoral Studies 25, 696–713) fits real-world data almost as well as a saturated model where there is a parameter for each rank/system size combination.
{"title":"Simulating Party Shares","authors":"D. Cohen, Chris Hanretty","doi":"10.1017/pan.2023.13","DOIUrl":"https://doi.org/10.1017/pan.2023.13","url":null,"abstract":"We tackle the problem of simulating seat- and vote-shares for a party system of a given size. We show how these shares can be generated using unordered and ordered Dirichlet distributions. We show that a distribution with a mean vector given by the rule described in Taagepera and Allik (2006, Electoral Studies 25, 696–713) fits real-world data almost as well as a saturated model where there is a parameter for each rank/system size combination.","PeriodicalId":48270,"journal":{"name":"Political Analysis","volume":" ","pages":""},"PeriodicalIF":5.4,"publicationDate":"2023-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44901798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
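As a rough illustration of the simulation problem described in the "Simulating Party Shares" abstract, the following Python sketch draws share vectors from a symmetric Dirichlet distribution and sorts each draw to obtain ranked shares. The concentration value, the function names, and the use of sorting to produce ordered shares are assumptions made for illustration. They are not taken from the paper, and the Taagepera and Allik mean rule is not implemented here.

```python
import random


def draw_shares(alphas, rng):
    """Draw one share vector from a Dirichlet distribution.

    A Dirichlet draw can be built by normalising independent
    Gamma(alpha_i, 1) variates so that they sum to one.
    """
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]


def draw_ranked_shares(alphas, rng):
    """Draw shares and sort them from the largest to the smallest party."""
    return sorted(draw_shares(alphas, rng), reverse=True)


# Hypothetical setup: a four-party system with symmetric concentration 2.0.
rng = random.Random(42)
alphas = [2.0] * 4
simulated = [draw_ranked_shares(alphas, rng) for _ in range(1000)]

# Average share by rank across the simulated party systems.
mean_by_rank = [
    sum(draw[k] for draw in simulated) / len(simulated) for k in range(4)
]
print([round(m, 3) for m in mean_by_rank])
```

Each simulated vector sums to one, and the mean share falls with rank because every draw is sorted before averaging.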
Large-scale microdata on group identity are critical for studies on identity politics and violence but remain largely unavailable for developing countries. We use personal names to infer religion in South Asia, where religion is a salient social division and yet disaggregated data on it are scarce. Existing work predicts religion using a dictionary-based method and, therefore, cannot classify unseen names. We provide character-based machine-learning models that can also classify unseen names with high accuracy. Our models are also much faster and, hence, scalable to large datasets. We explain the classification decisions of one of our models using the layer-wise relevance propagation technique. The character patterns learned by the classifier are rooted in the linguistic origins of names. We apply these models to infer the religion of electoral candidates using historical data on Indian elections and observe a trend of declining Muslim representation. Our approach can be used to detect identity groups across the world for whom the underlying names might have different linguistic roots.
{"title":"It’s All in the Name: A Character-Based Approach to Infer Religion","authors":"Rochana Chaturvedi, Sugat Chaturvedi","doi":"10.1017/pan.2023.6","DOIUrl":"https://doi.org/10.1017/pan.2023.6","url":null,"abstract":"Abstract Large-scale microdata on group identity are critical for studies on identity politics and violence but remain largely unavailable for developing countries. We use personal names to infer religion in South Asia—where religion is a salient social division, and yet, disaggregated data on it are scarce. Existing work predicts religion using a dictionary-based method and, therefore, cannot classify unseen names. We provide character-based machine-learning models that can classify unseen names too with high accuracy. Our models are also much faster and, hence, scalable to large datasets. We explain the classification decisions of one of our models using the layer-wise relevance propagation technique. The character patterns learned by the classifier are rooted in the linguistic origins of names. We apply these to infer the religion of electoral candidates using historical data on Indian elections and observe a trend of declining Muslim representation. Our approach can be used to detect identity groups across the world for whom the underlying names might have different linguistic roots.","PeriodicalId":48270,"journal":{"name":"Political Analysis","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136151852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Y. Zhukov, Jason S. Byers, Marty Davidson, Ken Kollman
Theoretical units of interest often do not align with the spatial units at which data are available. This problem is pervasive in political science, particularly in subnational empirical research that requires integrating data across incompatible geographic units (e.g., administrative areas, electoral constituencies, and grid cells). Overcoming this challenge requires researchers not only to align the scale of empirical and theoretical units, but also to understand the consequences of this change of support for measurement error and statistical inference. We show how the accuracy of transformed values and the estimation of regression coefficients depend on the degree of nesting (i.e., whether units fall completely and neatly inside each other) and on the relative scale of source and destination units (i.e., aggregation, disaggregation, and hybrid). We introduce simple, nonparametric measures of relative nesting and scale, as ex ante indicators of spatial transformation complexity and error susceptibility. Using election data and Monte Carlo simulations, we show that these measures are strongly predictive of transformation quality across multiple change-of-support methods. We propose several validation procedures and provide open-source software to make transformation options more accessible, customizable, and intuitive.
{"title":"Integrating Data Across Misaligned Spatial Units","authors":"Y. Zhukov, Jason S. Byers, Marty Davidson, Ken Kollman","doi":"10.1017/pan.2023.5","DOIUrl":"https://doi.org/10.1017/pan.2023.5","url":null,"abstract":"Theoretical units of interest often do not align with the spatial units at which data are available. This problem is pervasive in political science, particularly in subnational empirical research that requires integrating data across incompatible geographic units (e.g., administrative areas, electoral constituencies, and grid cells). Overcoming this challenge requires researchers not only to align the scale of empirical and theoretical units, but also to understand the consequences of this change of support for measurement error and statistical inference. We show how the accuracy of transformed values and the estimation of regression coefficients depend on the degree of nesting (i.e., whether units fall completely and neatly inside each other) and on the relative scale of source and destination units (i.e., aggregation, disaggregation, and hybrid). We introduce simple, nonparametric measures of relative nesting and scale, as ex ante indicators of spatial transformation complexity and error susceptibility. Using election data and Monte Carlo simulations, we show that these measures are strongly predictive of transformation quality across multiple change-of-support methods. We propose several validation procedures and provide open-source software to make transformation options more accessible, customizable, and intuitive.","PeriodicalId":48270,"journal":{"name":"Political Analysis","volume":" ","pages":""},"PeriodicalIF":5.4,"publicationDate":"2023-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41679125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
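To make the change-of-support idea in the "Integrating Data Across Misaligned Spatial Units" abstract concrete, the Python sketch below uses a toy one-dimensional setting in which units are intervals. It reallocates counts from source to destination units in proportion to overlap (simple areal weighting) and reports the fraction of source units that sit entirely inside a single destination unit. The interval data, the function names, and the crude nesting fraction are all assumptions for illustration. The nesting fraction is not the measure proposed by the authors, and this is not their software.

```python
def overlap(a, b):
    """Length of the intersection of two intervals given as (start, end)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def areal_weighting(source, destination):
    """Reallocate source counts to destination units by overlap share.

    source: list of ((start, end), count)
    destination: list of (start, end)
    """
    result = [0.0] * len(destination)
    for interval, count in source:
        length = interval[1] - interval[0]
        for j, dest in enumerate(destination):
            result[j] += count * overlap(interval, dest) / length
    return result


def share_nested(source, destination):
    """Fraction of source units lying entirely inside one destination unit."""
    nested = 0
    for interval, _ in source:
        length = interval[1] - interval[0]
        if any(abs(overlap(interval, d) - length) < 1e-12 for d in destination):
            nested += 1
    return nested / len(source)


# Hypothetical data: three source units with counts, two destination units.
source = [((0.0, 2.0), 100.0), ((2.0, 5.0), 60.0), ((5.0, 6.0), 40.0)]
destination = [(0.0, 3.0), (3.0, 6.0)]

transformed = areal_weighting(source, destination)
nesting = share_nested(source, destination)
print(transformed, nesting)
```

In this toy case the middle source unit straddles the destination boundary, so its count is split one third to two thirds, while the other two units are fully nested and move over without any splitting.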
Sonja Häffner, Martin Hofer, Maximilian Nagl, Julian Walterskirchen
Recent advancements in natural language processing (NLP) methods have significantly improved their performance. However, more complex NLP models are more difficult to interpret and more computationally expensive. Therefore, we propose an approach to dictionary creation that carefully balances the trade-off between complexity and interpretability. This approach combines a deep neural network architecture with techniques to improve model explainability to automatically build a domain-specific dictionary. As an illustrative use case of our approach, we create an objective dictionary that can infer conflict intensity from text data. We train the neural networks on a corpus of conflict reports and match them with conflict event data. This corpus consists of over 14,000 expert-written International Crisis Group (ICG) CrisisWatch reports between 2003 and 2021. Sensitivity analysis is used to extract the weighted words from the neural network to build the dictionary. To evaluate our approach, we compare our results to state-of-the-art deep learning language models, text-scaling methods, as well as standard, nonspecialized, and conflict event dictionary approaches. We show that our approach outperforms other approaches while retaining interpretability.
{"title":"Introducing an Interpretable Deep Learning Approach to Domain-Specific Dictionary Creation: A Use Case for Conflict Prediction","authors":"Sonja Häffner, Martin Hofer, Maximilian Nagl, Julian Walterskirchen","doi":"10.1017/pan.2023.7","DOIUrl":"https://doi.org/10.1017/pan.2023.7","url":null,"abstract":"Abstract Recent advancements in natural language processing (NLP) methods have significantly improved their performance. However, more complex NLP models are more difficult to interpret and computationally expensive. Therefore, we propose an approach to dictionary creation that carefully balances the trade-off between complexity and interpretability. This approach combines a deep neural network architecture with techniques to improve model explainability to automatically build a domain-specific dictionary. As an illustrative use case of our approach, we create an objective dictionary that can infer conflict intensity from text data. We train the neural networks on a corpus of conflict reports and match them with conflict event data. This corpus consists of over 14,000 expert-written International Crisis Group (ICG) CrisisWatch reports between 2003 and 2021. Sensitivity analysis is used to extract the weighted words from the neural network to build the dictionary. In order to evaluate our approach, we compare our results to state-of-the-art deep learning language models, text-scaling methods, as well as standard, nonspecialized, and conflict event dictionary approaches. We are able to show that our approach outperforms other approaches while retaining interpretability.","PeriodicalId":48270,"journal":{"name":"Political Analysis","volume":"31 1","pages":"481 - 499"},"PeriodicalIF":5.4,"publicationDate":"2023-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44604107","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adeline Lo, Devin Judge-Lord, Kyler Hudson, Kenneth R. Mayer
Understanding the gaps and connections across existing theories and findings is a perennial challenge in scientific research. Systematically reviewing scholarship is especially challenging for researchers who may lack domain expertise, including junior scholars or those exploring new substantive territory. Conversely, senior scholars may rely on long-standing assumptions and social networks that exclude new research. In both cases, ad hoc literature reviews hinder the accumulation of knowledge. Scholars are rarely systematic in selecting relevant prior work or then identifying patterns across their sample. To encourage systematic, replicable, and transparent methods for assessing literature, we propose an accessible network-based framework for reviewing scholarship. In our method, we consider a literature as a network of recurring concepts (nodes) and theorized relationships among them (edges). Network statistics and visualization allow researchers to see patterns and offer reproducible characterizations of assertions about the major themes in existing literature. Critically, our approach is systematic and powerful but also low cost; it requires researchers to enter relationships they observe in prior studies into a simple spreadsheet, a task accessible to new and experienced researchers alike. Our open-source R package enables researchers to leverage powerful network analysis while minimizing software-specific knowledge. We demonstrate this approach by reviewing redistricting literature.
{"title":"Mapping Literature with Networks: An Application to Redistricting","authors":"Adeline Lo, Devin Judge-Lord, Kyler Hudson, Kenneth R. Mayer","doi":"10.1017/pan.2023.4","DOIUrl":"https://doi.org/10.1017/pan.2023.4","url":null,"abstract":"Abstract Understanding the gaps and connections across existing theories and findings is a perennial challenge in scientific research. Systematically reviewing scholarship is especially challenging for researchers who may lack domain expertise, including junior scholars or those exploring new substantive territory. Conversely, senior scholars may rely on long-standing assumptions and social networks that exclude new research. In both cases, ad hoc literature reviews hinder accumulation of knowledge. Scholars are rarely systematic in selecting relevant prior work or then identifying patterns across their sample. To encourage systematic, replicable, and transparent methods for assessing literature, we propose an accessible network-based framework for reviewing scholarship. In our method, we consider a literature as a network of recurring concepts (nodes) and theorized relationships among them (edges). Network statistics and visualization allow researchers to see patterns and offer reproducible characterizations of assertions about the major themes in existing literature. Critically, our approach is systematic and powerful but also low cost; it requires researchers to enter relationships they observe in prior studies into a simple spreadsheet—a task accessible to new and experienced researchers alike. Our open-source R package enables researchers to leverage powerful network analysis while minimizing software-specific knowledge. We demonstrate this approach by reviewing redistricting literature.","PeriodicalId":48270,"journal":{"name":"Political Analysis","volume":"31 1","pages":"669 - 678"},"PeriodicalIF":5.4,"publicationDate":"2023-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43203141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
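The network framing in the "Mapping Literature with Networks" abstract, with concepts as nodes and theorized relationships as edges, can be sketched in a few lines of Python. The edge rows below are invented placeholders standing in for spreadsheet entries, not findings from the redistricting literature, and the variable names are hypothetical. The authors provide an R package; this sketch is not that package and only shows the underlying data structure.

```python
from collections import Counter, defaultdict

# Hypothetical spreadsheet rows: one (from_concept, to_concept) pair per
# relationship recorded from a prior study. A repeated pair means that
# more than one study asserted the same relationship.
edges = [
    ("redistricting", "partisan bias"),
    ("redistricting", "electoral competition"),
    ("partisan bias", "legislative polarization"),
    ("redistricting", "partisan bias"),
    ("electoral competition", "turnout"),
]

# Edge weight: how many recorded studies assert each relationship.
edge_weights = Counter(edges)

# Degree counts over distinct edges.
out_degree = defaultdict(int)
in_degree = defaultdict(int)
for src, dst in edge_weights:
    out_degree[src] += 1
    in_degree[dst] += 1

nodes = sorted(set(out_degree) | set(in_degree))
for node in nodes:
    print(f"{node}: in={in_degree[node]}, out={out_degree[node]}")
```

Counting repeated rows as edge weights gives a simple view of which relationships recur across studies, while the degree counts show which concepts many relationships start from or arrive at.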
Supervised topic classification requires labeled data. This often becomes a bottleneck as high-quality labeled data are expensive to acquire. To overcome the data scarcity problem, scholars have recently proposed to use cross-domain topic classification to take advantage of preexisting labeled datasets. Cross-domain topic classification only requires limited annotation in the target domain to verify its cross-domain accuracy. In this letter, we propose supervised topic classification with pretrained language models as an alternative. We show that language models fine-tuned with 70% of the small annotated dataset in the target corpus could outperform models trained using large cross-domain datasets by 27% and that models fine-tuned with 10% of the annotated dataset could already outperform the cross-domain classifiers. Our models are competitive in terms of training time and inference time. Researchers interested in supervised learning with limited labeled data should find our results useful. Our code and data are publicly available.
{"title":"Topic Classification for Political Texts with Pretrained Language Models","authors":"Yu Wang","doi":"10.1017/pan.2023.3","DOIUrl":"https://doi.org/10.1017/pan.2023.3","url":null,"abstract":"Abstract Supervised topic classification requires labeled data. This often becomes a bottleneck as high-quality labeled data are expensive to acquire. To overcome the data scarcity problem, scholars have recently proposed to use cross-domain topic classification to take advantage of preexisting labeled datasets. Cross-domain topic classification only requires limited annotation in the target domain to verify its cross-domain accuracy. In this letter, we propose supervised topic classification with pretrained language models as an alternative. We show that language models fine-tuned with 70% of the small annotated dataset in the target corpus could outperform models trained using large cross-domain datasets by 27% and that models fine-tuned with 10% of the annotated dataset could already outperform the cross-domain classifiers. Our models are competitive in terms of training time and inference time. Researchers interested in supervised learning with limited labeled data should find our results useful. Our code and data are publicly available.1","PeriodicalId":48270,"journal":{"name":"Political Analysis","volume":"31 1","pages":"662 - 668"},"PeriodicalIF":5.4,"publicationDate":"2023-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44224436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}