Multiple Hypothesis Testing in Conjoint Analysis
Guoer Liu, Y. Shiraito
Political Analysis 31(1): 380–395, 26 January 2023. doi:10.1017/pan.2022.30
Abstract: Conjoint analysis is widely used for estimating the effects of a large number of treatments on multidimensional decision-making. However, this substantive advantage comes at a statistical cost: multiple hypothesis testing. With few exceptions, existing applications of conjoint analysis do not correct for the number of hypotheses tested, and empirical guidance on the choice of multiple testing correction methods has been lacking. This paper first shows that even when none of the treatments has any effect, the standard analysis pipeline produces at least one statistically significant estimate of average marginal component effects in more than 90% of experimental trials. We then conduct a simulation study comparing three well-known methods for multiple testing correction: the Bonferroni correction, the Benjamini–Hochberg procedure, and adaptive shrinkage (Ash). All three methods recover the truth more accurately than the conventional analysis without correction. Moreover, the Ash method outperforms the others in avoiding false negatives while reducing false positives about as well as they do. Finally, we show how conclusions drawn from empirical analysis may differ with and without correction by reanalyzing applications on public attitudes toward immigration and partner countries of trade agreements.
Detecting and Correcting for Separation in Strategic Choice Models
Casey Crisman-Cox, O. Gasparyan, Curtis S. Signorino
Political Analysis 31(1): 414–429, 26 January 2023. doi:10.1017/pan.2022.36
Abstract: Separation or “perfect prediction” is a common problem in discrete choice models that, in practice, leads to inflated point estimates and standard errors. Standard statistical packages do not provide clear advice on how to correct these problems. Furthermore, separation can go completely undiagnosed in fitting advanced models that optimize a user-supplied log-likelihood rather than relying on pre-programmed estimation procedures. In this paper, we both describe the problems that separation can cause and address the issue of detecting it in empirical models of strategic interaction. We then consider several solutions based on penalized maximum likelihood estimation. Using Monte Carlo experiments and a replication study, we demonstrate that when separation is detected in the data, the penalized methods we consider are superior to ordinary maximum likelihood estimators.
The Ideologies of Organized Interests and Amicus Curiae Briefs: Large-Scale, Social Network Imputation of Ideal Points
Sahar Abi-Hassan, J. Box-Steffensmeier, Dino P. Christenson, A. Kaufman, Brian Libgober
Political Analysis 31(1): 396–413, 26 January 2023. doi:10.1017/pan.2022.34
Abstract: Interest group ideology is theoretically and empirically critical in the study of American politics, yet our measurement of this key concept is limited in both scope and time coverage. By leveraging network science and ideal point estimation, we provide a novel measure of ideology for amicus curiae briefs and organized interests, with accompanying uncertainty estimates. Our Amicus Curiae Network scores cover more than 12,000 unique groups and more than 11,000 briefs across 95 years, providing the largest and longest measure of organized interest ideologies to date. Substantively, the scores reveal that interests before the Court are ideologically polarized, despite variance in their coalition strategies; that interests that donate to campaigns are more conservative and balanced than those that do not; and that amicus curiae briefs were more common from liberal organizations until the 1980s, with ideological representation virtually balanced since then.
Cross-Lingual Classification of Political Texts Using Multilingual Sentence Embeddings
Hauke Licht
Political Analysis 31(1): 366–379, 26 January 2023. doi:10.1017/pan.2022.29
Abstract: Established approaches to analyzing multilingual text corpora require either a duplication of analysts’ efforts or high-quality machine translation (MT). In this paper, I argue that multilingual sentence embedding (MSE) is an attractive alternative approach to language-independent text representation. To support this argument, I evaluate MSE for cross-lingual supervised text classification. Specifically, I assess how reliably MSE-based classifiers detect manifesto sentences’ topics and positions compared to classifiers trained on bag-of-words representations of machine-translated texts, and how this depends on the amount of training data. These analyses show that when training data are relatively scarce (e.g., 20,000 or fewer labeled sentences), MSE-based classifiers can be more reliable and are at least no less reliable than their MT-based counterparts. Furthermore, I examine how reliably MSE-based classifiers label sentences written in languages not in the training data, focusing on the task of discriminating sentences that discuss the issue of immigration from those that do not. This analysis shows that, compared to the within-language classification benchmark, such “cross-lingual transfer” tends to result in fewer reliability losses when relying on the MSE rather than the MT approach. This study thus presents an important addition to the cross-lingual text analysis toolkit.
When Correlation Is Not Enough: Validating Populism Scores from Supervised Machine-Learning Models
Michael Jankowski, Robert A. Huber
Political Analysis, 9 January 2023. doi:10.1017/pan.2022.32
Abstract: Despite the ongoing success of populist parties in many parts of the world, we lack comprehensive information about parties’ level of populism over time. A recent contribution to Political Analysis by Di Cocco and Monechi (DCM) suggests that this research gap can be closed by predicting parties’ populism scores from their election manifestos using supervised machine learning. In this paper, we provide a detailed discussion of the suggested approach. Building on recent debates about the validation of machine-learning models, we argue that the validity checks provided in DCM’s paper are insufficient. We conduct a series of additional validity checks and empirically demonstrate that the approach is not suitable for deriving populism scores from texts. We conclude that measuring populism over time and between countries remains an immense challenge for empirical research. More generally, our paper illustrates the importance of more comprehensive validations of supervised machine-learning models.
Contagion, Confounding, and Causality: Confronting the Three C’s of Observational Political Networks Research
Medha Uppala, B. Desmarais
Political Analysis 31(1): 472–479, 9 January 2023. doi:10.1017/pan.2022.35
Abstract: Contagion across various types of connections is a central process in the study of many political phenomena (e.g., democratization, civil conflict, and voter turnout). Over the last decade, the methodological literature addressing the challenges in causally identifying contagion in networks has exploded. In one of the foundational works in this literature, Shalizi and Thomas (2011, Sociological Methods & Research 40, 211–239) propose a permutation test for contagion in longitudinal network data that is not confounded by selection (e.g., homophily). We illustrate the properties of this test via simulation. We assess its statistical power under various conditions of the data, including the nature of the contagion, the structure of the network through which contagion occurs, and the number of time periods included in the data. We then apply this test to an example domain that is commonly considered in the context of observational research on contagion—the international spread of democracy. We find evidence of international contagion of democracy. We conclude with a discussion of the practical applicability of the Shalizi and Thomas test to the study of contagion in political networks.
Ends Against the Middle: Measuring Latent Traits when Opposites Respond the Same Way for Antithetical Reasons
JBrandon Duck-Mayr, J. Montgomery
Political Analysis 31(1): 606–625, 9 January 2023. doi:10.1017/pan.2022.33
Abstract: Standard methods for measuring latent traits from categorical data assume that response functions are monotonic. This assumption is violated when individuals from both extremes respond identically, but for conflicting reasons. Two survey respondents may “disagree” with a statement for opposing motivations, liberal and conservative justices may dissent from the same Supreme Court decision but provide ideologically contradictory rationales, and in legislative settings, ideological opposites may join together to oppose moderate legislation in pursuit of antithetical goals. In this article, we introduce a scaling model that accommodates “ends against the middle” responses and provide a novel estimation approach that improves upon existing routines. We apply this method to survey data, voting data from the U.S. Supreme Court, and the 116th Congress, and show that it outperforms standard methods in terms of both congruence with qualitative insights and model fit. This suggests that our proposed method may offer improved one-dimensional estimates of latent traits in many important settings.
Recalibration of Predicted Probabilities Using the “Logit Shift”: Why Does It Work, and When Can It Be Expected to Work Well?
Evan T. R. Rosenman, Cory McCartan, Santiago Olivella
Political Analysis 31(1): 651–661, 9 January 2023. doi:10.1017/pan.2022.31
Abstract: The output of predictive models is routinely recalibrated by reconciling low-level predictions with known quantities defined at higher levels of aggregation. For example, models predicting vote probabilities at the individual level in U.S. elections can be adjusted so that their aggregation matches the observed vote totals in each county, thus producing better-calibrated predictions. In this research note, we provide theoretical grounding for one of the most commonly used recalibration strategies, known colloquially as the “logit shift.” Typically cast as a heuristic adjustment strategy (whereby a constant correction on the logit scale is found, such that aggregated predictions match target totals), we show that the logit shift offers a fast and accurate approximation to a principled, but computationally impractical, adjustment strategy: computing the posterior prediction probabilities, conditional on the observed totals. After deriving analytical bounds on the quality of the approximation, we illustrate its accuracy using Monte Carlo simulations. We also discuss scenarios in which the logit shift is less effective at recalibrating predictions: when the target totals are defined only for highly heterogeneous populations, and when the original predictions correctly capture the mean of true individual probabilities but fail to capture the shape of their distribution.
Acquiescence Bias Inflates Estimates of Conspiratorial Beliefs and Political Misperceptions
Seth J. Hill, Margaret E. Roberts
Political Analysis 31(1): 575–590, 9 January 2023. doi:10.1017/pan.2022.28
Abstract: Scholars, pundits, and politicians use opinion surveys to study citizen beliefs about political facts, such as the current unemployment rate, and more conspiratorial beliefs, such as whether Barack Obama was born abroad. Many studies, however, ignore acquiescence-response bias, the tendency for survey respondents to endorse any assertion made in a survey question regardless of content. With new surveys fielding questions asked in recent scholarship, we show that acquiescence bias inflates estimated incidence of conspiratorial beliefs and political misperceptions in the United States and China by up to 50%. Acquiescence bias is disproportionately prevalent among more ideological respondents, inflating correlations between political ideology, such as conservatism, and endorsement of conspiracies or misperception of facts. We propose and demonstrate two methods to correct for acquiescence bias.