Philip Moniz, Rodrigo Ramirez-Perez, Erin Hartman, Stephen Jessee
Survey experiments on probability samples are a popular method for investigating population-level causal questions due to their strong internal validity. However, lower survey response rates and an increased reliance on online convenience samples raise questions about the generalizability of survey experiments. We examine this concern using data from a collection of 50 survey experiments which represent a wide range of social science studies. Recruitment for these studies employed a unique double sampling strategy that first obtains a sample of “eager” respondents and then employs much more aggressive recruitment methods with the goal of adding “reluctant” respondents to the sample in a second sampling wave. This approach substantially increases the number of reluctant respondents who participate and also allows for straightforward categorization of eager and reluctant survey respondents within each sample. We find no evidence that treatment effects for eager and reluctant respondents differ substantially. Within demographic categories often used for weighting surveys, there is also little evidence of response heterogeneity between eager and reluctant respondents. Our results suggest that social science findings based on survey experiments, even in the modern era of very low response rates, provide reasonable estimates of population average treatment effects among a deeper pool of survey respondents in a wide range of settings.
“Generalizing toward Nonrespondents: Effect Estimates in Survey Experiments Are Broadly Similar for Eager and Reluctant Participants.” Political Analysis, doi:10.1017/pan.2024.8 (published 2024-05-17).
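The paper’s core comparison can be illustrated with a small simulation: estimate the treatment effect separately for eager and reluctant respondents, then test whether the two estimates differ. This is a minimal sketch on fabricated data (the variable names and the simulated effect of 2.0 are assumptions, not the paper’s data or code), using a simple difference-in-means estimator and a z-test on the difference.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fabricated survey experiment (assumption: a true effect of 2.0 in BOTH
# groups, mirroring the paper's finding of no eager/reluctant difference).
n = 4000
reluctant = rng.integers(0, 2, n)  # 1 = recruited in the aggressive second wave
treat = rng.integers(0, 2, n)      # randomized treatment assignment
y = 2.0 * treat + 0.5 * reluctant + rng.normal(0, 1, n)

def diff_in_means(y, t):
    """ATE estimate and standard error from a simple difference in means."""
    y1, y0 = y[t == 1], y[t == 0]
    est = y1.mean() - y0.mean()
    se = np.sqrt(y1.var(ddof=1) / len(y1) + y0.var(ddof=1) / len(y0))
    return est, se

e1, s1 = diff_in_means(y[reluctant == 1], treat[reluctant == 1])
e0, s0 = diff_in_means(y[reluctant == 0], treat[reluctant == 0])

# z-test for a difference in treatment effects between the two groups
z = (e1 - e0) / np.hypot(s1, s0)
print(e1, e0, z)  # |z| < 1.96 would indicate no detectable difference
```

With both group effects set equal by construction, the interaction test should come back null, which is the pattern the paper reports across its 50 experiments.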
James Bisbee, Joshua D. Clinton, C. Dorff, Brenton Kenkel, Jennifer M. Larson
Large language models (LLMs) offer new research possibilities for social scientists, but their potential as “synthetic data” is still largely unknown. In this paper, we investigate how accurately the popular LLM ChatGPT can recover public opinion, prompting the LLM to adopt different “personas” and then provide feeling thermometer scores for 11 sociopolitical groups. The average scores generated by ChatGPT correspond closely to the averages in our baseline survey, the 2016–2020 American National Election Study (ANES). Nevertheless, sampling by ChatGPT is not reliable for statistical inference: there is less variation in responses than in the real surveys, and regression coefficients often differ significantly from equivalent estimates obtained using ANES data. We also document how the distribution of synthetic responses varies with minor changes in prompt wording, and we show how the same prompt yields significantly different results over a 3-month period. Altogether, our findings raise serious concerns about the quality, reliability, and reproducibility of synthetic survey data generated by LLMs.
“Synthetic Replacements for Human Survey Data? The Perils of Large Language Models.” Political Analysis, doi:10.1017/pan.2024.5 (published 2024-05-17).
Used by politicians, journalists, and citizens alike, Twitter has for over a decade been the most important social media platform for investigating political phenomena such as hate speech, polarization, and terrorism. Many Twitter studies of emotionally charged or controversial content cannot be fully replicated because their Twitter-related replication data are incomplete and their datasets cannot be recrawled in full. This paper shows that such studies and their findings are considerably affected by nonrandom tweet mortality and by data access restrictions imposed by the platform. Sensitive datasets suffer a notably higher removal rate than nonsensitive ones, and attempting to replicate key findings of Kim’s (2023, Political Science Research and Methods 11, 673–695) influential study on the content of violent tweets leads to significantly different results. These results highlight that access to complete replication data is particularly important in light of dynamically changing social media research conditions. The study thus raises concerns about the broader implications of nonrandom tweet mortality for future social media research on Twitter and similar platforms, and suggests potential solutions.
Andreas Küpfer. “Nonrandom Tweet Mortality and Data Access Restrictions: Compromising the Replication of Sensitive Twitter Studies.” Political Analysis, doi:10.1017/pan.2024.7 (published 2024-05-17).
When researchers design an experiment, they usually hold potentially relevant features of the experiment constant. We call these details the “topic” of the experiment. For example, researchers studying the impact of party cues on attitudes must inform respondents of the parties’ positions on a particular policy. In doing so, researchers implement just one of many possible designs. Clifford, Leeper, and Rainey (2023. “Generalizing Survey Experiments Using Topic Sampling: An Application to Party Cues.” Forthcoming in Political Behavior. https://doi.org/10.1007/s11109-023-09870-1) argue that researchers should implement many of the possible designs in parallel—what they call “topic sampling”—to generalize to a larger population of topics. We describe two estimators for topic-sampling designs: First, we describe a nonparametric estimator of the typical effect that is unbiased under the assumptions of the design; and second, we describe a hierarchical model that researchers can use to describe the heterogeneity. We suggest describing the heterogeneity across topics in three ways: (1) the standard deviation in treatment effects across topics, (2) the treatment effects for particular topics, and (3) how the treatment effects for particular topics vary with topic-level predictors. We evaluate the performance of the hierarchical model using the Strengthening Democracy Challenge megastudy and show that the hierarchical model works well.
Scott Clifford and Carlisle Rainey. “Estimators for Topic-Sampling Designs.” Political Analysis, doi:10.1017/pan.2024.1 (published 2024-05-13).
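Two of the quantities the abstract highlights, the typical effect and the standard deviation of effects across topics, can be sketched with a short simulation. This is an illustrative reconstruction on fabricated data, not the authors’ estimator code: the nonparametric typical-effect estimate is taken to be the unweighted mean of per-topic difference-in-means estimates, and the heterogeneity SD is corrected for sampling noise with a method-of-moments subtraction (both are assumptions about a reasonable implementation).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical topic-sampling experiment: K topics, each with its own true
# effect; respondents are randomized to treatment within topic.
K, n_per = 20, 200
true_effects = rng.normal(0.5, 0.2, K)

topic_ates = np.empty(K)
for k in range(K):
    t = rng.integers(0, 2, n_per)
    y = true_effects[k] * t + rng.normal(0, 1, n_per)
    topic_ates[k] = y[t == 1].mean() - y[t == 0].mean()

# (1) Nonparametric estimate of the typical effect: the unweighted mean
# of the per-topic difference-in-means estimates.
typical = topic_ates.mean()

# (2) SD of effects across topics. The raw SD of the estimates overstates
# heterogeneity because each estimate carries sampling noise, so subtract
# the approximate sampling variance of a per-topic estimate.
sampling_var = 4.0 / n_per  # approx. var of a diff in means with unit-variance outcomes
het_sd = np.sqrt(max(0.0, topic_ates.var(ddof=1) - sampling_var))
print(typical, het_sd)
```

The hierarchical model in the paper accomplishes the same decomposition with partial pooling, which additionally stabilizes the per-topic estimates.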
Theoretical expectations regarding communication patterns between legislators and outside agents, such as lobbyists, agency officials, or policy experts, often depend on the relationship between legislators’ and agents’ preferences. However, legislators and nonelected outside agents evaluate the merits of policies using distinct criteria and considerations. We develop a measurement method that flexibly estimates the policy preferences of a class of outside agents—witnesses in committee hearings—separately from those of legislators, and computes their preference distance across the two dimensions. In our application to Medicare hearings, we find that legislators in the U.S. Congress heavily condition their questioning of witnesses on preference distance, showing that legislators tend to seek policy information from like-minded experts in committee hearings. We do not find this result using a conventional measure that places both actors on one dimension. The contrast in results lends support to the construct validity of our proposed preference measures.
K. Esterling and Ju Yeon Park. “Flexible Estimation of Policy Preferences for Witnesses in Committee Hearings.” Political Analysis, doi:10.1017/pan.2024.6 (published 2024-05-09).
Researchers are often interested in whether discrimination on the basis of racial cues persists above and beyond discrimination on the basis of nonracial attributes that decision makers—e.g., employers and legislators—infer from such cues. We show that existing audit experiments may be unable to parse these mechanisms because of an asymmetry in when decision makers are exposed to cues of race and additional signals intended to rule out discrimination due to other attributes. For example, email audit experiments typically cue race via the name in the email address, at which point legislators can choose to open the email, but cue other attributes in the body of the email, which decision makers can be exposed to only after opening the email. We derive the bias resulting from this asymmetry and then propose two distinct solutions for email audit experiments. The first exposes decision makers to all cues before the decision to open. The second crafts the email to ensure no discrimination in opening and then exposes decision makers to all cues in the body of the email after opening. This second solution works without measures of opening, but can be improved when researchers do measure opening, even if with error.
Thomas Leavitt and Viviana Rivera-Burgos. “Audit Experiments of Racial Discrimination and the Importance of Symmetry in Exposure to Cues.” Political Analysis, doi:10.1017/pan.2024.3 (published 2024-05-06).
We show that, in some ranked ballot elections, it may be possible to violate the secret vote. There are so many ways to rank even a handful of candidates that many possible rankings might not be cast by any voter. So, a vote buyer could pay someone to rank the candidates a certain way and then use the announced election results to verify that the voter followed through. We examine the feasibility of this attack both theoretically and empirically, focusing on instant runoff voting (IRV). Although many IRV elections have few enough candidates that this scheme is not feasible, we use data from San Francisco and a proposed election rule change in Oakland to show that some important IRV elections can have large numbers of unused rankings. There is no evidence that this vote-buying scheme has ever been used. However, its existence has implications for the administration and security of IRV elections. This scheme is more feasible when more candidates can be ranked in the election and when the election results report all the ways that candidates were ranked.
Jack R. Williams, Samuel Baltz, and Charles Stewart. “Votes Can Be Confidently Bought in Some Ranked Ballot Elections, and What to Do about It.” Political Analysis, doi:10.1017/pan.2024.4 (published 2024-05-06).
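The size of the ballot space that drives this attack is easy to compute: the number of distinct ballots ranking between 1 and r of n candidates (in order, no ties) is a sum of partial permutations. A short sketch follows; the specific candidate and ranking counts are illustrative, not taken from the San Francisco or Oakland data.

```python
from math import perm  # partial permutations, available since Python 3.8

def num_rankings(n_candidates: int, max_ranked: int) -> int:
    """Number of distinct valid ballots that rank between 1 and
    max_ranked of n_candidates, in order, with no ties or repeats."""
    return sum(perm(n_candidates, k) for k in range(1, max_ranked + 1))

# With few candidates the ballot space is tiny, so most rankings are cast
# by many voters and no single ballot identifies anyone:
print(num_rankings(4, 3))    # 4 + 12 + 24 = 40 possible ballots
# But the space explodes as more candidates can be ranked, leaving many
# rankings unused -- a prearranged rare ranking then becomes verifiable
# in results that report every ranking cast:
print(num_rankings(20, 10))  # hundreds of billions of possible ballots
```

This is why the feasibility of the scheme grows with both the number of candidates and the number of allowed rankings, as the abstract notes.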
Copulas are helpful in studying the joint distribution of two variables, particularly when confounders are unobserved. However, most conventional copulas cannot model joint distributions in which one variable is not monotonically increasing or decreasing in the other. For instance, suppose that two variables are positively linearly correlated for one type of unit and negatively correlated for another. If the type is unobserved, we observe only a mixture of the two types: one variable tends to take either a high or a low value when the other variable is small, and a middle value when it is large (or vice versa). To address this issue, I consider an overlooked copula built from trigonometric functions (Chesneau [2021, Applied Mathematics, 1(1), pp. 3–17]) that I name the “normal mode copula.” I apply the copula to a dataset on government formation and duration to demonstrate that the normal mode copula performs better than other conventional copulas.
Kentaro Fukumoto. “Normal Mode Copulas for Nonmonotonic Dependence.” Political Analysis, doi:10.1017/pan.2023.45 (published 2024-02-13).
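To make the nonmonotonic-dependence idea concrete, here is a minimal sketch of a sine-perturbed copula from the same trigonometric family Chesneau studies; the paper’s normal mode copula may differ in its exact functional form, so treat this as illustrative only. Its density rises above 1 in some regions of the unit square and falls below 1 in others, exactly the kind of sign-flipping local dependence that monotone copulas cannot capture.

```python
import numpy as np

PI = np.pi

def sine_copula(u, v, lam):
    """C(u, v) = uv + lam * sin(pi u) * sin(pi v).
    A valid copula for |lam| <= 1/pi^2 (the density stays nonnegative)."""
    return u * v + lam * np.sin(PI * u) * np.sin(PI * v)

def sine_copula_density(u, v, lam):
    """c(u, v) = d^2 C / du dv = 1 + lam * pi^2 * cos(pi u) * cos(pi v)."""
    return 1.0 + lam * PI**2 * np.cos(PI * u) * np.cos(PI * v)

lam = 1 / PI**2  # largest admissible perturbation
grid = np.linspace(0, 1, 101)

# Copula boundary condition C(u, 1) = u holds (sin(pi) = 0):
assert np.allclose(sine_copula(grid, 1.0, lam), grid)

U, V = np.meshgrid(grid, grid)
dens = sine_copula_density(U, V, lam)
print(dens.min() >= -1e-9)  # density nonnegative over [0,1]^2
# Local dependence flips sign across the square: positive where
# cos(pi u)cos(pi v) > 0, negative where it is < 0.
print(dens[10, 10] > 1, dens[10, 90] < 1)
```

Mixing or reparameterizing such components is one way a trigonometric copula can fit the "high or low when the other is small, middle when it is large" pattern the abstract describes.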
Angela Lai, Megan A. Brown, James Bisbee, Joshua A. Tucker, Jonathan Nagler, Richard Bonneau
We present a method for estimating the ideology of political YouTube videos. The subfield of estimating ideology as a latent variable has often focused on traditional actors such as legislators, while more recent work has used social media data to estimate the ideology of ordinary users, political elites, and media sources. We build on this work to estimate the ideology of a political YouTube video. First, we start with a matrix of political Reddit posts linking to YouTube videos and apply correspondence analysis to place those videos in an ideological space. Second, we train a language model with those estimated ideologies as training labels, enabling us to estimate the ideologies of videos not posted on Reddit. These predicted ideologies are then validated against human labels. We demonstrate the utility of this method by applying it to the watch histories of survey respondents to evaluate the prevalence of echo chambers on YouTube in addition to the association between video ideology and viewer engagement. Our approach gives video-level scores based only on supplied text metadata, is scalable, and can be easily adjusted to account for changes in the ideological landscape.
“Estimating the Ideology of Political YouTube Videos.” Political Analysis, doi:10.1017/pan.2023.42 (published 2024-02-13).
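The first step described above, placing videos in an ideological space via correspondence analysis of a subreddit-by-video link matrix, can be sketched as follows. The matrix is fabricated (two subreddit blocs that mostly link within-bloc), and the function implements textbook simple correspondence analysis via an SVD of standardized residuals; it illustrates the technique, not the authors’ actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical link matrix: rows = subreddits, cols = videos, entries =
# counts of posts linking to each video. Two blocs link mostly within-bloc,
# so the matrix carries a latent one-dimensional (ideological) structure.
counts = rng.poisson(
    np.block([[np.full((3, 4), 5.0), np.full((3, 4), 0.5)],
              [np.full((3, 4), 0.5), np.full((3, 4), 5.0)]])).astype(float)

def ca_column_scores(N):
    """First-dimension column coordinates from simple correspondence
    analysis: SVD of the standardized residuals of N / N.sum()."""
    P = N / N.sum()
    r = P.sum(axis=1)  # row masses
    c = P.sum(axis=0)  # column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    return Vt[0] * sv[0] / np.sqrt(c)  # principal column coordinates

scores = ca_column_scores(counts)
# Videos linked by the same bloc should land on the same side of dim 1:
print(np.sign(scores[:4]), np.sign(scores[4:]))
```

In the paper, scores like these then serve as training labels for a language model that generalizes to videos never posted on Reddit.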