Topics, Concepts, and Measurement: A Crowdsourced Procedure for Validating Topics as Measures

IF 4.7 | CAS Zone 2 (Sociology) | JCR Q1 (Political Science) | Political Analysis | Pub Date: 2021-09-27 | DOI: 10.1017/pan.2021.33
Luwei Ying, J. Montgomery, Brandon M Stewart
Citations: 22

Abstract Topic models, as developed in computer science, are effective tools for exploring and summarizing large document collections. When applied in social science research, however, they are commonly used for measurement, a task that requires careful validation to ensure that the model outputs actually capture the desired concept of interest. In this paper, we review current practices for topic validation in the field and show that extensive model validation is increasingly rare, or at least not systematically reported in papers and appendices. To supplement current practices, we refine an existing crowd-sourcing method by Chang and coauthors for validating topic quality and go on to create new procedures for validating conceptual labels provided by the researcher. We illustrate our method with an analysis of Facebook posts by U.S. Senators and provide software and guidance for researchers wishing to validate their own topic models. While tailored, case-specific validation exercises will always be best, we aim to improve standard practices by providing a general-purpose tool to validate topics as measures.
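The crowd-sourcing method by Chang and coauthors that the abstract refers to is the word-intrusion task: for each topic, workers see the topic's top words plus one "intruder" word that is improbable in that topic but probable in another, and a coherent topic is one whose intruder workers reliably spot. The sketch below is an illustrative reconstruction of that task generator, not the authors' released software; the function name `word_intrusion_tasks` and its inputs (a topic-by-word probability matrix and a vocabulary list) are assumptions for the example.

```python
import numpy as np

def word_intrusion_tasks(topic_word, vocab, n_top=5, rng=None):
    """Build one word-intrusion task per topic (after Chang et al. 2009).

    topic_word: (K, V) array of per-topic word probabilities.
    vocab:      list of V word strings.
    Returns a list of dicts with the shuffled word set and the intruder.
    """
    rng = np.random.default_rng(rng)
    n_topics, n_words = topic_word.shape
    tasks = []
    for k in range(n_topics):
        # The topic's most probable words.
        top = np.argsort(topic_word[k])[::-1][:n_top]
        # Candidate intruders: words in the bottom half of this topic...
        low = set(np.argsort(topic_word[k])[: n_words // 2])
        # ...drawn from the top of some other topic.
        other = rng.integers(n_topics)
        while other == k and n_topics > 1:
            other = rng.integers(n_topics)
        intruder = next(w for w in np.argsort(topic_word[other])[::-1]
                        if w in low)
        words = [vocab[w] for w in top] + [vocab[intruder]]
        rng.shuffle(words)  # hide the intruder's position
        tasks.append({"topic": k, "words": words,
                      "intruder": vocab[intruder]})
    return tasks
```

Given worker responses, topic quality is then summarized as the fraction of workers who correctly identify the intruder for each topic; topics with near-chance precision are flagged as incoherent.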
Source journal: Political Analysis
CiteScore: 8.80
Self-citation rate: 3.70%
Articles per year: 30
Journal description: Political Analysis chronicles these exciting developments by publishing the most sophisticated scholarship in the field. It is the place to learn new methods, to find some of the best empirical scholarship, and to publish your best research.
Latest articles in this journal:
Synthetic Replacements for Human Survey Data? The Perils of Large Language Models
NonRandom Tweet Mortality and Data Access Restrictions: Compromising the Replication of Sensitive Twitter Studies
Generalizing toward Nonrespondents: Effect Estimates in Survey Experiments Are Broadly Similar for Eager and Reluctant Participants
Estimators for Topic-Sampling Designs
Flexible Estimation of Policy Preferences for Witnesses in Committee Hearings