
Sociological Methodology: Latest Articles

Deciding on the Starting Number of Classes of a Latent Class Tree
IF 3 Zone 2 (Sociology) Q1 Social Sciences Pub Date: 2018-08-01 Epub Date: 2018-06-21 DOI: 10.1177/0081175018780170
Mattis van den Bergh, Geert H van Kollenburg, Jeroen K Vermunt

In recent studies, latent class tree (LCT) modeling has been proposed as a convenient alternative to standard latent class (LC) analysis. Instead of using an estimation method in which all classes are formed simultaneously given the specified number of classes, in LCT analysis a hierarchical structure of mutually linked classes is obtained by sequentially splitting classes into two subclasses. The resulting tree structure gives a clear insight into how the classes are formed and how solutions with different numbers of classes are substantively linked to one another. A limitation of the current LCT modeling approach is that it allows only for binary splits, which in certain situations may be too restrictive. Especially at the root node of the tree, where an initial set of classes is created based on the most dominant associations present in the data, it may make sense to use a model with more than two classes. In this article, we propose a modification of the LCT approach that allows for a nonbinary split at the root node, and we provide methods to determine the appropriate number of classes in this first split, based either on theoretical grounds or on a relative improvement of fit measure. This novel approach also can be seen as a hybrid of a standard LC model and a binary LCT model, in which an initial, oversimplified but interpretable model is refined using an LCT approach. Furthermore, we show how to apply an LCT model when a nonstandard LC model is required. These new approaches are illustrated using two empirical applications: one on social capital and the other on (post)materialism.

Citations: 11
Comment: Bayes, Model Uncertainty, and Learning from Data
IF 3 Zone 2 (Sociology) Q1 Social Sciences Pub Date: 2018-08-01 DOI: 10.1177/0081175018799095
B. Western
Robert M. O’Brien is a professor emeritus at the University of Oregon. He specializes in criminology and quantitative methods. Within criminology, he focuses on the methods used to gather criminological data, on the analysis of crime rates, and on the task of extricating the effects of ages, periods, and cohorts on crime rates. His most recent publication on that topic, “Homicide Arrest Rate Trends in the United States: The Contributions of Periods and Cohorts (1965–2015),” appeared in 2018 in the Journal of Quantitative Criminology. In quantitative methods, some of his contributions involve the effects of using interval data as ordinal, generalizability theory, identification in structural equation modeling measurement models, the use of multicollinearity indices, and an obsession with age-period-cohort models. In 2015 he published a book on this topic, Age-Period-Cohort Models: Approaches and Analyses with Aggregate Data (Chapman & Hall, 2015).
Citations: 1
Comment: Some Challenges When Estimating the Impact of Model Uncertainty on Coefficient Instability
IF 3 Zone 2 (Sociology) Q1 Social Sciences Pub Date: 2018-08-01 DOI: 10.1177/0081175018790569
Robert M. O’Brien
Citations: 1
Nonlinear Autoregressive Latent Trajectory Models
IF 3 Zone 2 (Sociology) Q1 Social Sciences Pub Date: 2018-08-01 DOI: 10.1177/0081175018789441
Shawn Bauldry, K. Bollen
Autoregressive latent trajectory (ALT) models combine features of latent growth curve models and autoregressive models into a single modeling framework. The development of ALT models has focused primarily on models with linear growth components, but some social processes follow nonlinear trajectories. Although it is straightforward to extend ALT models to allow for some forms of nonlinear trajectories, the identification status of such models, approaches to comparing them with alternative models, and the interpretation of parameters have not been systematically assessed. In this paper we focus on two forms of nonlinear autoregressive latent trajectory (NLALT) models. The first form allows for a quadratic growth trajectory, a popular form of nonlinear latent growth curve models. The second form derives from latent basis models, or freed loading models, that allow for arbitrary growth processes. We discuss details concerning parameterization, model identification, estimation, and testing for the two forms of NLALT models. We include a simulation study that illustrates potential biases that may arise from fitting alternative models to data derived from an autoregressive process and individual-specific nonlinear trajectories. In addition, we include an extended empirical example modeling growth trajectories of weight from birth through age 2.
Citations: 4
Rejoinder: Can We Weight Models by Their Probability of Being True?
IF 3 Zone 2 (Sociology) Q1 Social Sciences Pub Date: 2018-08-01 DOI: 10.1177/0081175018796841
John Muñoz, Cristobal Young
Draper, David. 1995. “Assessment and Propagation of Model Uncertainty.” Journal of the Royal Statistical Society, Series B 57:45–97. Freedman, David A. 1983. “A Note on Screening Regression Equations.” American Statistician 37:152–55. Leamer, Edward E. 1983. “Let’s Take the Con out of Econometrics.” American Economic Review 73:31–43. Raftery, Adrian E. 1996. “Approximate Bayes Factors and Accounting for Model Uncertainty in Generalised Linear Models.” Biometrika 83:251–66. Young, Cristobal, and Katherine Holsteen. 2017. “Model Uncertainty and Robustness: A Computational Framework for Multimodel Analysis.” Sociological Methods and Research 46:3–40.
Citations: 1
Rejoinder: On the Assumptions of Inferential Model Selection—A Response to Vassend and Weakliem
IF 3 Zone 2 (Sociology) Q1 Social Sciences Pub Date: 2018-08-01 DOI: 10.1177/0081175018794488
Michael Schultz
Citations: 0
Causal Inference with Networked Treatment Diffusion
IF 3 Zone 2 (Sociology) Q1 Social Sciences Pub Date: 2018-07-25 DOI: 10.1177/0081175018785216
Weihua An
Treatment interference (i.e., one unit’s potential outcomes depend on other units’ treatment) is prevalent in social settings. Ignoring treatment interference can lead to biased estimates of treatment effects and incorrect statistical inferences. Some recent studies have started to incorporate treatment interference into causal inference. But treatment interference is often assumed to follow a simple structure (e.g., treatment interference exists only within groups) or measured in a simplistic way (e.g., only based on the number of treated friends). In this paper, I highlight the importance of collecting data on actual treatment diffusion in order to more accurately measure treatment interference. Furthermore, I show that with accurate measures of treatment interference, we can identify and estimate a series of causal effects that are previously unavailable, including the direct treatment effect, treatment interference effect, and treatment effect on interference. I illustrate the methods through a case study of a social network–based smoking prevention intervention.
Citations: 6
The Problem of Underdetermination in Model Selection
IF 3 Zone 2 (Sociology) Q1 Social Sciences Pub Date: 2018-07-13 DOI: 10.1177/0081175018786762
Michael Schultz
Conventional model selection evaluates models on their ability to represent data accurately, ignoring their dependence on theoretical and methodological assumptions. Drawing on the concept of underdetermination from the philosophy of science, the author argues that uncritical use of methodological assumptions can pose a problem for effective inference. By ignoring the plausibility of assumptions, existing techniques select models that are poor representations of theory and are thus suboptimal for inference. To address this problem, the author proposes a new paradigm for inference-oriented model selection that evaluates models on the basis of a trade-off between model fit and model plausibility. By comparing the fits of sequentially nested models, it is possible to derive an empirical lower bound for the subjective plausibility of assumptions. To demonstrate the effectiveness of this approach, the method is applied to models of the relationship between cultural tastes and network composition.
Citations: 1
We Ran 9 Billion Regressions: Eliminating False Positives through Computational Model Robustness
IF 3 Zone 2 (Sociology) Q1 Social Sciences Pub Date: 2018-07-13 DOI: 10.1177/0081175018777988
John Muñoz, Cristobal Young
False positive findings are a growing problem in many research literatures. We argue that excessive false positives often stem from model uncertainty. There are many plausible ways of specifying a regression model, but researchers typically report only a few preferred estimates. This raises the concern that such research reveals only a small fraction of the possible results and may easily lead to nonrobust, false positive conclusions. It is often unclear how much the results are driven by model specification and how much the results would change if a different plausible model were used. Computational model robustness analysis addresses this challenge by estimating all possible models from a theoretically informed model space. We use large-scale random noise simulations to show (1) the problem of excess false positive errors under model uncertainty and (2) that computational robustness analysis can identify and eliminate false positives caused by model uncertainty. We also draw on a series of empirical applications to further illustrate issues of model uncertainty and estimate instability. Computational robustness analysis offers a method for relaxing modeling assumptions and improving the transparency of applied research.
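The idea of estimating every model in a model space can be illustrated with a minimal sketch. It is not the authors' software or procedure: it assumes the model space is simply every subset of a set of candidate control variables in an OLS regression, and the names `ols_coef_and_se` and `robustness_sweep` are hypothetical helpers introduced here for illustration.

```python
import itertools

import numpy as np


def ols_coef_and_se(X, y):
    """Return OLS coefficients and conventional (homoskedastic) standard errors."""
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = (resid @ resid) / dof
    se = np.sqrt(np.diag(sigma2 * xtx_inv))
    return beta, se


def robustness_sweep(y, focal, controls):
    """Fit one OLS model per subset of control columns (2**k models in total).

    Returns a list of (subset, focal_coefficient, focal_standard_error).
    Assumption: the model space is "all subsets of controls", nothing more.
    """
    n, k = controls.shape
    results = []
    for r in range(k + 1):
        for subset in itertools.combinations(range(k), r):
            cols = controls[:, np.array(subset, dtype=int)]
            X = np.column_stack([np.ones(n), focal, cols])
            beta, se = ols_coef_and_se(X, y)
            # Index 1 is the focal variable (index 0 is the intercept).
            results.append((subset, beta[1], se[1]))
    return results


if __name__ == "__main__":
    rng = np.random.default_rng(0)  # simulated data, purely illustrative
    n = 500
    focal = rng.normal(size=n)
    controls = rng.normal(size=(n, 4))
    y = 2.0 * focal + rng.normal(size=n)
    sweep = robustness_sweep(y, focal, controls)
    estimates = np.array([b for _, b, _ in sweep])
    print(len(sweep), estimates.min(), estimates.max())
```

Looking at the full spread of the focal coefficient across all fitted models, rather than at one preferred specification, is the kind of summary the abstract has in mind. The number of models doubles with each added control, which is how a count in the billions can arise.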
Citations: 37
Estimating Income Statistics from Grouped Data: Mean-constrained Integration over Brackets
IF 3 Zone 2 (Sociology) Q1 Social Sciences Pub Date: 2018-07-09 DOI: 10.1177/0081175018782579
P. Jargowsky, Christopher A. Wheeler
Researchers studying income inequality, economic segregation, and other subjects must often rely on grouped data—that is, data in which thousands or millions of observations have been reduced to counts of units by specified income brackets. The distribution of households within the brackets is unknown, and highest incomes are often included in an open-ended top bracket, such as “$200,000 and above.” Common approaches to this estimation problem include calculating midpoint estimators with an assumed Pareto distribution in the top bracket and fitting a flexible multiple-parameter distribution to the data. The authors describe a new method, mean-constrained integration over brackets (MCIB), that is far more accurate than those methods using only the bracket counts and the overall mean of the data. On the basis of an analysis of 297 metropolitan areas, MCIB produces estimates of the standard deviation, Gini coefficient, and Theil index that are correlated at 0.997, 0.998, and 0.991, respectively, with the parameters calculated from the underlying individual record data. Similar levels of accuracy are obtained for percentiles of the distribution and the shares of income by quintiles of the distribution. The technique can easily be extended to other distributional parameters and inequality statistics.
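For orientation, the common midpoint approach mentioned in the abstract (not MCIB itself, whose details are not given here) can be sketched as follows. The sketch assumes each closed bracket is represented by its midpoint and the open-ended top bracket by the mean of a Pareto distribution whose shape parameter is estimated from the top two brackets. The function names and the bracket counts in the example are hypothetical.

```python
import math


def pareto_alpha(lower_prev, lower_top, count_prev, count_top):
    """Estimate the Pareto shape parameter from the top two brackets."""
    return (math.log(count_prev + count_top) - math.log(count_top)) / (
        math.log(lower_top) - math.log(lower_prev)
    )


def midpoint_mean(lower_bounds, counts):
    """Approximate the mean income from grouped data.

    lower_bounds: ascending lower bounds of each bracket; the last bracket
                  is treated as open-ended.
    counts:       number of units in each bracket.
    """
    if len(lower_bounds) != len(counts) or len(counts) < 2:
        raise ValueError("need matching bounds and counts for >= 2 brackets")
    alpha = pareto_alpha(lower_bounds[-2], lower_bounds[-1], counts[-2], counts[-1])
    if alpha <= 1:
        # The Pareto mean is undefined for alpha <= 1; stopping here is this
        # sketch's choice, not part of any published procedure.
        raise ValueError("estimated Pareto shape <= 1; top-bracket mean undefined")
    top_mean = lower_bounds[-1] * alpha / (alpha - 1)
    # Closed brackets are represented by their midpoints.
    reps = [(lo + hi) / 2 for lo, hi in zip(lower_bounds[:-1], lower_bounds[1:])]
    reps.append(top_mean)
    return sum(r * c for r, c in zip(reps, counts)) / sum(counts)


if __name__ == "__main__":
    # Hypothetical brackets: [0, 50k), [50k, 100k), [100k, 200k), 200k and above.
    bounds = [0, 50_000, 100_000, 200_000]
    counts = [400, 400, 150, 50]
    print(midpoint_mean(bounds, counts))
```

Because this estimator ignores where households fall within each bracket, its accuracy depends heavily on bracket width and on the Pareto assumption for the top bracket, which is the weakness the abstract says MCIB addresses by also using the overall mean.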
Citations: 11