Pub Date: 2023-08-01; DOI: 10.1177/00811750231163832
Lisa Avery, Michael Rotondi
Respondent-driven sampling (RDS) is used to measure trait or disease prevalence in populations that are difficult to reach and often marginalized. The authors evaluated the performance of RDS estimators under varying conditions of trait prevalence, homophily, and relative activity. They used large simulated networks (N = 20,000) derived from real-world RDS degree reports and an empirical Facebook network (N = 22,470) to evaluate estimators of binary and categorical trait prevalence. Variability in prevalence estimates is higher when network degree is drawn from real-world samples than from the commonly assumed Poisson distribution, resulting in lower coverage rates. Newer estimators perform well when the sample is a substantial proportion of the population, but bias is present when the population size is unknown. The choice of preferred RDS estimator needs to be study-specific, considering both statistical properties and knowledge of the population under study.
{"title":"Evaluation of Respondent-Driven Sampling Prevalence Estimators Using Real-World Reported Network Degree.","authors":"Lisa Avery, Michael Rotondi","doi":"10.1177/00811750231163832","DOIUrl":"https://doi.org/10.1177/00811750231163832","url":null,"abstract":"<p><p>Respondent-driven sampling (RDS) is used to measure trait or disease prevalence in populations that are difficult to reach and often marginalized. The authors evaluated the performance of RDS estimators under varying conditions of trait prevalence, homophily, and relative activity. They used large simulated networks (<i>N</i> = 20,000) derived from real-world RDS degree reports and an empirical Facebook network (<i>N</i> = 22,470) to evaluate estimators of binary and categorical trait prevalence. Variability in prevalence estimates is higher when network degree is drawn from real-world samples than from the commonly assumed Poisson distribution, resulting in lower coverage rates. Newer estimators perform well when the sample is a substantive proportion of the population, but bias is present when the population size is unknown. The choice of preferred RDS estimator needs to be study specific, considering both statistical properties and knowledge of the population under study.</p>","PeriodicalId":48140,"journal":{"name":"Sociological Methodology","volume":"53 2","pages":"269-287"},"PeriodicalIF":3.0,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/23/b9/10.1177_00811750231163832.PMC10338697.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10302746","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-07-17; DOI: 10.1177/00811750231183711
S. Park, Suyeon Kang, Chioun Lee
Causal decomposition analysis is among the rapidly growing number of tools for identifying factors (“mediators”) that contribute to disparities in outcomes between social groups. An example of such a mediator is college completion, which explains later health disparities between Black women and White men. The goal is to quantify how much a disparity would be reduced (or would remain) if we hypothetically intervened to set the mediator distribution equal across social groups. Despite increasing interest in estimating disparity reduction and the disparity that remains, the various estimation procedures are not straightforward, and researchers have scant guidance for choosing an optimal method. In this article, the authors evaluate the performance, in terms of bias, variance, and coverage, of three approaches that use different modeling strategies: (1) regression-based methods, which impose restrictive modeling assumptions (e.g., linearity), and (2) weighting-based and (3) imputation-based methods, which rely on the observed distribution of variables. The authors find a trade-off between the modeling assumptions a method requires and its performance. In terms of performance, regression-based methods operate best as long as the restrictive assumption of linearity is met. Methods relying on mediator models without imposing any modeling assumptions are sensitive to the ratio of the group-mediator association to the mediator-outcome association. These results highlight the importance of selecting an estimation procedure appropriate to the data at hand.
{"title":"Choosing an Optimal Method for Causal Decomposition Analysis with Continuous Outcomes: A Review and Simulation Study","authors":"S. Park, Suyeon Kang, Chioun Lee","doi":"10.1177/00811750231183711","DOIUrl":"https://doi.org/10.1177/00811750231183711","url":null,"abstract":"Causal decomposition analysis is among the rapidly growing number of tools for identifying factors (“mediators”) that contribute to disparities in outcomes between social groups. An example of such mediators is college completion, which explains later health disparities between Black women and White men. The goal is to quantify how much a disparity would be reduced (or remain) if we hypothetically intervened to set the mediator distribution equal across social groups. Despite increasing interest in estimating disparity reduction and the disparity that remains, various estimation procedures are not straightforward, and researchers have scant guidance for choosing an optimal method. In this article, the authors evaluate the performance in terms of bias, variance, and coverage of three approaches that use different modeling strategies: (1) regression-based methods that impose restrictive modeling assumptions (e.g., linearity) and (2) weighting-based and (3) imputation-based methods that rely on the observed distribution of variables. The authors find a trade-off between the modeling assumptions required in the method and its performance. In terms of performance, regression-based methods operate best as long as the restrictive assumption of linearity is met. Methods relying on mediator models without imposing any modeling assumptions are sensitive to the ratio of the group-mediator association to the mediator-outcome association. These results highlight the importance of selecting an appropriate estimation procedure considering the data at hand.","PeriodicalId":48140,"journal":{"name":"Sociological Methodology","volume":" ","pages":""},"PeriodicalIF":3.0,"publicationDate":"2023-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42569427","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-07-11; DOI: 10.1177/00811750231184460
O. Aksoy, S. Yıldırım
The flow of resources across nodes over time (e.g., migration, financial transfers, peer-to-peer interactions) is a common phenomenon in sociology. Standard statistical methods are inadequate to model such interdependent flows. We propose a hierarchical Dirichlet-multinomial regression model and a Bayesian estimation method. We apply the model to analyze 25,632,876 migration instances that took place between Turkey’s 81 provinces from 2009 to 2018. We then discuss the methodological and substantive implications of our results. Methodologically, we demonstrate the predictive advantage of our model compared to its most common alternative in migration research, the gravity model. We also discuss our model in the context of other approaches, mostly developed in the social networks literature. Substantively, we find that population, economic prosperity, the spatial and political distance between the origin and destination, the strength of the AKP (Justice and Development Party) in a province, and the network characteristics of the provinces are important predictors of migration, whereas the proportion of ethnic minority Kurds in a province has no positive association with in- and out-migration.
{"title":"A Model of Dynamic Flows: Explaining Turkey’s Interprovincial Migration","authors":"O. Aksoy, S. Yıldırım","doi":"10.1177/00811750231184460","DOIUrl":"https://doi.org/10.1177/00811750231184460","url":null,"abstract":"The flow of resources across nodes over time (e.g., migration, financial transfers, peer-to-peer interactions) is a common phenomenon in sociology. Standard statistical methods are inadequate to model such interdependent flows. We propose a hierarchical Dirichlet-multinomial regression model and a Bayesian estimation method. We apply the model to analyze 25,632,876 migration instances that took place between Turkey’s 81 provinces from 2009 to 2018. We then discuss the methodological and substantive implications of our results. Methodologically, we demonstrate the predictive advantage of our model compared to its most common alternative in migration research, the gravity model. We also discuss our model in the context of other approaches, mostly developed in the social networks literature. Substantively, we find that population, economic prosperity, the spatial and political distance between the origin and destination, the strength of the AKP (Justice and Development Party) in a province, and the network characteristics of the provinces are important predictors of migration, whereas the proportion of ethnic minority Kurds in a province has no positive association with in- and out-migration.","PeriodicalId":48140,"journal":{"name":"Sociological Methodology","volume":" ","pages":""},"PeriodicalIF":3.0,"publicationDate":"2023-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48567827","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-06-15; DOI: 10.1177/00811750231177026
Satu Helske, Jouni Helske, Guilherme K. Chihaya
Sequence analysis is increasingly used in the social sciences for the holistic analysis of life-course and other longitudinal data. The usual approach is to construct sequences, calculate dissimilarities, group similar sequences with cluster analysis, and use cluster membership as a dependent or independent variable in a regression model. This approach may be problematic, as cluster memberships are assumed to be fixed known characteristics of the subjects in subsequent analyses. Furthermore, it is often more reasonable to assume that individual sequences are mixtures of multiple ideal types rather than equal members of some group. Failing to account for uncertain and mixed memberships may lead to wrong conclusions about the nature of the studied relationships. In this article, the authors bring forward and discuss the problems of the “traditional” use of sequence analysis clusters as variables and compare four approaches for creating explanatory variables from sequence dissimilarities using different types of data. The authors conduct simulation and empirical studies, demonstrating the importance of considering how sequences and outcomes are related and the need to adjust analyses accordingly. In many typical social science applications, the traditional approach is prone to result in wrong conclusions, and similarity-based approaches such as representativeness should be preferred.
{"title":"From Sequences to Variables: Rethinking the Relationship between Sequences and Outcomes","authors":"Satu Helske, Jouni Helske, Guilherme K. Chihaya","doi":"10.1177/00811750231177026","DOIUrl":"https://doi.org/10.1177/00811750231177026","url":null,"abstract":"Sequence analysis is increasingly used in the social sciences for the holistic analysis of life-course and other longitudinal data. The usual approach is to construct sequences, calculate dissimilarities, group similar sequences with cluster analysis, and use cluster membership as a dependent or independent variable in a regression model. This approach may be problematic, as cluster memberships are assumed to be fixed known characteristics of the subjects in subsequent analyses. Furthermore, it is often more reasonable to assume that individual sequences are mixtures of multiple ideal types rather than equal members of some group. Failing to account for uncertain and mixed memberships may lead to wrong conclusions about the nature of the studied relationships. In this article, the authors bring forward and discuss the problems of the “traditional” use of sequence analysis clusters as variables and compare four approaches for creating explanatory variables from sequence dissimilarities using different types of data. The authors conduct simulation and empirical studies, demonstrating the importance of considering how sequences and outcomes are related and the need to adjust analyses accordingly. In many typical social science applications, the traditional approach is prone to result in wrong conclusions, and similarity-based approaches such as representativeness should be preferred.","PeriodicalId":48140,"journal":{"name":"Sociological Methodology","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134890272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-05-20; DOI: 10.1177/00811750231169729
Maik Hamjediers, Maximilian Sprengholz
Decompositions make it possible to investigate whether gaps between groups in certain outcomes would remain if the groups had comparable characteristics. In practice, however, such counterfactual comparability is difficult to establish when common support is lacking, the functional form is misspecified, or the sample size is insufficient. In this article, the authors show how decompositions can be undermined by these three interrelated issues by comparing the results of a regression-based Kitagawa-Blinder-Oaxaca decomposition and matching decompositions applied to simulated and real-world data. The results show that matching decompositions are robust to issues of common support and functional-form misspecification but demand a large number of observations. Kitagawa-Blinder-Oaxaca decompositions provide consistent estimates even for smaller samples but require assumptions about model specification and, when common support is lacking, model-based extrapolation. The authors recommend beginning any decomposition with a matching approach to assess potential problems of common support and misspecification.
{"title":"Comparing the Incomparable? Issues of Lacking Common Support, Functional-Form Misspecification, and Insufficient Sample Size in Decompositions","authors":"Maik Hamjediers, Maximilian Sprengholz","doi":"10.1177/00811750231169729","DOIUrl":"https://doi.org/10.1177/00811750231169729","url":null,"abstract":"Decompositions make it possible to investigate whether gaps between groups in certain outcomes would remain if groups had comparable characteristics. In practice, however, such a counterfactual comparability is difficult to establish in the presence of lacking common support, functional-form misspecification, and insufficient sample size. In this article, the authors show how decompositions can be undermined by these three interrelated issues by comparing the results of a regression-based Kitagawa-Blinder-Oaxaca decomposition and matching decompositions applied to simulated and real-world data. The results show that matching decompositions are robust to issues of common support and functional-form misspecification but demand a large number of observations. Kitagawa-Blinder-Oaxaca decompositions provide consistent estimates also for smaller samples but require assumptions for model specification and, when common support is lacking, for model-based extrapolation. The authors recommend that any decomposition benefits from using a matching approach first to assess potential problems of common support and misspecification.","PeriodicalId":48140,"journal":{"name":"Sociological Methodology","volume":"53 1","pages":"344 - 365"},"PeriodicalIF":3.0,"publicationDate":"2023-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42433140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-05-11; DOI: 10.1177/00811750231169726
Angelo Moretti
Large-scale sample surveys are not designed to produce reliable estimates for small areas. Here, small area estimation methods can be applied to estimate population parameters of target variables at detailed geographic scales. Small area estimation for noncontinuous variables is a topic of great interest in the social sciences, where such variables are common. Generalized linear mixed models are widely adopted in the literature. Interestingly, the small area estimation literature shows that multivariate small area estimators, in which correlations among outcome variables are taken into account, produce more efficient estimates than do traditional univariate techniques. In this article, the author evaluates a multivariate small area estimator based on a joint mixed model in which a small area proportion and the mean of a continuous variable are estimated simultaneously. Using this method, the author “borrows strength” across response variables. The author carried out a design-based simulation study to evaluate the approach, in which the indicators under study are income and a binary monetary poverty indicator. The author found that the multivariate approach produces more efficient small area estimates than does the univariate modeling approach. The method can be extended to a large variety of indicators based on social surveys.
{"title":"Multivariate Small Area Estimation of Social Indicators: The Case of Continuous and Binary Variables","authors":"Angelo Moretti","doi":"10.1177/00811750231169726","DOIUrl":"https://doi.org/10.1177/00811750231169726","url":null,"abstract":"Large-scale sample surveys are not designed to produce reliable estimates for small areas. Here, small area estimation methods can be applied to estimate population parameters of target variables to detailed geographic scales. Small area estimation for noncontinuous variables is a topic of great interest in the social sciences where such variables can be found. Generalized linear mixed models are widely adopted in the literature. Interestingly, the small area estimation literature shows that multivariate small area estimators, where correlations among outcome variables are taken into account, produce more efficient estimates than do the traditional univariate techniques. In this article, the author evaluate a multivariate small area estimator on the basis of a joint mixed model in which a small area proportion and mean of a continuous variable are estimated simultaneously. Using this method, the author “borrows strength” across response variables. The author carried out a design-based simulation study to evaluate the approach where the indicators object of study are the income and a monetary poverty (binary) indicator. The author found that the multivariate approach produces more efficient small area estimates than does the univariate modeling approach. The method can be extended to a large variety of indicators on the basis of social surveys.","PeriodicalId":48140,"journal":{"name":"Sociological Methodology","volume":"53 1","pages":"323 - 343"},"PeriodicalIF":3.0,"publicationDate":"2023-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48197993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-04-25; DOI: 10.1177/00811750231163833
G. Ritschard, T. Liao, E. Struffolino
Multidomain/multichannel sequence analysis has become widely used in social science research to uncover the underlying relationships between two or more observed trajectories in parallel. For example, life-course researchers use multidomain sequence analysis to study the parallel unfolding of multiple life-course domains. In this article, the authors conduct a critical review of the approaches most used in multidomain sequence analysis. The parallel unfolding of trajectories in multiple domains is typically analyzed by building a joint multidomain typology and by examining how domain-specific sequence patterns combine with one another within the multidomain groups. The authors identify four strategies to construct the joint multidomain typology: proceeding independently of domain costs and distances between domain sequences, deriving multidomain costs from domain costs, deriving distances between multidomain sequences from within-domain distances, and combining typologies constructed for each domain. The second and third strategies are prevalent in the literature and typically proceed additively. The authors show that these additive procedures assume between-domain independence, and they make explicit the constraints these procedures impose on between-multidomain costs and distances. Regarding the fourth strategy, the authors propose a merging algorithm to avoid scarce combined types. As regards the first strategy, the authors demonstrate, with a real example based on data from the Swiss Household Panel, that using edit distances with data-driven costs at the multidomain level (i.e., independent of domain costs) remains easily manageable with more than 200 different multidomain combined states. In addition, the authors introduce strategies to enhance visualization by types and domains.
{"title":"Strategies for Multidomain Sequence Analysis in Social Research","authors":"G. Ritschard, T. Liao, E. Struffolino","doi":"10.1177/00811750231163833","DOIUrl":"https://doi.org/10.1177/00811750231163833","url":null,"abstract":"Multidomain/multichannel sequence analysis has become widely used in social science research to uncover the underlying relationships between two or more observed trajectories in parallel. For example, life-course researchers use multidomain sequence analysis to study the parallel unfolding of multiple life-course domains. In this article, the authors conduct a critical review of the approaches most used in multidomain sequence analysis. The parallel unfolding of trajectories in multiple domains is typically analyzed by building a joint multidomain typology and by examining how domain-specific sequence patterns combine with one another within the multidomain groups. The authors identify four strategies to construct the joint multidomain typology: proceeding independently of domain costs and distances between domain sequences, deriving multidomain costs from domain costs, deriving distances between multidomain sequences from within-domain distances, and combining typologies constructed for each domain. The second and third strategies are prevalent in the literature and typically proceed additively. The authors show that these additive procedures assume between-domain independence, and they make explicit the constraints these procedures impose on between-multidomain costs and distances. Regarding the fourth strategy, the authors propose a merging algorithm to avoid scarce combined types. As regards the first strategy, the authors demonstrate, with a real example based on data from the Swiss Household Panel, that using edit distances with data-driven costs at the multidomain level (i.e., independent of domain costs) remains easily manageable with more than 200 different multidomain combined states. In addition, the authors introduce strategies to enhance visualization by types and domains.","PeriodicalId":48140,"journal":{"name":"Sociological Methodology","volume":"53 1","pages":"288 - 322"},"PeriodicalIF":3.0,"publicationDate":"2023-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46683423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-04-10; DOI: 10.1177/00811750231160781
Jackelyn Hwang, Nikhil Naik
Analysis of neighborhood environments is important for understanding inequality. Few studies, however, use direct measures of the visible characteristics of neighborhood conditions, despite their theorized importance in shaping individual and community well-being, because collecting data on the physical conditions of places across neighborhoods and cities and over time has required extensive time and labor. The authors introduce systematic social observation at scale (SSO@S), a pipeline for using visual data, crowdsourcing, and computer vision to identify visible characteristics of neighborhoods at a large scale. The authors implement SSO@S on millions of street-level images across three physically distinct cities—Boston, Detroit, and Los Angeles—from 2007 to 2020 to identify trash across space and over time. The authors evaluate the extent to which this approach can be used to assist with systematic coding of street-level imagery through cross-validation and out-of-sample validation, class-activation mapping, and comparisons with other sources of observed neighborhood characteristics. The SSO@S approach produces estimates with high reliability that correlate with some expected demographic characteristics but not others, depending on the city. The authors conclude with an assessment of this approach for measuring visible characteristics of neighborhoods and the implications for methods and research.
{"title":"Systematic Social Observation at Scale: Using Crowdsourcing and Computer Vision to Measure Visible Neighborhood Conditions","authors":"Jackelyn Hwang, Nikhil Naik","doi":"10.1177/00811750231160781","DOIUrl":"https://doi.org/10.1177/00811750231160781","url":null,"abstract":"Analysis of neighborhood environments is important for understanding inequality. Few studies, however, use direct measures of the visible characteristics of neighborhood conditions, despite their theorized importance in shaping individual and community well-being, because collecting data on the physical conditions of places across neighborhoods and cities and over time has required extensive time and labor. The authors introduce systematic social observation at scale (SSO@S), a pipeline for using visual data, crowdsourcing, and computer vision to identify visible characteristics of neighborhoods at a large scale. The authors implement SSO@S on millions of street-level images across three physically distinct cities—Boston, Detroit, and Los Angeles—from 2007 to 2020 to identify trash across space and over time. The authors evaluate the extent to which this approach can be used to assist with systematic coding of street-level imagery through cross-validation and out-of-sample validation, class-activation mapping, and comparisons with other sources of observed neighborhood characteristics. The SSO@S approach produces estimates with high reliability that correlate with some expected demographic characteristics but not others, depending on the city. The authors conclude with an assessment of this approach for measuring visible characteristics of neighborhoods and the implications for methods and research.","PeriodicalId":48140,"journal":{"name":"Sociological Methodology","volume":"53 1","pages":"183 - 216"},"PeriodicalIF":3.0,"publicationDate":"2023-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43079457","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-03-28; DOI: 10.1177/00811750231151949
E. Fosse, Christopher Winship
In a widely influential essay, Ryder argued that to understand social change, researchers should compare cohort careers, contrasting how different cohorts change over the life cycle with respect to some outcome. Ryder, however, provided few technical details on how to actually conduct a cohort analysis. In this article, the authors develop a framework for analyzing temporally structured data grounded in the construction, comparison, and decomposition of cohort careers. The authors begin by illustrating how one can analyze age-period-cohort (APC) data by constructing graphs of cohort careers. Although a useful starting point, the major problem with this approach is that the graphs are typically of sufficient complexity that it can be difficult, if not impossible, to discern the underlying trends and patterns in the data. To provide a more useful foundation for cohort analysis, the authors therefore introduce three distinct improvements over the purely graphical approach. First, they provide a mathematical definition of a cohort career, demonstrating how the underlying parameters of interest can be estimated using a reparameterized version of the conventional APC model. The authors call this the life cycle and social change (LC-SC) model. Second, they contrast the proposed model with two alternative three-factor APC models and all logically possible two-factor models, showing that none of these other models are adequate for fully representing Ryder’s ideas. Third, the authors present the article’s major accomplishment: using the LC-SC model, they show how a collection of cohort careers can be decomposed into just four basic components: a curve representing an overall intracohort trend (or life cycle change); a curve representing an overall intercohort trend (or social change); a set of common cross-period temporal fluctuations that permit variability across cohort careers; and, finally, a set of terms representing cell-specific heterogeneity (or, equivalently, interactions among age, period, and/or cohort). As the authors demonstrate, these parts can be reassembled into simpler versions of cohort careers, revealing underlying trends and patterns that may not be evident otherwise. The authors illustrate this approach by analyzing trends in political party strength in the General Social Survey.
{"title":"The Anatomy of Cohort Analysis: Decomposing Comparative Cohort Careers","authors":"E. Fosse, Christopher Winship","doi":"10.1177/00811750231151949","DOIUrl":"https://doi.org/10.1177/00811750231151949","url":null,"abstract":"In a widely influential essay, Ryder argued that to understand social change, researchers should compare cohort careers, contrasting how different cohorts change over the life cycle with respect to some outcome. Ryder, however, provided few technical details on how to actually conduct a cohort analysis. In this article, the authors develop a framework for analyzing temporally structured data grounded in the construction, comparison, and decomposition of cohort careers. The authors begin by illustrating how one can analyze age-period-cohort (APC) data by constructing graphs of cohort careers. Although a useful starting point, the major problem with this approach is that the graphs are typically of sufficient complexity that it can be difficult, if not impossible, to discern the underlying trends and patterns in the data. To provide a more useful foundation for cohort analysis, the authors therefore introduce three distinct improvements over the purely graphical approach. First, they provide a mathematical definition of a cohort career, demonstrating how the underlying parameters of interest can be estimated using a reparameterized version of the conventional APC model. The authors call this the life cycle and social change (LC-SC) model. Second, they contrast the proposed model with two alternative three-factor APC models and all logically possible two-factor models, showing that none of these other models are adequate for fully representing Ryder’s ideas. Third, the authors present the article’s major accomplishment: using the LC-SC model, they show how a collection of cohort careers can be decomposed into just four basic components: a curve representing an overall intracohort trend (or life cycle change); a curve representing an overall intercohort trend (or social change); a set of common cross-period temporal fluctuations that permit variability across cohort careers; and, finally, a set of terms representing cell-specific heterogeneity (or, equivalently, interactions among age, period, and/or cohort). As the authors demonstrate, these parts can be reassembled into simpler versions of cohort careers, revealing underlying trends and patterns that may not be evident otherwise. The authors illustrate this approach by analyzing trends in political party strength in the General Social Survey.","PeriodicalId":48140,"journal":{"name":"Sociological Methodology","volume":"53 1","pages":"217 - 268"},"PeriodicalIF":3.0,"publicationDate":"2023-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46970228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-01-13; DOI: 10.1177/00811750221147525
Taylor Lewis, Joseph McMichael, Charlotte Looby
Most addresses on modern address-based sampling frames derived from the U.S. Postal Service’s Computerized Delivery Sequence file have a one-to-one relationship with a household. Some addresses, however, are associated with multiple households. These addresses are referred to as drop points, and the households therein are referred to as drop point units (DPUs). DPUs pose a challenge for self-administered surveys because no apartment number or unit designation is available, making it impossible to send targeted correspondence. The authors evaluate a method for substituting sampled DPUs with similar non-DPUs, which was implemented in the 2021 Healthy Chicago Survey alongside a concurrent survey of the originally sampled DPUs. Comparing aggregate distributions of DPUs and the non-DPU substitutes, the authors observe certain differences with respect to age, employment status, marital status, and housing tenure but no substantive differences in key health outcomes measured by the survey.
{"title":"Evaluating Substitution as a Strategy for Handling U.S. Postal Service Drop Points in Self-Administered Address-Based Sampling Frame Surveys","authors":"Taylor Lewis, Joseph McMichael, Charlotte Looby","doi":"10.1177/00811750221147525","DOIUrl":"https://doi.org/10.1177/00811750221147525","url":null,"abstract":"Most addresses on modern address-based sampling frames derived from the U.S. Postal Service’s Computerized Delivery Sequence file have a one-to-one relationship with a household. Some addresses, however, are associated with multiple households. These addresses are referred to as drop points, and the households therein are referred to as drop point units (DPUs). DPUs pose a challenge for self-administered surveys because no apartment number or unit designation is available, making it impossible to send targeted correspondence. The authors evaluate a method for substituting sampled DPUs with similar non-DPUs, which was implemented in the 2021 Healthy Chicago Survey alongside a concurrent survey of the originally sampled DPUs. Comparing aggregate distributions of DPUs and the non-DPU substitutes, the authors observe certain differences with respect to age, employment status, marital status, and housing tenure but no substantive differences in key health outcomes measured by the survey.","PeriodicalId":48140,"journal":{"name":"Sociological Methodology","volume":"53 1","pages":"158 - 175"},"PeriodicalIF":3.0,"publicationDate":"2023-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42127373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}