Statistics and Public Policy: Latest Articles

NAICS Code Prediction Using Supervised Methods
IF 1.6 Q2 Mathematics Pub Date : 2022-01-24 DOI: 10.1080/2330443X.2022.2033654
C. Oehlert, Evan T. Schulz, Anne Parker
Abstract When compiling industry statistics or selecting businesses for further study, researchers often rely on North American Industry Classification System (NAICS) codes. However, codes are self-reported on tax forms, and reporting incorrect codes or even leaving the code blank has no tax consequences, so they are often unusable. The IRS's Statistics of Income (SOI) program validates NAICS codes for businesses in the statistical samples used to produce official tax statistics for various filing populations, including sole proprietorships (those filing Form 1040 Schedule C) and corporations (those filing Forms 1120). In this article we leverage these samples to explore ways to improve NAICS code reporting for all filers in the relevant populations. For sole proprietorships, we overcame several record linkage complications to combine data from SOI samples with other administrative data. Using the SOI-validated NAICS code values as ground truth, we trained classification-tree-based models (randomForest) to predict NAICS industry sector from other tax return data, including text descriptions, for businesses which did or did not initially report a valid NAICS code. For both sole proprietorships and corporations, we were able to improve slightly on the accuracy of valid self-reported industry sector and correctly identify sector for over half of businesses with no informative reported NAICS code.
Citations: 3
Reconciling Evaluations of the Millennium Villages Project
IF 1.6 Q2 Mathematics Pub Date : 2021-12-17 DOI: 10.1080/2330443X.2021.2019152
A. Gelman, Shira Mitchell, J. Sachs, S. Sachs
Abstract The Millennium Villages Project was an integrated rural development program carried out for a decade in 10 clusters of villages in sub-Saharan Africa starting in 2005, and in a few other sites for shorter durations. An evaluation of the 10 main sites compared to retrospectively chosen control sites estimated positive effects on a range of economic, social, and health outcomes (Mitchell et al. 2018). More recently, an outside group performed a prospective controlled (but also nonrandomized) evaluation of one of the shorter-duration sites and reported smaller or null results (Masset et al. 2020). Although these two conclusions seem contradictory, the differences can be explained by the fact that Mitchell et al. studied 10 sites where the project was implemented for 10 years, and Masset et al. studied one site with a program lasting less than 5 years, as well as differences in inference and framing. Insights from both evaluations should be valuable in considering future development efforts of this sort. Both studies are consistent with a larger picture of positive average impacts (compared to untreated villages) across a broad range of outcomes, but with effects varying across sites or requiring an adequate duration for impacts to be manifested.
Citations: 0
Graphical Measures Summarizing the Inequality of Income of Two Groups
IF 1.6 Q2 Mathematics Pub Date : 2021-12-13 DOI: 10.1080/2330443X.2021.2016084
Joshua Landon, Joseph Gastwirth
Abstract Recently, Gastwirth proposed two transformations and of the Lorenz curve, which calculates the proportion of a population, cumulated from the poorest or middle, respectively, needed to have the same amount of income as top . Economists and policy makers are often interested in the comparative status of two groups, for example, females versus males or minority versus majority. This article adapts and extends the concept underlying the and curves to provide analogous curves comparing the relative status of two groups. Now one calculates the proportion of the minority group, cumulated from the bottom or middle needed to have the same total income as the top qth fraction of the majority group (after adjusting for sample size). The areas between these curves and the line of equality are analogous to the Gini index. The methodology is used to illustrate the change in the degree of inequality between males and females, as well as between black and white males, in the United States between 2000 and 2017, and can be used to examine disparities between the expenditures on health of minorities and white people.
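The abstract above compares the areas between its curves and the line of equality to the Gini index. As background only, here is a minimal sketch of the standard empirical Lorenz curve and Gini index. It does not implement the transformed curves the article proposes, and the function names and sample incomes are hypothetical.

```python
from typing import List, Sequence, Tuple


def lorenz_curve(incomes: Sequence[float]) -> List[Tuple[float, float]]:
    """Return points (p, L(p)) of the empirical Lorenz curve, starting at (0, 0).

    p is the cumulative population share (poorest first) and L(p) is the
    share of total income held by that poorest fraction.
    """
    ordered = sorted(incomes)
    total = sum(ordered)
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, income in enumerate(ordered, start=1):
        running += income
        points.append((i / n, running / total))
    return points


def gini_index(incomes: Sequence[float]) -> float:
    """Twice the area between the line of equality and the Lorenz curve.

    The area under the piecewise-linear Lorenz curve is computed with the
    trapezoid rule.
    """
    pts = lorenz_curve(incomes)
    area_under = 0.0
    for (p0, l0), (p1, l1) in zip(pts, pts[1:]):
        area_under += (p1 - p0) * (l0 + l1) / 2
    return 1 - 2 * area_under


# Hypothetical incomes, for illustration only.
equal_incomes = [1, 1, 1, 1]
concentrated_incomes = [0, 0, 0, 4]
print(gini_index(equal_incomes))         # perfectly equal incomes
print(gini_index(concentrated_incomes))  # one person holds all income
```

With equal incomes the Lorenz curve coincides with the line of equality and the index is 0; the more income is concentrated at the top, the larger the area and the index.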
Citations: 0
Estimating Local Prevalence of Obesity Via Survey Under Cost Constraints: Stratifying ZCTAs in Virginia’s Thomas Jefferson Health District
IF 1.6 Q2 Mathematics Pub Date : 2021-12-10 DOI: 10.1080/2330443X.2021.2016083
Benjamin J. Lobo, D. Bonds, K. Kafadar
Abstract Currently, the most reliable estimate of the prevalence of obesity in Virginia’s Thomas Jefferson Health District (TJHD) comes from an annual telephone survey conducted by the Centers for Disease Control and Prevention. This district-wide estimate has limited use to decision makers who must target health interventions at a more granular level. A survey is one way of obtaining more granular estimates. This article describes the process of stratifying targeted geographic units (here, ZIP Code Tabulation Areas, or ZCTAs) prior to conducting the survey for those situations where cost considerations make it infeasible to sample each geographic unit (here, ZCTA) in the region (here, TJHD). Feature selection, allocation factor analysis, and hierarchical clustering were used to stratify ZCTAs. We describe the survey sampling strategy that we developed, by creating strata of ZCTAs; the data analysis using the R survey package; and the results. The resulting maps of obesity prevalence show stark differences in prevalence depending on the area of the health district, highlighting the importance of assessing health outcomes at a granular level. Our approach is a detailed and reproducible set of steps that can be used by others who face similar scenarios. Supplementary files for this article are available online.
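To illustrate why stratification matters for the estimate, the sketch below computes a textbook stratified prevalence estimate, weighting each stratum's sample proportion by its population share. It is a simplification that assumes simple random sampling of individuals within each stratum; it is not the authors' sampling design or their R survey package analysis, and all names and numbers are hypothetical.

```python
from typing import Iterable, Tuple

# Each stratum is (population_size, sample_size, cases_in_sample).
Stratum = Tuple[int, int, int]


def stratified_prevalence(strata: Iterable[Stratum]) -> float:
    """Population-weighted average of the per-stratum sample proportions."""
    strata = list(strata)
    total_population = sum(pop for pop, _, _ in strata)
    estimate = 0.0
    for pop, sample_size, cases in strata:
        weight = pop / total_population   # stratum's share of the population
        proportion = cases / sample_size  # prevalence observed in the sample
        estimate += weight * proportion
    return estimate


# Hypothetical strata, not TJHD data.
example_strata = [
    (1000, 50, 10),   # 20% prevalence in the sample, 25% of the population
    (3000, 100, 40),  # 40% prevalence in the sample, 75% of the population
]
print(round(stratified_prevalence(example_strata), 4))
```

The unweighted pooled sample proportion here would be 50/150, about 0.33, while the weighted estimate is 0.35, because the smaller stratum is oversampled relative to its population share.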
Citations: 0
The effect of COVID-19 vaccinations on self-reported depression and anxiety during February 2021
IF 1.6 Q2 Mathematics Pub Date : 2021-10-13 DOI: 10.1080/2330443x.2023.2190008
M. Rubinstein, A. Haviland, J. Breslau
Using the COVID-19 Trends and Impacts Survey (CTIS), we examine the effect of COVID-19 vaccinations on (self-reported) feelings of depression and anxiety ("depression"), isolation, and worries about health, among vaccine-accepting survey respondents during February 2021. Assuming no unmeasured confounding, we estimate that vaccinations caused a -4.3 (-4.7, -3.8), -3.4 (-3.9, -2.9), and -4.8 (-5.4, -4.1) percentage point change in these outcomes, respectively. We further argue that these effects provide a lower bound on the mental health burden of the pandemic, implying that the COVID-19 pandemic was responsible for at least a 28.6 (25.3, 31.9) percent increase in feelings of depression and a 20.5 (17.3, 23.6) percent increase in feelings of isolation during February 2021 among vaccine-accepting CTIS survey respondents. We also posit a model where vaccinations affect depression through worries about health and feelings of isolation, and estimate the proportion mediated by each pathway. We find that feelings of social isolation are the stronger mediator, accounting for 41.0 (37.3, 44.7) percent of the total effect, while worries about health account for 9.4 (7.6, 11.1) percent of the total effect. We caution that the causal interpretation of these findings rests on strong assumptions. Nevertheless, as the pandemic continues, policymakers should also target interventions aimed at managing the substantial mental health burden associated with the COVID-19 pandemic.
Citations: 1
Mathematical Analysis of Redistricting in Utah
IF 1.6 Q2 Mathematics Pub Date : 2021-07-12 DOI: 10.1080/2330443X.2022.2105770
Ann King, Jacob Murri, Jake Callahan, Adrienne Russell, Tyler J. Jarvis
Abstract We discuss difficulties of evaluating partisan gerrymandering in the congressional districts in Utah and the failure of many common metrics in Utah. We explain why the Republican vote share in the least-Republican district (LRVS) is a good indicator of the advantage or disadvantage each party has in the Utah congressional districts. Although the LRVS only makes sense in settings with at most one competitive district, in that setting it directly captures the extent to which a given redistricting plan gives advantage or disadvantage to the Republican and Democratic parties. We use the LRVS to evaluate the most common measures of partisan gerrymandering in the context of Utah’s 2011 congressional districts. We do this by generating large ensembles of alternative redistricting plans using Markov chain Monte Carlo methods. We also discuss the implications of this new metric and our results on the question of whether the 2011 Utah congressional plan was gerrymandered.
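The LRVS described above can be sketched in a few lines. The sketch assumes the share is the two-party Republican share within each district, which the abstract does not spell out, and it uses hypothetical vote counts and district labels rather than Utah data.

```python
from typing import Mapping, Tuple

# Maps a district label to (republican_votes, democratic_votes).
Plan = Mapping[str, Tuple[int, int]]


def least_republican_vote_share(plan: Plan) -> float:
    """Return the smallest Republican two-party vote share across districts.

    Assumption: the share is Republican votes divided by Republican plus
    Democratic votes within each district.
    """
    shares = [rep / (rep + dem) for rep, dem in plan.values()]
    return min(shares)


# Hypothetical four-district plan, for illustration only.
example_plan = {
    "D1": (60, 40),
    "D2": (70, 30),
    "D3": (55, 45),
    "D4": (65, 35),
}
print(round(least_republican_vote_share(example_plan), 2))
```

Applying the same function to every plan in a generated ensemble gives a distribution of LRVS values against which a single plan's value can be compared.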
Citations: 1
Prevalence and Propagation of Fake News
IF 1.6 Q2 Mathematics Pub Date : 2021-06-17 DOI: 10.1080/2330443X.2023.2190368
Banafsheh Behzad, Bhavana Bheem, D. Elizondo, Deyana Marsh, Susan E. Martonosi
In recent years, scholars have raised concerns about the effects that unreliable news, or "fake news," has on our political sphere, and our democracy as a whole. For example, the propagation of fake news on social media is widely believed to have influenced the outcome of national elections, including the 2016 U.S. Presidential Election, and the 2020 COVID-19 pandemic. What drives the propagation of fake news on an individual level, and which interventions could effectively reduce the propagation rate? Our model disentangles bias from truthfulness of an article and examines the relationship between these two parameters and a reader's own beliefs. Using the model, we create policy recommendations for both social media platforms and individual social media users to reduce the spread of untruthful or highly biased news. We recommend that platforms sponsor unbiased truthful news, focus fact-checking efforts on mild to moderately biased news, recommend friend suggestions across the political spectrum, and provide users with reports about the political alignment of their feed. We recommend that individual social media users fact check news that strongly aligns with their political bias and read articles of opposing political bias.
Citations: 1
A Misuse of Statistical Reasoning: The Statistical Arguments Offered by Texas to the Supreme Court in an Attempt to Overturn the Results of the 2020 Election
IF 1.6 Q2 Mathematics Pub Date : 2021-05-25 DOI: 10.1080/2330443X.2022.2050327
W. Miao, Qing Pan, J. Gastwirth
Abstract In December 2020, Texas filed a motion to the U.S. Supreme Court claiming that the four battleground states: Pennsylvania, Georgia, Michigan, and Wisconsin did not conduct their 2020 presidential elections in compliance with the Constitution. Texas supported its motion with a statistical analysis purportedly demonstrating that it was highly improbable that Biden had more votes than Trump in the four battleground states. This article points out that Texas’s claim is logically flawed and the analysis submitted violated several fundamental principles of statistics.
Citations: 1
Changes in Crime Rates during the COVID-19 Pandemic
IF 1.6 Q2 Mathematics Pub Date : 2021-05-19 DOI: 10.1080/2330443X.2022.2071369
Mikaela Meyer, Ahmed Hassafy, G. Lewis, Prasun Shrestha, A. Haviland, D. Nagin
Abstract We estimate changes in the rates of five FBI Part 1 crimes during the 2020 spring COVID-19 pandemic lockdown period and the period after the killing of George Floyd through December 2020. We use weekly crime rate data from 28 of the 70 largest cities in the United States from January 2018 to December 2020. Homicide rates were higher throughout 2020, including during early 2020 prior to March lockdowns. Auto thefts increased significantly during the summer and remainder of 2020. In contrast, robbery and larceny significantly declined during all three post-pandemic periods. Point estimates of burglary rates pointed to a decline for all four periods of 2020, but only the pre-pandemic period was statistically significant. We construct a city-level openness index to examine whether the degree of openness just prior to and during the lockdowns was associated with changing crime rates. Larceny and robbery rates both had a positive and significant association with the openness index implying lockdown restrictions reduced offense rates whereas the other three crime types had no detectable association. While opportunity theory is a tempting post hoc explanation of some of these findings, no single crime theory provides a plausible explanation of all the results. Supplementary materials for this article are available online.
Citations: 11
Rethinking the Funding Line at the Swiss National Science Foundation: Bayesian Ranking and Lottery
IF 1.6 Q2 Mathematics Pub Date : 2021-02-19 DOI: 10.1080/2330443X.2022.2086190
R. Heyard, Manuela Ott, G. Salanti, M. Egger
Abstract Funding agencies rely on peer review and expert panels to select the research deserving funding. Peer review has limitations, including bias against risky proposals or interdisciplinary research. The inter-rater reliability between reviewers and panels is low, particularly for proposals near the funding line. Funding agencies are also increasingly acknowledging the role of chance. The Swiss National Science Foundation (SNSF) introduced a lottery for proposals in the middle group of good but not excellent proposals. In this article, we introduce a Bayesian hierarchical model for the evaluation process. To rank the proposals, we estimate their expected ranks (ER), which incorporate both the magnitude and uncertainty of the estimated differences between proposals. A provisional funding line is defined based on ER and budget. The ER and its credible interval are used to identify proposals with similar quality and credible intervals that overlap with the provisional funding line. These proposals are entered into a lottery. We illustrate the approach for two SNSF grant schemes in career and project funding. We argue that the method could reduce bias in the evaluation process. R code, data and other materials for this article are available online.
Citations: 9