Using First Name Information to Improve Race and Ethnicity Classification

Ioan Voicu
Statistics and Public Policy, vol. 5, no. 1, pp. 1-13
Pub Date: 2016-02-22
DOI: 10.1080/2330443X.2018.1427012
IF 1.5, JCR Q2, SOCIAL SCIENCES, MATHEMATICAL METHODS
Citations: 29

Abstract

This article uses a recent first name list to develop an improvement to an existing Bayesian classifier, namely the Bayesian Improved Surname Geocoding (BISG) method, which combines surname and geography information to impute missing race/ethnicity. The new Bayesian Improved First Name Surname Geocoding (BIFSG) method is validated using a large sample of mortgage applicants who self-report their race/ethnicity. BIFSG outperforms BISG, in terms of accuracy and coverage, for all major racial/ethnic categories. Although the overall magnitude of improvement is somewhat small, the largest improvements occur for non-Hispanic Blacks, a group for which the BISG performance is weakest. When estimating the race/ethnicity effects on mortgage pricing and underwriting decisions with regression models, estimation biases from both BIFSG and BISG are very small, with BIFSG generally having smaller biases, and the maximum a posteriori classifier resulting in smaller biases than through use of estimated probabilities. Robustness checks using voter registration data confirm BIFSG's improved performance vis-a-vis BISG and illustrate BIFSG's applicability to areas other than mortgage lending. Finally, I demonstrate an application of the BIFSG to the imputation of missing race/ethnicity in the Home Mortgage Disclosure Act data, and in the process, offer novel evidence that the incidence of missing race/ethnicity information is correlated with race/ethnicity.
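To make the idea of combining surname, first name, and geography concrete, here is a minimal sketch of a naive-Bayes style update of the kind the abstract describes. It assumes that first name and geography are conditionally independent of surname given race/ethnicity, which may differ in detail from the paper's exact specification. The function names, the category labels `group_a` and `group_b`, and all probabilities are hypothetical and purely illustrative; they are not taken from the paper or from any real name list.

```python
from typing import Dict

Probs = Dict[str, float]


def bifsg_posterior(p_race_given_surname: Probs,
                    p_first_given_race: Probs,
                    p_geo_given_race: Probs) -> Probs:
    """Combine three sources of evidence into posterior category probabilities.

    Assumed form (a sketch, not necessarily the paper's exact formula):
        P(r | surname, first, geo) is proportional to
        P(r | surname) * P(first | r) * P(geo | r)
    """
    unnormalized = {
        race: p_race_given_surname[race]
        * p_first_given_race[race]
        * p_geo_given_race[race]
        for race in p_race_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no category has positive probability")
    # Normalize so the posterior probabilities sum to one.
    return {race: value / total for race, value in unnormalized.items()}


def map_category(posterior: Probs) -> str:
    """Maximum a posteriori rule: pick the most probable category."""
    return max(posterior, key=posterior.get)


# Made-up inputs for one hypothetical individual.
p_race_given_surname = {"group_a": 0.5, "group_b": 0.5}
p_first_given_race = {"group_a": 0.2, "group_b": 0.1}
p_geo_given_race = {"group_a": 0.1, "group_b": 0.1}

posterior = bifsg_posterior(p_race_given_surname,
                            p_first_given_race,
                            p_geo_given_race)
print(posterior)
print(map_category(posterior))
```

In this toy case the surname alone is uninformative, and the first name shifts the posterior toward `group_a`. Passing a constant first-name likelihood would reduce the update to a surname-plus-geography calculation in the spirit of BISG. The `map_category` helper corresponds to the maximum a posteriori classifier mentioned in the abstract, as opposed to carrying the full probability vector into downstream regressions.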
Source journal
Statistics and Public Policy (SOCIAL SCIENCES, MATHEMATICAL METHODS)
CiteScore: 3.20
Self-citation rate: 6.20%
Articles published: 13
Review time: 32 weeks