America’s racial framework of superiority and Americanness embedded in natural language
Messi H.J. Lee, Jacob M Montgomery, Calvin K Lai
PNAS Nexus, published 2024-01-02
DOI: 10.1093/pnasnexus/pgad485 (https://doi.org/10.1093/pnasnexus/pgad485)
Citation count: 0
Abstract
America’s racial framework can be summarized using two distinct dimensions: superiority/inferiority and Americanness/foreignness (Zou & Cheryan, 2017). We investigated this framework in a corpus of spoken and written language using word embeddings. Word embeddings place words in a low-dimensional space where words with similar meanings lie close together, allowing researchers to test whether the positions of group and attribute words in the semantic space reflect stereotypes. We trained a word embedding model on the Corpus of Contemporary American English, a corpus of one billion words spanning thirty years and eight text categories, and compared the positions of racial/ethnic groups with respect to superiority and Americanness. We found that America’s racial framework is embedded in American English. We also captured an additional nuance: Asian people were stereotyped as more American than Hispanic people. Together, these results provide empirical evidence that the framework is reflected in the language itself.
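The idea of comparing the positions of group and attribute words can be illustrated with a minimal sketch. The abstract does not specify the authors' scoring procedure, so this is not their method. It is one common way to score a word along a bipolar dimension: mean cosine similarity to one pole minus mean cosine similarity to the other. The vectors are made-up three-dimensional toys, and the names `relative_association`, `group_x`, `group_y`, `attr_pos`, and `attr_neg` are hypothetical placeholders rather than anything taken from the paper.

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def relative_association(target, pole_a, pole_b):
    """Mean cosine similarity to pole_a words minus mean similarity to pole_b words.

    A positive score means the target sits closer to pole_a in the space.
    This scoring rule is an assumption made for illustration only.
    """
    sim_a = np.mean([cosine(target, a) for a in pole_a])
    sim_b = np.mean([cosine(target, b) for b in pole_b])
    return float(sim_a - sim_b)


# Toy 3-d vectors standing in for trained embeddings (invented values).
toy = {
    "group_x": np.array([0.9, 0.1, 0.2]),
    "group_y": np.array([0.1, 0.9, 0.2]),
    "attr_pos": np.array([1.0, 0.0, 0.0]),  # one pole of the dimension
    "attr_neg": np.array([0.0, 1.0, 0.0]),  # the opposite pole
}

score_x = relative_association(toy["group_x"], [toy["attr_pos"]], [toy["attr_neg"]])
score_y = relative_association(toy["group_y"], [toy["attr_pos"]], [toy["attr_neg"]])

print(f"group_x: {score_x:+.3f}")
print(f"group_y: {score_y:+.3f}")
```

In a real analysis the toy dictionary would be replaced by vectors from a model trained on the corpus, and each pole would typically contain several attribute words rather than one, so that no single word drives the score.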