Using word embeddings to investigate human psychology: Methods and applications

H. Bao, Zi-Xi Wang, Xi Cheng, Zhan Su, Ying-Hong Yang, Guang-Yao Zhang, Bo Wang, Hua-Jian Cai
Journal: 心理科学进展 (Advances in Psychological Science). Published: 2023-01-01. DOI: https://doi.org/10.3724/sp.j.1042.2023.00887
Citations: 1

Abstract

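Word embedding, a basic technique in natural language processing (NLP), represents a word with a low-dimensional, dense, and continuous numeric vector (i.e., a word vector). Word embeddings can be obtained by using neural network algorithms to predict words from their surrounding words or vice versa (Word2Vec and FastText), or from words' co-occurrence probabilities (GloVe), in large-scale text corpora. The values along the dimensions of a word vector thus encode the pattern of how a word can be predicted in context, and so carry much of its semantic information. Word embeddings can therefore be used for semantic analyses of text. In recent years, word embeddings have been rapidly adopted to study human psychology, including human semantic processing, cognitive judgment, individual divergent thinking (creativity), group-level social cognition, and sociocultural change. We have developed the R package "PsychWordVec" to help researchers use and analyze word embeddings in a tidy approach. Future research using word embeddings should (1) distinguish between implicit and explicit components of social cognition, (2) train fine-grained word vectors in terms of time and region to facilitate cross-temporal and cross-cultural research, and (3) deepen and expand the application of contextualized word embeddings and large pre-trained language models such as GPT and BERT.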
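To make the idea of "semantic analysis with word vectors" concrete, the sketch below compares words by the cosine similarity of their vectors. This is an illustration only, not code from the paper or from PsychWordVec: the abstract does not name a similarity measure, so cosine similarity is an assumption (it is a common choice), and the four-dimensional vectors and the helper names `cosine_similarity` and `most_similar` are invented for the example. Real embeddings would be loaded from a trained Word2Vec, FastText, or GloVe model and typically have hundreds of dimensions.

```python
import math

# Toy 4-dimensional vectors, invented for illustration only.
# They are NOT trained embeddings and carry no empirical meaning.
TOY_VECTORS = {
    "doctor": [0.9, 0.1, 0.3, 0.0],
    "nurse": [0.8, 0.2, 0.4, 0.1],
    "banana": [0.0, 0.9, 0.1, 0.7],
}


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def most_similar(word, vectors):
    """Rank every other word in `vectors` by similarity to `word`, highest first."""
    target = vectors[word]
    scores = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for other, score in most_similar("doctor", TOY_VECTORS):
        print(f"doctor vs {other}: {score:.3f}")
```

With these toy values, "nurse" ranks above "banana" as a neighbor of "doctor", which mirrors how words that are predictable from similar contexts end up with similar vectors in trained models.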