Mudança semântica e word embeddings: estudos de caso na diacronia do português / Semantic change and word embeddings: case studies on the diachrony of Portuguese

Revista de Estudos da Linguagem (LANGUAGE & LINGUISTICS, IF 0.2) · Pub Date: 2022-10-06 · DOI: 10.17851/2237-2083.30.4.2043-2086
Lucas Lage, Evandro Cunha

Abstract

According to Givón (2001), the lexicon is a repository of concepts that are relatively stable over time, socially shared and well encoded. These concepts are organized in a network in which similar concepts are grouped next to each other. On a similar note, the lexicographer Georges Matoré proposes associative relationships between words and defines the concepts of notional field and testimonial words, which are organizational elements of the lexicon. Computational techniques such as word embeddings, which represent words as vectors in a vector space, make it possible to analyze groupings of words based on their semantic features. This paper explores the viability of such methods for the study of semantic change. The occurrences of the word forms “deus”, “homem”, “mulher”, “pai”, “mae” and “terra” were analyzed in the Tycho Brahe corpus for Portuguese. Word embeddings were created using the Skip-gram algorithm, and visualizations of a semantic feature network were created for each word in three different time slices. The generated visualizations provide evidence of the semantic organization of the lexicon and of its reorganization.
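As a rough illustration of the Skip-gram step named in the abstract, the sketch below shows how (target, context) training pairs can be extracted separately for each time slice. It is not the authors' pipeline: the abstract does not state the window size, the tokenization or the software used, so the function names `skipgram_pairs` and `pairs_by_slice`, the `window` values and the toy sentences are all assumptions made for illustration. A full Skip-gram model would go on to train vectors that predict each context word from its target; only the pair extraction is shown here.

```python
from collections import defaultdict


def skipgram_pairs(tokens, window=2):
    """Return (target, context) pairs for every token in `tokens`.

    Each token is paired with the tokens at most `window` positions
    to its left and right (the token itself is excluded).
    """
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs


def pairs_by_slice(corpus, window=2):
    """Group training pairs by time slice.

    `corpus` maps a slice label to a list of tokenized sentences
    (a hypothetical layout, chosen only for this sketch).
    """
    grouped = defaultdict(list)
    for slice_label, sentences in corpus.items():
        for sentence in sentences:
            grouped[slice_label].extend(skipgram_pairs(sentence, window))
    return dict(grouped)


# Invented toy sentences, NOT taken from the Tycho Brahe corpus.
toy_corpus = {
    "slice_1": [["o", "homem", "teme", "a", "deus"]],
    "slice_2": [["a", "terra", "do", "pai"]],
}

pairs = pairs_by_slice(toy_corpus, window=1)
for label, slice_pairs in pairs.items():
    print(label, slice_pairs)
```

Keeping the pairs separate per slice means that one embedding model can be trained per period, so that the neighbourhood of a word such as “terra” can be compared across periods.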