Dynamics of Language Change: The Case of Polish barzo > bardzo

Studies in Polish Linguistics (Q2, Arts and Humanities) | Pub Date: 2021-01-01 | DOI: 10.4467/23005920spl.21.007.14261
R. L. Górski
Citations: 0

Abstract

The paper discusses the benefits and shortcomings of modelling language change with logistic regression, an approach often called the Piotrowski-Altmann law. The approach is illustrated with an isolated change that occurred in Middle Polish, namely barzo > bardzo. The study is based on a historical corpus of Polish consisting of several hundred texts with over 12 million running words. Logistic regression on the entire dataset shows a relatively high goodness of fit, yet some data points, especially close to the end of the process, lie quite far from the idealised trajectory. The author asks to what extent the quality of the corpus affects the model. In an experiment, texts were randomly removed to create smaller corpora containing 90%, 75% and 50% of the texts of the entire set. Because this procedure was repeated 200 times, the distributions of the goodness-of-fit scores can be compared. It turns out that the smaller the corpus, the more varied the goodness of fit, and in some rare cases it is even better than its counterpart for a larger corpus. Still, the larger the corpus, the higher the goodness-of-fit scores tend to be.
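
The resampling experiment described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the author's procedure: the corpus is a synthetic toy (`make_toy_corpus` is hypothetical), the curve is fitted by least squares on logit-transformed shares rather than by a full logistic regression, and R² stands in for the goodness-of-fit score, which the abstract does not name. The Piotrowski-Altmann curve is commonly written p(t) = 1 / (1 + a·e^(-bt)); the slope and intercept below are an equivalent parametrisation.

```python
import math
import random
import statistics

def make_toy_corpus(n_texts=300, seed=0):
    """Hypothetical stand-in for the corpus: one (year, old_count, new_count) per text."""
    rng = random.Random(seed)
    corpus = []
    for _ in range(n_texts):
        year = rng.randint(1500, 1799)
        # Assumed "true" trajectory: midpoint 1650, slope 0.03 per year.
        p_new = 1.0 / (1.0 + math.exp(-0.03 * (year - 1650)))
        n_tokens = rng.randint(1, 40)
        new = sum(rng.random() < p_new for _ in range(n_tokens))
        corpus.append((year, n_tokens - new, new))
    return corpus

def proportions_by_period(corpus, width=25):
    """Pool counts into periods; return sorted (period_midpoint, share_of_new_form)."""
    bins = {}
    for year, old, new in corpus:
        start = year - year % width
        o, n = bins.get(start, (0, 0))
        bins[start] = (o + old, n + new)
    return [(start + width / 2, n / (o + n))
            for start, (o, n) in sorted(bins.items()) if o + n > 0]

def fit_logistic(points, eps=0.005):
    """Fit a line to logit(share) by least squares; return (slope, intercept, r_squared)."""
    if len(points) < 2:
        raise ValueError("need at least two periods to fit a curve")
    xs = [t for t, _ in points]
    ps = [p for _, p in points]
    clipped = [min(max(p, eps), 1 - eps) for p in ps]  # avoid infinite logits
    zs = [math.log(p / (1 - p)) for p in clipped]
    mean_x, mean_z = statistics.mean(xs), statistics.mean(zs)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs)) / sxx
    intercept = mean_z - slope * mean_x
    # R^2 is computed on the share scale, against the fitted logistic curve.
    fitted = [1.0 / (1.0 + math.exp(-(slope * x + intercept))) for x in xs]
    mean_p = statistics.mean(ps)
    ss_res = sum((p - f) ** 2 for p, f in zip(ps, fitted))
    ss_tot = sum((p - mean_p) ** 2 for p in ps)
    return slope, intercept, 1.0 - ss_res / ss_tot

def subsample_scores(corpus, fraction, repeats=200, seed=1):
    """Refit the curve on random subsets of texts; return the list of R^2 scores."""
    rng = random.Random(seed)
    k = round(len(corpus) * fraction)
    scores = []
    for _ in range(repeats):
        sample = rng.sample(corpus, k)  # texts removed at random, without replacement
        scores.append(fit_logistic(proportions_by_period(sample))[2])
    return scores

corpus = make_toy_corpus()
full_slope, full_intercept, full_r2 = fit_logistic(proportions_by_period(corpus))
results = {f: subsample_scores(corpus, f) for f in (0.9, 0.75, 0.5)}
print(f"full corpus: R^2 = {full_r2:.3f}")
for f, scores in results.items():
    print(f"{f:.0%}: mean R^2 = {statistics.mean(scores):.3f}, "
          f"sd = {statistics.stdev(scores):.4f}")
```

Comparing the spread of the scores across the three fractions mirrors the comparison of score distributions mentioned in the abstract; the period width, clipping threshold and seeds are arbitrary choices made for this sketch.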
Source journal: Studies in Polish Linguistics (Arts and Humanities: Language and Linguistics)
CiteScore: 0.50
Self-citation rate: 0.00%
Articles published: 3