Early Evidence of the Pareto Principle in Grammatical Distribution: Causative Situations in Chinese Conversational Discourse

IF 0.2 Zone 3 Literature 0 ASIAN STUDIES Journal of Chinese Linguistics Pub Date: 2022-07-12 DOI: 10.1353/jcl.2022.0017
Danjie Su
Journal Article. Journal of Chinese Linguistics, volume 50 1, pages 443 - 474. https://doi.org/10.1353/jcl.2022.0017
Citations: 0

Early Evidence of the Pareto Principle in Grammatical Distribution: Causative Situations in Chinese Conversational Discourse
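ABSTRACT: This study is an initial report on the Pareto distribution (the 80/20 rule) of grammatical constructions; namely, about 20% of the types of grammatical constructions for causative situations account for about 80% of the uses in conversation. I use a data-driven approach to investigate the grammatical constructions that Chinese L1 speakers choose in spontaneous talk show conversations to describe causative situations. I identify two specific Pareto distributional patterns. 1) The distribution of all 22 constructions for causative situations constitutes a Pareto ABC diagram, with the A-class (ba-; unmarked passive; rang-; bei-; resultative; gei-) containing 27.3% of the types but accounting for 88.8% of all the 1,497 uses. 2) Most uses of a grammatical construction come from a small set of subtypes: the full ba- accounts for 87.9% of all ba- uses; the reduced bei- accounts for 86.8%; 37.5% of rang- subtypes account for 84.2%. These patterns can be explained by the Lens concept. I conclude that a few constructions account for most grammatical choices of L1 Chinese speakers in conversation. Understanding these grammatical distributions in natural discourse can improve the efficiency and efficacy of language teaching and Natural Language Processing (NLP).

Abstract (translated from the Chinese): This study is the first report on the Pareto distribution (the 80/20 rule) of grammatical constructions in natural conversation: about 20% of grammatical construction types account for 80% of all actual uses describing causative situations. Based on spontaneous talk show conversation data, it uses a data-driven approach to exhaustively investigate which grammatical constructions native Chinese speakers choose to describe causative situations in conversation. The specific findings on the Pareto distribution are as follows. (1) The distribution of all 22 Chinese grammatical constructions for causative situations in conversation reflects the Pareto principle and its ABC class distribution. The A-class contains 27.3% of the 22 construction types but accounts for 88.8% of all 1,497 uses. The highest-frequency constructions in the A-class are, in order: the ba-construction, the unmarked passive, the rang-construction, the bei-construction, the resultative complement, and the gei-construction. The B-class likewise contains 27.3% of the types but accounts for only 8.9% of all uses. The C-class contains nearly half of the types (45.5%) but only 2.3% of all uses. (2) Most uses of a grammatical construction come from a few subtypes: the full ba-construction accounts for 87.9% of all ba- uses; the reduced bei-construction accounts for 86.8% of all bei- uses; 37.5% of rang- subtypes account for 84.2% of all rang- uses. The Lens theory can explain these distributional patterns. The study concludes that native Chinese speakers in natural conversation use a small number of construction types to describe the vast majority of causative situations. This finding further reveals the distribution of grammatical constructions in natural discourse and is of direct reference value for language teaching and natural language processing.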
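The ABC figures reported in the abstract can be cross-checked with simple arithmetic. The sketch below is only an illustration, not the author's procedure: it takes the reported percentages (27.3% / 27.3% / 45.5% of types; 88.8% / 8.9% / 2.3% of uses) and the total of 22 construction types, and infers how many types each class would contain. The class sizes for B and C are an assumption derived from the percentages; the abstract lists the members of the A-class only. The names `reported`, `type_share`, and `inferred_sizes` are hypothetical.

```python
# Figures reported in the abstract (English and Chinese versions).
TOTAL_TYPES = 22
A_CLASS = ["ba-", "unmarked passive", "rang-", "bei-", "resultative", "gei-"]

reported = {
    # class: (share of construction types in %, share of uses in %)
    "A": (27.3, 88.8),
    "B": (27.3, 8.9),
    "C": (45.5, 2.3),
}


def type_share(n_types: int, total: int = TOTAL_TYPES) -> float:
    """Percentage of construction types, rounded to one decimal place."""
    return round(100 * n_types / total, 1)


# Infer class sizes from the reported type shares.
# Assumption: B and C sizes are not stated in the abstract; they are back-computed here.
inferred_sizes = {
    name: round(type_pct / 100 * TOTAL_TYPES)
    for name, (type_pct, _) in reported.items()
}

cumulative_uses = 0.0
for name, (type_pct, use_pct) in reported.items():
    cumulative_uses += use_pct
    size = inferred_sizes[name]
    print(
        f"{name}-class: {size} types ({type_share(size)}% of types), "
        f"{use_pct}% of uses, cumulative {cumulative_uses:.1f}%"
    )
```

Under these assumptions the inferred sizes are consistent with the six constructions listed for the A-class and with the stated total of 22 types, and the three use shares sum to 100%.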
Source journal
CiteScore: 0.40
Self-citation rate: 0.00%
Articles published: 34
About the journal: Journal of Chinese Linguistics (JCL) is an academic journal that publishes research in both general linguistics and Chinese linguistics. It is edited by a distinguished editorial board of international expertise. There are two publications: Journal of Chinese Linguistics (JCL) and Journal of Chinese Linguistics Monograph Series (JCLMS).
Latest articles from this journal
The origin of the adjectival and adverbial mulaolao in Wu Chinese
From analogies to negativity: Pragmatic functions and stance expression of subjective counterfactual ruguo sentence. (In Chinese)
Word Frequency Modulates the Selection of Semantic Access Pathways of Spoken Words in the Second Language
The Resumptive View of the Cantonese Dummy keoi5 Revisited
Phonation types and morpho-phonological structure as linguistic prerequisites of tonogenesis