Surprisal and Interference Effects of Case Markers in Hindi Word Order

Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics. Pub Date: 2019-06-01. DOI: 10.18653/v1/W19-2904

Sidharth Ranjan, Sumeet Agarwal, Rajakrishnan Rajkumar


Citations: 5

Abstract

Based on the Production-Distribution-Comprehension (PDC) account of language processing, we formulate two distinct hypotheses about case marking, word order choices and processing in Hindi. Our first hypothesis is that Hindi tends to optimize for processing efficiency at both lexical and syntactic levels. We quantify the role of case markers in this process. For the task of predicting the reference sentence occurring in a corpus (amidst meaning-equivalent grammatical variants) using a machine learning model, surprisal estimates from an artificial version of the language (i.e., Hindi without any case markers) result in lower prediction accuracy compared to natural Hindi. Our second hypothesis is that Hindi tends to minimize interference due to case markers while ordering preverbal constituents. We show that Hindi tends to avoid placing next to each other constituents whose heads are marked by identical case inflections. Our findings adhere to PDC assumptions and we discuss their implications for language production, learning and universals.
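
As a rough illustration of the two quantities the abstract refers to, the sketch below computes sentence-level surprisal with and without case markers and counts adjacent constituents that share a case marker. Everything in it is an assumption made for illustration: the add-one smoothed bigram model is a simple stand-in (the abstract does not say which language models were used), the romanized toy sentences and the marker set `CASE_MARKERS` are hypothetical, and comparing raw surprisal totals is a simplification of the machine learning model the abstract mentions.

```python
import math
from collections import Counter

# Illustrative, non-exhaustive set of romanized Hindi case markers (an assumption).
CASE_MARKERS = {"ne", "ko", "se", "mein", "par", "ka", "ki", "ke"}


def strip_case_markers(tokens):
    """Return the tokens with case markers removed (the 'artificial' variant)."""
    return [t for t in tokens if t not in CASE_MARKERS]


def train_bigram(corpus):
    """Collect context and bigram counts over sentences padded with <s> and </s>."""
    contexts, bigrams = Counter(), Counter()
    vocab = {"</s>"}
    for sent in corpus:
        padded = ["<s>"] + sent + ["</s>"]
        contexts.update(padded[:-1])            # every token that precedes another
        bigrams.update(zip(padded, padded[1:]))
        vocab.update(sent)
    # Reserve one extra vocabulary slot for unseen words.
    return contexts, bigrams, len(vocab) + 1


def sentence_surprisal(tokens, model):
    """Sum of -log2 P(w_i | w_{i-1}) under the add-one smoothed bigram model."""
    contexts, bigrams, vocab_size = model
    padded = ["<s>"] + tokens + ["</s>"]
    total = 0.0
    for prev, word in zip(padded, padded[1:]):
        prob = (bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size)
        total += -math.log2(prob)
    return total


def adjacent_same_case_pairs(case_sequence):
    """Count neighbouring constituents whose heads carry the same case marker.

    `case_sequence` lists one marker (or None for unmarked) per preverbal
    constituent, in linear order.
    """
    return sum(
        1
        for left, right in zip(case_sequence, case_sequence[1:])
        if left is not None and left == right
    )


# Toy training data: two romanized sentences (hypothetical examples).
corpus = [
    ["ram", "ne", "sita", "ko", "kitab", "di"],
    ["sita", "ne", "ram", "ko", "dekha"],
]
natural_model = train_bigram(corpus)
artificial_model = train_bigram([strip_case_markers(s) for s in corpus])

reference = ["ram", "ne", "sita", "ko", "kitab", "di"]
variant = ["sita", "ko", "ram", "ne", "kitab", "di"]  # reordered preverbal constituents

for name, sent in [("reference", reference), ("variant", variant)]:
    natural = sentence_surprisal(sent, natural_model)
    artificial = sentence_surprisal(strip_case_markers(sent), artificial_model)
    print(f"{name}: natural={natural:.2f} bits, no-case={artificial:.2f} bits")
```

In a setup closer to the one described, such surprisal scores would be features for a model that ranks the corpus sentence against its meaning-equivalent variants, and `adjacent_same_case_pairs` would be computed over the case markers on the heads of preverbal constituents in each candidate order.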
