Journal of Quantitative Linguistics: Latest Articles

Statistical Analysis of the Tables in Mahadevan’s Concordance of the Indus Valley Script
IF 1.4 | Zone 2 (Literature) | LANGUAGE & LINGUISTICS | Pub Date: 2019-01-02 | DOI: 10.1080/09296174.2017.1406294
M. Oakes
Abstract The Indus Script originates from the culture known as the Indus Valley Civilization, which flourished from approximately 2600 to 1900 BC. Several thousand objects bearing these signs have been found over a wide area of Northern India and Pakistan. In 1977, Iravatham Mahadevan published a concordance of all of the scripts that had been discovered so far. Accompanying the concordance is a set of nine tables showing the distribution of individual signs by position, archaeological site, object type, field symbol (accompanying image), and direction of writing. Analysis of the frequencies of the signs found so far, using Large Numbers of Rare Events (LNRE) models, estimated the total vocabulary of the language, including signs not yet found, to be about 857 signs. All the tables were analysed using Pearson’s residuals, and it was found that the signs were not randomly distributed: some showed statistically significant associations with position, object, field symbol or direction of writing. A more detailed analysis of the relation between signs and field symbols was made using correspondence analysis, which showed that certain signs were associated with the unicorn symbol, while others were associated with the gharial and dotted circle symbols.
Journal of Quantitative Linguistics, vol. 26, pp. 22–47.
Citations: 0
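The Pearson residuals mentioned in the Oakes abstract above can be illustrated with a minimal sketch. It assumes a plain two-way table of counts and the textbook definition (observed minus expected, divided by the square root of expected); the toy counts and the function name `pearson_residuals` are hypothetical and are not taken from the paper or from Mahadevan's tables.

```python
from math import sqrt


def pearson_residuals(table):
    """Return Pearson residuals (O - E) / sqrt(E) for a two-way table of counts.

    Assumes every row and column total is positive, so no expected count is zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        residual_row = []
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            residual_row.append((observed - expected) / sqrt(expected))
        residuals.append(residual_row)
    return residuals


# Hypothetical counts: rows are two signs, columns are initial / medial / final position.
toy_counts = [[30, 10, 5], [5, 10, 30]]
residuals = pearson_residuals(toy_counts)
for row in residuals:
    print([round(value, 2) for value in row])
```

Cells with a large positive residual are over-represented relative to independence, and cells with a large negative residual are under-represented. How the paper turns residuals into significance decisions is not stated in the abstract.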
Levels of Statistical Use in Applied Linguistics Research Articles: From 1986 to 2015
IF 1.4 | Zone 2 (Literature) | LANGUAGE & LINGUISTICS | Pub Date: 2019-01-02 | DOI: 10.1080/09296174.2017.1421498
Reza Khany, Khalil Tazik
Abstract The main objective of this study is to assess the levels of statistical use (basic, intermediate, and advanced) in Applied Linguistics research articles over the past three decades (from 1986 to 2015). The corpus included 4079 quantitative and mixed-methods studies published in ten prominent journals of Applied Linguistics. The articles were analysed and the statistical techniques used were aggregated by the two present authors and four PhD students in TEFL. Results showed that descriptive statistics (40.04%) were by far the most commonly used technique, followed by one-way ANOVA (14.91%), t-test (10.15%), and Pearson correlation (8.76%). Regarding the sophistication level of statistical use, about 78.77% (n = 4686) of the techniques were classified as basic, 14.49% (n = 862) as intermediate, and 6.74% (n = 401) as advanced. Clearly, most of the techniques were either basic or intermediate, with a significantly higher percentage for the former. So, a person with basic knowledge of statistics could understand 69.03% of the papers published from 1986 to 2015. The authors argue that researchers should keep up to date with recent statistical knowledge if they wish to understand the statistics in research articles published in Applied Linguistics journals.
Journal of Quantitative Linguistics, vol. 26, pp. 48–65.
Citations: 18
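The percentages reported in the Khany and Tazik abstract above can be checked from the three counts it gives. The short sketch below only redoes that arithmetic; the variable names are illustrative.

```python
# Technique counts by sophistication level, as reported in the abstract.
counts = {"basic": 4686, "intermediate": 862, "advanced": 401}

# Total number of technique uses across the corpus.
total = sum(counts.values())

# Share of each level, as a percentage rounded to two decimals.
shares = {level: round(100 * n / total, 2) for level, n in counts.items()}

print(total)   # 5949
print(shares)  # {'basic': 78.77, 'intermediate': 14.49, 'advanced': 6.74}
```

The shares are computed over 5949 technique uses, not over the 4079 articles, which is why the separate figure of 69.03% of papers does not have to match the 78.77% of techniques.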
Menzerath-Altmann Law and Prothetic /v/ in Spoken Czech
IF 1.4 | Zone 2 (Literature) | LANGUAGE & LINGUISTICS | Pub Date: 2019-01-02 | DOI: 10.1080/09296174.2018.1424493
Ján Mačutek, J. Chromý, M. Koščová
Abstract This paper first discusses the Menzerath-Altmann law in general and then shows that the law is valid in spoken Czech. In particular, the relation between word length (measured in the number of syllables) and the mean syllable length (measured in the number of phonemes) is investigated. In addition, we model the relation between word length in syllables and the relative occurrence of prothetic /v/ in words and word stems which, according to the official norms of the Czech language, begin with the phoneme /o/.
Journal of Quantitative Linguistics, vol. 26, pp. 66–80.
Citations: 8
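As a rough illustration of the relation studied in the abstract above, the sketch below fits the commonly used truncated form of the Menzerath-Altmann law, y = a * x^b, by least squares on log-transformed values. This is an assumption about the functional form: the abstract does not say which variant or fitting procedure the authors used. The data are synthetic and the name `fit_power_law` is hypothetical.

```python
from math import exp, log


def fit_power_law(lengths, mean_sizes):
    """Fit y = a * x**b by ordinary least squares on (log x, log y)."""
    xs = [log(x) for x in lengths]
    ys = [log(y) for y in mean_sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                   # slope in log-log space
    a = exp(mean_y - b * mean_x)    # intercept transformed back
    return a, b


# Synthetic data: word length in syllables vs. mean syllable length in phonemes,
# generated from a = 3.0 and b = -0.2 purely for demonstration.
word_lengths = [1, 2, 3, 4, 5]
mean_syllable_lengths = [3.0 * x ** -0.2 for x in word_lengths]

a, b = fit_power_law(word_lengths, mean_syllable_lengths)
print(round(a, 3), round(b, 3))
```

A negative fitted `b` corresponds to the usual reading of the law: the longer the construct, the shorter its constituents on average.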
The Stylometric Impacts of Ageing and Life Events on Identity
IF 1.4 | Zone 2 (Literature) | LANGUAGE & LINGUISTICS | Pub Date: 2019-01-02 | DOI: 10.1080/09296174.2017.1405719
D. Kernot, T. Bossomaier, R. Bradbury
Abstract Using data containing stylometric markers for depression and Alzheimer’s disease, the 45 novels of Iris Murdoch and P.D. James are examined to see if a signature of an individual, their personality, changes over time due to life events and natural ageing. We use variants of the critical slowing down 1-lag autocorrelation and coefficient of skewness techniques with a multivariate identity measure, RPAS, to visualize these changes. We find that life events such as depression, anxiety, and Alzheimer’s disease might be identified outside of natural ageing through a tipping point phenomenon. We believe these techniques might be a useful self-help tool to aid in signalling depressive episodes (for example, to help avert suicide) and in the early identification of Alzheimer’s disease, or for law enforcement personnel monitoring terrorists on watch lists.
Journal of Quantitative Linguistics, vol. 26, pp. 1–21.
Citations: 10
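The two early-warning indicators named in the abstract above, lag-1 autocorrelation and the coefficient of skewness, can be sketched in their standard textbook forms over a sliding window. The authors describe using variants of these together with the RPAS measure, which this sketch does not attempt to reproduce; the function names and the window helper are assumptions made for illustration.

```python
def lag1_autocorrelation(series):
    """Standard lag-1 autocorrelation of a numeric sequence (needs non-zero variance)."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((x - mean) ** 2 for x in series)
    numerator = sum((series[t] - mean) * (series[t + 1] - mean) for t in range(n - 1))
    return numerator / denominator


def skewness(series):
    """Moment coefficient of skewness m3 / m2**1.5 (population moments)."""
    n = len(series)
    mean = sum(series) / n
    m2 = sum((x - mean) ** 2 for x in series) / n
    m3 = sum((x - mean) ** 3 for x in series) / n
    return m3 / m2 ** 1.5


def rolling(series, window, statistic):
    """Apply a statistic to every consecutive window of the given length."""
    return [statistic(series[i:i + window]) for i in range(len(series) - window + 1)]


# Hypothetical per-novel scores in publication order.
scores = [1, 2, 3, 4, 5, 6]
print(rolling(scores, 5, lag1_autocorrelation))
print(rolling(scores, 5, skewness))
```

In the critical-slowing-down literature, a sustained rise in windowed autocorrelation or a shift in skewness is read as a possible sign of an approaching tipping point.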
Is the Menzerath-Altmann Law Specific to Certain Languages in Certain Registers?
IF 1.4 | Zone 2 (Literature) | LANGUAGE & LINGUISTICS | Pub Date: 2018-10-18 | DOI: 10.1080/09296174.2018.1532158
Lirong Xu, Lianzhen He
ABSTRACT Since its formulation, the Menzerath-Altmann law (MAL) has gone through continuing validation and development when applied to different languages or different language units. However, whether the MAL still holds true irrespective of spoken or written register remains a controversial issue. This article endeavours to re-examine the MAL by investigating the correlation between the length of English sentences (measured in the number of clauses) and their constituting clause length (measured in the number of words) in both academic spoken and written registers. It is observed that the MAL is valid in both registers. Further, the fitted parameter values of the MAL can serve as good predictors for register differentiation.
Journal of Quantitative Linguistics, vol. 27, pp. 187–203.
Citations: 9
Dynamic Lexical Features of PhD Theses across Disciplines: A Text Mining Approach
IF 1.4 | Zone 2 (Literature) | LANGUAGE & LINGUISTICS | Pub Date: 2018-10-15 | DOI: 10.1080/09296174.2018.1531618
Wei Xiao, S. Sun
ABSTRACT This study employed a text mining method to investigate the lexical features of PhD theses, and how those features change dynamically, across the natural sciences, social sciences and humanities. Four quantitative indices, i.e. TTR, h-point, R1 and writer’s view, were employed to analyze 150 PhD theses (50 theses from each discipline). Although h-point and writer’s view were found, counter-intuitively, to show insignificant variation across disciplines, the results of TTR and R1 did reveal sharp contrasts between theses in humanities and natural sciences. While the second half of humanities theses showed a significantly higher level of lexical diversity, indicated by higher TTR, theses in natural sciences tended to be richer in content words in the first half, indicated by a higher R1. Meanwhile, theses in social sciences seemed to be more moderate, with features lying in the middle position. This study has implications not only for the widening of applications of quantitative linguistic methods but also for academic writing (especially PhD thesis writing) instruction and practice.
Journal of Quantitative Linguistics, vol. 27, pp. 114–133.
Citations: 14
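Two of the four indices named in the abstract above, TTR and h-point, are simple enough to sketch. The version below assumes the usual definitions: TTR as distinct types divided by tokens, and h-point as the rank at which rank equals frequency in the rank-frequency list, with linear interpolation between the two neighbouring ranks when no exact match exists. Tokenisation by whitespace and the fallback for lists that never cross the diagonal are simplifications introduced here, not details from the paper.

```python
from collections import Counter


def type_token_ratio(tokens):
    """Number of distinct word types divided by the number of tokens."""
    return len(set(tokens)) / len(tokens)


def h_point(tokens):
    """Rank r where frequency f(r) equals r, interpolated if there is no exact match."""
    frequencies = sorted(Counter(tokens).values(), reverse=True)
    for rank, freq in enumerate(frequencies, start=1):
        if freq == rank:
            return float(rank)
        if freq < rank:
            # Interpolate between the previous rank (f1 > r1) and this one (f2 < r2).
            r1, f1 = rank - 1, frequencies[rank - 2]
            r2, f2 = rank, freq
            return (f1 * r2 - f2 * r1) / (r2 - r1 + f1 - f2)
    # Assumed fallback: every frequency exceeds its rank, so no crossing exists.
    return float(len(frequencies))


sample = "a a a a a b b b b c".split()
print(type_token_ratio(sample))
print(h_point(sample))
```

Because TTR depends strongly on text length, comparisons like the first-half versus second-half contrast in the abstract are only meaningful between segments of similar size.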
Quantitative Analysis of Dependency Structures
IF 1.4 | Zone 2 (Literature) | LANGUAGE & LINGUISTICS | Pub Date: 2018-10-05 | DOI: 10.1080/09296174.2018.1558835
Yalan Wang
Russkom Jazyke vs Problemy I Rešenija) and an unnecessary mixing of transliteration rules (s, sh), and in the Cataloguing in Publication (CIP) one finds an illegible writing of the Russian names of the editors, which is surely not the fault of the editors. In sum, this volume gives a good overview of the current state of the art in empirical linguistics, relying heavily on the particular subset of statistical methods applied. However, the title of the volume is at least partly misleading, since any links to current studies on Russian and quantitative Russian linguistics are missing. The focus is clearly placed on (statistical) corpus linguistics, wherein – as this volume shows – statistical methods can doubtlessly be applied fruitfully, albeit mainly embedded as an inductive tool with which to obtain general empirical tendencies. Taking into account the rich tradition of Russian quantitative linguistics – represented by outstanding scholars such as R. G. Piotrovskij, M. V. Arapov, J. A. Tuldava and J. K. Krylov, among many others – this volume seems to represent a ‘new’ beginning of the application of quantitative methods of Russian, but by no means of post-Soviet and Russian linguistics in general.
Journal of Quantitative Linguistics, vol. 27, pp. 83–91.
Citations: 9
A Statistical Explanation of the Distribution of Sortal Classifiers in Languages of the World via Computational Classifiers
IF 1.4 | Zone 2 (Literature) | LANGUAGE & LINGUISTICS | Pub Date: 2018-10-01 | DOI: 10.1080/09296174.2018.1523777
One-Soon Her, Marc Allassonnière-Tang
ABSTRACT Previous studies demonstrate that morphosyntactic plural markers and the structure of numeral systems have individually strong predictive power with regard to the usage of sortal classifiers in languages. We use these two factors as explanatory variables to train the computational classifier of random forests and evaluate the accuracy of their predictive power when selecting the existence/absence of sortal classifiers as response variable. Our results show that these two factors result in an excellent discrimination performance of random forests, even when taking into account sortal classifiers as an areal feature. However, the correlation between morphosyntactic plural markers and multiplicative bases is weaker than the correlation between sortal classifiers and plural markers plus multiplicative bases. We are thus able to provide novel insights with regard to probabilistic universals on sortal classifiers, and suggest an innovative cross-disciplinary approach to test the effect of implicational universals with computational methods.
Journal of Quantitative Linguistics, vol. 27, pp. 93–113.
Citations: 9
Readability Analysis of Bengali Literary Texts
IF 1.4 | Zone 2 (Literature) | LANGUAGE & LINGUISTICS | Pub Date: 2018-09-24 | DOI: 10.1080/09296174.2018.1499456
Shanta Phani, S. Lahiri, A. Biswas
ABSTRACT In this paper we propose a set of novel regression models for readability scoring in the Bengali language, which can also be used for Hindi, making use of several lexical, surface-level, syntactic and semantic features. We perform 5-fold and leave-one-out cross-validation on a human-annotated gold standard dataset of 30 passages, written by 4 eminent Bengali litterateurs. On this dataset, our best model achieves a mean squared error (MSE) of 57%, which is better than state-of-the-art results (73% MSE). We further perform feature analysis to identify potentially useful features in learning a regression model for Bengali readability. Ablation studies indicate the importance of compound characters (Juktakkhors) in readability assessment.
Journal of Quantitative Linguistics, vol. 26, pp. 287–305.
Citations: 3
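The leave-one-out evaluation mentioned in the abstract above can be sketched with a deliberately small stand-in model. The example assumes a single numeric feature and an ordinary least-squares line; the paper's actual models use many lexical, syntactic and semantic features that are not reproduced here, and the feature values and scores below are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def leave_one_out_mse(xs, ys):
    """Hold out each passage once, fit on the rest, and average the squared errors."""
    squared_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        prediction = slope * xs[i] + intercept
        squared_errors.append((ys[i] - prediction) ** 2)
    return sum(squared_errors) / len(squared_errors)


# Hypothetical data: mean word length per passage vs. annotated difficulty score.
feature = [3.1, 3.8, 4.2, 4.9, 5.5, 6.0]
score = [1.0, 2.1, 2.4, 3.6, 4.1, 4.8]
print(leave_one_out_mse(feature, score))
```

With only 30 passages, as in the dataset described, leave-one-out uses nearly all the data for each fit, which is one common reason to report it alongside 5-fold results.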
Quantitative aspects of the clause: Length, position and depth of the clause
IF 1.4 | Zone 2 (Literature) | LANGUAGE & LINGUISTICS | Pub Date: 2018-08-21 | DOI: 10.1080/09296174.2018.1491749
Haruko Sanada
ABSTRACT The present study focuses on the quantitative aspects of clauses in relation to the empirical study of valency. We employed the length, the position, and the depth of the clause as the linguistic entities. It can be observed that there are relationships with significant functions among these entities. The relationship between the position and the depth of the clause obeys Köhler’s model, while the relationship between the length and the position of the clause shows functions opposite to those of Köhler’s model. The relationship between the depth and the length of the clause shows a decreasing function. However, length can be affected by other entities. The method of measuring entities, e.g. the position of the clause in the sentence, must be reconsidered.
Journal of Quantitative Linguistics, vol. 26, pp. 306–329.
Citations: 2