
Special Interest Group on Computational Morphology and Phonology Workshop: Latest Papers

Mining linguistic tone patterns with symbolic representation
Pub Date : 2016-08-01 DOI: 10.18653/v1/W16-2001
Shuo Zhang
This paper conceptualizes speech prosody data mining and its potential application in data-driven phonology/phonetics research. We first conceptualize Speech Prosody Mining (SPM) in a time-series data mining framework. Specifically, we propose using efficient symbolic representations for speech prosody time-series similarity computation. We experiment with both symbolic and numeric representations and distance measures in a series of time-series classification and clustering experiments on a dataset of Mandarin tones. Evaluation results show that symbolic representation performs comparably with other representations at a reduced cost, which enables us to efficiently mine large speech prosody corpora while opening up the possibility of using a wide range of algorithms that require discrete-valued data. We discuss the potential of SPM using time-series mining techniques in future work.
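To make the idea of a symbolic representation of a pitch contour concrete, here is a minimal sketch. The abstract does not name the representation used, so the SAX-style scheme below (z-normalization, piecewise averaging, then binning against standard-normal breakpoints) is an assumption, as are the function names, the alphabet size of 4, and the toy pitch values.

```python
import statistics

# Breakpoints that cut a standard normal distribution into 4 equiprobable
# regions; this is the usual choice for a SAX alphabet of size 4 (assumed here).
BREAKPOINTS = (-0.6745, 0.0, 0.6745)
ALPHABET = "abcd"


def znormalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(series)
    std = statistics.pstdev(series)
    if std == 0:
        return [0.0 for _ in series]  # flat contour: no shape information
    return [(x - mean) / std for x in series]


def paa(series, n_segments):
    """Piecewise aggregate approximation: the mean of each of n_segments chunks.

    Assumes len(series) >= n_segments so that no chunk is empty.
    """
    n = len(series)
    return [
        statistics.fmean(series[i * n // n_segments:(i + 1) * n // n_segments])
        for i in range(n_segments)
    ]


def to_symbols(series, n_segments=4):
    """Turn a numeric contour into a short string over ALPHABET."""
    reduced = paa(znormalize(series), n_segments)
    # The symbol index is the number of breakpoints the value lies above.
    return "".join(ALPHABET[sum(v > b for b in BREAKPOINTS)] for v in reduced)


# Hypothetical pitch tracks in Hz: one rising and one falling contour.
rising = [100, 110, 120, 130, 140, 150, 160, 170]
falling = list(reversed(rising))
print(to_symbols(rising))   # abcd
print(to_symbols(falling))  # dcba
```

Once contours are strings, similarity can be computed with cheap discrete measures, which is the kind of cost reduction the abstract refers to.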
Citations: 2
Towards robust cross-linguistic comparisons of phonological networks
Pub Date : 2016-08-01 DOI: 10.18653/v1/W16-2018
Philippa Shoemark, S. Goldwater, James P. Kirby, Rik Sarkar
Recent work has proposed using network science to analyse the structure of the mental lexicon by viewing words as nodes in a phonological network, with edges connecting words that differ by a single phoneme. Comparing the structure of phonological networks across different languages could provide insights into linguistic typology and the cognitive pressures that shape language acquisition, evolution, and processing. However, previous studies have not considered how statistics gathered from these networks are affected by factors such as lexicon size and the distribution of word lengths. We show that these factors can substantially affect the statistics of a phonological network and propose a new method for making more robust comparisons. We then analyse eight languages, finding many commonalities but also some qualitative differences in their lexicon structure.
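The network construction described in this abstract can be sketched as follows. This is an illustrative sketch rather than the authors' code: words are represented as tuples of phoneme symbols, and treating a single insertion or deletion as "differing by a single phoneme" (alongside substitution) is an assumption. The toy lexicon and all names are hypothetical.

```python
from itertools import combinations


def differ_by_one_phoneme(a, b):
    """True if a and b are one substitution, insertion, or deletion apart."""
    if a == b:
        return False
    if len(a) == len(b):
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    short, long_ = (a, b) if len(a) < len(b) else (b, a)
    # Deleting exactly one phoneme from the longer word must yield the shorter.
    return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))


def build_network(lexicon):
    """Map each word (a tuple of phonemes) to the set of its neighbours."""
    network = {word: set() for word in lexicon}
    for a, b in combinations(lexicon, 2):
        if differ_by_one_phoneme(a, b):
            network[a].add(b)
            network[b].add(a)
    return network


# Hypothetical toy lexicon in a rough phonemic transcription.
cat, bat, at = ("k", "æ", "t"), ("b", "æ", "t"), ("æ", "t")
cab, dog = ("k", "æ", "b"), ("d", "ɒ", "g")
network = build_network([cat, bat, at, cab, dog])

for word, neighbours in network.items():
    print("".join(word), "degree:", len(neighbours))
```

Statistics such as the degree of each node depend on how many words the lexicon contains and how long they are, which is why the abstract cautions against comparing raw network statistics across languages.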
Citations: 12
The SIGMORPHON 2016 Shared Task—Morphological Reinflection
Pub Date : 2016-08-01 DOI: 10.18653/v1/W16-2002
Ryan Cotterell, Christo Kirov, John Sylak-Glassman, David Yarowsky, Jason Eisner, Mans Hulden
The 2016 SIGMORPHON Shared Task was devoted to the problem of morphological reinflection. It introduced morphological datasets for 10 languages with diverse typological characteristics. The shared task drew submissions from 9 teams representing 11 institutions, reflecting a variety of approaches to supervised learning of reinflection. For the simplest task, inflection generation from lemmas, the best system averaged 95.56% exact-match accuracy across all languages, ranging from Maltese (88.99%) to Hungarian (99.30%). With the relatively large training datasets provided, recurrent neural network architectures consistently performed best; in fact, there was a significant margin between neural and non-neural approaches. The best neural approach, averaged over all tasks and languages, outperformed the best non-neural one by 13.76% absolute; on individual tasks and languages the gap in accuracy sometimes exceeded 60%. Overall, the results show a strong state of the art, and serve as encouragement for future shared tasks that explore morphological analysis and generation with varying degrees of supervision.
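For readers unfamiliar with the metric, the sketch below shows what exact-match accuracy and an "absolute" gap between two systems mean. The word forms and the two systems are invented toy data, not shared-task output, and the function name is hypothetical.

```python
def exact_match_accuracy(predictions, gold):
    """Percentage of predicted forms that are string-identical to the gold forms."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot score an empty test set")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return 100.0 * correct / len(gold)


# Invented toy data: gold inflected forms and the output of two systems.
gold = ["walked", "ran", "sang", "went"]
system_a = ["walked", "ran", "sang", "goed"]
system_b = ["walked", "runned", "singed", "goed"]

acc_a = exact_match_accuracy(system_a, gold)
acc_b = exact_match_accuracy(system_b, gold)

# An absolute gap is a difference in percentage points, not a ratio.
print(f"A: {acc_a:.2f}%  B: {acc_b:.2f}%  gap: {acc_a - acc_b:.2f} points")
```

A partially correct form such as "goed" earns no credit under this metric, which is why it is a strict measure for inflection generation.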
Citations: 236
Read my points: Effect of animation type when speech-reading from EMA data
Pub Date : 2016-08-01 DOI: 10.18653/v1/W16-2014
Kristy James, Martijn B. Wieling
Three popular vocal-tract animation paradigms were tested for intelligibility when displaying videos of pre-recorded Electromagnetic Articulography (EMA) data in an online experiment. EMA tracks the position of sensors attached to the tongue. The conditions were dots with tails (where only the coil location is presented), 2D animation (where the dots are connected to form 2D representations of the lips, tongue surface and chin), and a 3D model with coil locations driving facial and tongue rigs. The 2D animation (recorded in VisArtico) showed the highest identification of the prompts.
Citations: 1
The Columbia University - New York University Abu Dhabi SIGMORPHON 2016 Morphological Reinflection Shared Task Submission
Pub Date : 2016-08-01 DOI: 10.18653/v1/W16-2011
Dima Taji, R. Eskander, Nizar Habash, Owen Rambow
We present a high-level description and error analysis of the Columbia-NYUAD system for morphological reinflection, which builds on previous work on supervised morphological paradigm completion. Our system improved over the shared task baseline on some of the languages, with an absolute increase of up to 30%. Our ranking on average was 5th in Track 1, 8th in Track 2, and 3rd in Track 3.
Citations: 10
Inferring Morphotactics from Interlinear Glossed Text: Combining Clustering and Precision Grammars
Pub Date : 2016-08-01 DOI: 10.18653/v1/W16-2021
Olga Zamaraeva
In this paper I present a k-means clustering approach to inferring morphological position classes (morphotactics) from Interlinear Glossed Text (IGT), data collections available for some endangered and low-resource languages. While the experiment is not restricted to low-resource languages, they are meant to be the targeted domain. Specifically, my approach is meant for field linguists who do not necessarily know how many position classes there are in the language they work with and what the position classes are, but have the expertise to evaluate different hypotheses. It builds on an existing approach (Wax, 2014), but replaces the core heuristic with a clustering algorithm. The results mainly illustrate two points. First, they are largely negative, which shows that the baseline algorithm (summarized in the paper) uses a very predictive feature to determine whether affixes belong to the same position class, namely edge overlap in the affix graph. At the same time, unlike the baseline method that relies entirely on a single feature, k-means clustering can account for different features and helps discover more morphological phenomena, e.g. circumfixation. I conclude that unsupervised learning algorithms such as k-means clustering can in principle be used for morphotactics inference, though the algorithm should probably weigh certain features more than others. Most importantly, I conclude that clustering is a promising approach for diverse morphotactics and as such it can facilitate linguistic analysis of field languages.
Citations: 10
Morphological reinflection with conditional random fields and unsupervised features
Pub Date : 2016-08-01 DOI: 10.18653/v1/W16-2006
L. Liu, L. Mao
This paper describes our participation in the SIGMORPHON 2016 shared task on morphological reinflection. In the task, we use a linear-chain conditional random field model to learn to map sequences of input characters to sequences of output characters and focus on developing features that are useful for predicting inflectional behavior. Since the training data in the task is limited, we also generalize the training data by extracting, in an unsupervised fashion, the types of consonant-vowel sequences that trigger inflectional behavior, and by extending the available training data through inference of unlabeled morphosyntactic descriptions.
Citations: 15
Predicting the Direction of Derivation in English Conversion
Pub Date : 2016-08-01 DOI: 10.18653/v1/W16-2015
M. Kisselew, Laura Rimell, Alexis Palmer, Sebastian Padó
Conversion is a word formation operation that changes the grammatical category of a word in the absence of overt morphology. Conversion is extremely productive in English (e.g., tunnel, talk). This paper investigates whether distributional information can be used to predict the diachronic direction of conversion for homophonous noun-verb pairs. We aim to predict, for example, that tunnel was used as a noun prior to its use as a verb. We test two hypotheses: (1) that derived forms are less frequent than their bases, and (2) that derived forms are more semantically specific than their bases, as approximated by information-theoretic measures. We find that hypothesis (1) holds for N-to-V conversion, while hypothesis (2) holds for V-to-N conversion. We achieve the best overall account of the historical data by taking both frequency and semantic specificity into account. These results provide a new perspective on linguistic theories regarding the semantic specificity of derivational morphemes, and on the morphosyntactic status of conversion.
Citations: 29
EHU at the SIGMORPHON 2016 Shared Task. A Simple Proposal: Grapheme-to-Phoneme for Inflection
Pub Date : 2016-08-01 DOI: 10.18653/v1/W16-2004
I. Alegria, Izaskun Etxeberria
This paper presents a proposal for learning morphological inflections by a grapheme-to-phoneme learning model. No special processing is used for specific languages. The starting point has been our previous research on induction of phonology and morphology for normalization of historical texts. The results show that a very simple method can indeed improve upon some baselines, but does not reach the accuracies of the best systems in the task.
Citations: 10
Three Correlates of the Typological Frequency of Quantity-Insensitive Stress Systems
Pub Date : 2008-06-19 DOI: 10.3115/1626324.1626330
Max Bane, Jason Riggle
We examine the typology of quantity-insensitive (QI) stress systems and ask to what extent an existing optimality theoretic model of QI stress can predict the observed typological frequencies of stress patterns. We find three significant correlates of pattern attestation and frequency: the trigram entropy of a pattern, the degree to which it is "confusable" with other patterns predicted by the model, and the number of constraint rankings that specify the pattern.
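The abstract does not define "trigram entropy", so the following is only one plausible reading, offered as a sketch: the Shannon entropy of the distribution of symbol trigrams across the stress strings that a pattern assigns to words of different lengths. The encoding (`1` for a stressed syllable, `0` for an unstressed one, `#` for a word boundary), the function name, and the example forms are all assumptions.

```python
import math
from collections import Counter


def trigram_entropy(forms):
    """Shannon entropy in bits of the trigram distribution over padded forms."""
    counts = Counter()
    for form in forms:
        padded = "#" + form + "#"  # mark the word boundaries
        for i in range(len(padded) - 2):
            counts[padded[i:i + 3]] += 1
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical encodings of two QI stress patterns for words of 2 to 5 syllables.
initial_stress = ["10", "100", "1000", "10000"]
alternating = ["10", "101", "1010", "10101"]

print("initial stress:", trigram_entropy(initial_stress))
print("alternating:   ", trigram_entropy(alternating))
```

Under this reading, a pattern whose forms reuse a small set of local configurations receives a lower entropy than one whose forms are locally more varied.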
Citations: 17