Concordance line sorting in The Prime Machine
Author: Stephen Jeaco
Journal: International Journal of Corpus Linguistics (JCR category: Language & Linguistics)
Publication type: Journal Article
Published: 2021-04-07
DOI: https://doi.org/10.1075/IJCL.18056.JEA
Citations: 0
Abstract
Corpus data provide evidence of the patterning of language, and one way word usage can be analysed is through the study of concordance lines. While popular concordancers provide different sorting methods, they are typically only able to display lines in the order in which they occur in the corpus, randomly, or alphabetically by words in slots to the left or right of the word of interest. Less sophisticated users may find recognising patterns from these orderings quite challenging. This paper considers possible needs of language learners in terms of concordance ranking and introduces two methods which have been adopted and developed for The Prime Machine. The first method uses repeated patterns, measuring the number of matches made with other lines in the set. The second method incorporates collocation scores, providing examples with strong collocations from the entire corpus at the top of sampled concordance lines.
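The two ranking ideas named in the abstract can be illustrated with a minimal sketch. This is not the algorithm used in The Prime Machine, whose details the abstract does not give. It assumes a concordance line is a `(tokens, node_index)` pair, that a "match" means the same word in the same slot relative to the node word, and that collocation scores arrive as a precomputed word-to-score dictionary. The function names, the `window` parameter, and the sample data are all hypothetical.

```python
from collections import Counter


def slot_items(tokens, node_index, window=2):
    """Return (offset, word) pairs for words within `window` slots of the node."""
    items = []
    for offset in range(-window, window + 1):
        pos = node_index + offset
        if offset != 0 and 0 <= pos < len(tokens):
            items.append((offset, tokens[pos].lower()))
    return items


def rank_by_repeated_patterns(lines, window=2):
    """Score each line by how many slot/word matches it shares with other lines.

    Assumption: a match is the same word at the same offset from the node.
    Returns (score, tokens) pairs, highest score first.
    """
    counts = Counter()
    for tokens, node in lines:
        counts.update(slot_items(tokens, node, window))
    scored = []
    for tokens, node in lines:
        # Subtract 1 per item so a line is not counted as matching itself.
        score = sum(counts[item] - 1 for item in slot_items(tokens, node, window))
        scored.append((score, tokens))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


def rank_by_collocation(lines, collocation_scores, window=2):
    """Order lines by the strongest collocate found near the node word.

    Assumption: `collocation_scores` maps a lowercased word to a score
    computed elsewhere over the whole corpus (hypothetical input).
    """
    def best(line):
        tokens, node = line
        return max(
            (collocation_scores.get(word, 0.0)
             for _, word in slot_items(tokens, node, window)),
            default=0.0,
        )
    return sorted(lines, key=best, reverse=True)


# Hypothetical sample: four concordance lines for the node word "tea".
lines = [
    ("she made a cup of tea for him".split(), 5),
    ("a nice cup of tea please".split(), 4),
    ("the tea was cold".split(), 1),
    ("green tea is popular here".split(), 1),
]
by_pattern = rank_by_repeated_patterns(lines)
by_collocation = rank_by_collocation(lines, {"green": 5.0, "cup": 3.0})
```

In this sketch the two "cup of tea" lines rise to the top under pattern ranking because they share two slot/word pairs, while collocation ranking promotes whichever line contains the highest-scoring collocate, regardless of how often that pattern recurs in the sampled set.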
About the journal:
The International Journal of Corpus Linguistics (IJCL) publishes original research covering methodological, applied and theoretical work in any area of corpus linguistics. Through its focus on empirical language research, IJCL provides a forum for the presentation of new findings and innovative approaches in any area of linguistics (e.g. lexicology, grammar, discourse analysis, stylistics, sociolinguistics, morphology, contrastive linguistics), applied linguistics (e.g. language teaching, forensic linguistics), and translation studies. Based on its interest in corpus methodology, IJCL also invites contributions on the interface between corpus and computational linguistics.