Bruna Sommer-Farias, A. Novikov, Adriana Picoral, Mariana Centanin-Bertho, S. Staples
This article provides a detailed account of the framework and the pedagogical and research applications of the Multilingual Academic Corpus of Assignments – Writing and Speech (MACAWS). MACAWS is a monitor learner corpus of written and oral assignments produced by foreign language learners in their language classrooms. The corpus currently focuses on two less commonly taught languages that are rarely represented in learner corpora, Portuguese and Russian. It contains 124,054 words in Russian and 536,168 in Portuguese and is updated each semester as new texts are added. The online interface is designed for ease of use by teachers and students. Our novel interactive data-driven learning (iDDL) tool allows concordance lines to be embedded into websites and learning management systems (LMS), facilitating student interaction with them. Researchers can gain access to an offline corpus for greater flexibility.
{"title":"A multilingual learner corpus for less commonly taught languages","authors":"Bruna Sommer-Farias, A. Novikov, Adriana Picoral, Mariana Centanin-Bertho, S. Staples","doi":"10.1075/ijlcr.21001.som","DOIUrl":"https://doi.org/10.1075/ijlcr.21001.som","journal":{"name":"International Journal of Learner Corpus Research"},"PeriodicalIF":1.1,"publicationDate":"2022-12-31","publicationTypes":"Journal Article"}
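The concordance lines mentioned in the abstract above can be illustrated with a minimal keyword-in-context (KWIC) sketch. This is not the MACAWS or iDDL implementation, which the abstract does not describe. The function name `kwic`, the `width` parameter, and the two Portuguese sample sentences are all hypothetical and invented for illustration; they are not corpus data.

```python
import re

def kwic(texts, keyword, width=3):
    """Return (left context, node, right context) tuples for each keyword hit.

    Hypothetical helper: tokenises on word characters, lowercases everything,
    and keeps up to `width` tokens on each side of the node word.
    """
    node = keyword.lower()
    lines = []
    for text in texts:
        tokens = re.findall(r"\w+", text.lower())
        for i, tok in enumerate(tokens):
            if tok == node:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                lines.append((left, tok, right))
    return lines

# Invented sample sentences, not taken from MACAWS.
sample_texts = [
    "Eu gosto de estudar português.",
    "Nós gostamos de música, mas eu gosto de cinema também.",
]

concordance = kwic(sample_texts, "gosto")
for left, node, right in concordance:
    # Right-align the left context so the node words line up in a column.
    print(f"{left:>20}  [{node}]  {right}")
```

Aligning the node word in a central column is the conventional concordance layout. A real tool would also need language-aware tokenisation and links back to the source text.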
Lexical bundles are frequently recurring word sequences (e.g. "as can be seen") that function as building blocks of discourse. This corpus-based study examined the use of four-word lexical bundles in business emails written by three groups of writers: intermediate business English learners, advanced business English learners, and working professionals. The prominent structural and functional characteristics of the lexical bundles used in these emails were identified and compared across the three groups. The results showed that lexical bundles were related to the extent to which formality and politeness were expressed in written business communications. The advanced business English learners and the working professionals used more lexical bundles with structural and functional characteristics typical of written conventions than did the intermediate business English learners. Both the intermediate and the advanced learner groups used functionally different lexical bundles from those produced by the working professionals.
{"title":"“Please let me know”","authors":"Detong Xia, Haiyang Ai, Hye K. Pae","doi":"10.1075/ijlcr.20019.xia","DOIUrl":"https://doi.org/10.1075/ijlcr.20019.xia","journal":{"name":"International Journal of Learner Corpus Research"},"PeriodicalIF":1.1,"publicationDate":"2022-03-08","publicationTypes":"Journal Article"}
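The notion of a four-word lexical bundle in the abstract above can be sketched as a frequency count over four-word sequences. This is only an illustration, not the study's procedure. The thresholds `min_freq` and `min_texts` are assumed values chosen for the toy data, since the abstract does not report the cut-offs used, and the sample emails are invented.

```python
from collections import Counter
import re

def four_word_bundles(texts, min_freq=2, min_texts=2):
    """Count four-word sequences and keep those that recur.

    min_freq and min_texts are hypothetical thresholds: a sequence must occur
    at least min_freq times overall and appear in at least min_texts texts.
    Punctuation is dropped, so sequences may span sentence boundaries.
    """
    freq = Counter()        # total occurrences of each four-word sequence
    text_range = Counter()  # number of texts each sequence appears in
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        grams = [" ".join(tokens[i:i + 4]) for i in range(len(tokens) - 3)]
        freq.update(grams)
        text_range.update(set(grams))
    return {
        gram: count
        for gram, count in freq.items()
        if count >= min_freq and text_range[gram] >= min_texts
    }

# Invented sample emails, not data from the study.
emails = [
    "Please let me know if you have any questions.",
    "Please let me know if the time works for you.",
    "Let me know if you need anything else.",
]

bundles = four_word_bundles(emails)
for gram, count in sorted(bundles.items(), key=lambda item: -item[1]):
    print(count, gram)
```

Requiring a sequence to appear in several texts as well as several times overall guards against bundles that reflect one writer's habits. Studies of this kind typically normalise frequencies per million words, which this toy example omits.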
In this study, we apply Gries and Deshors’s (2014) and Deshors and Gries’s (2016) MuPDAR(F) approach to explore the use of the synonymous adjectives tärkeä (i.e. “important”) and keskeinen (i.e. “central”) in academic native and advanced learner Finnish, linking the phenomenon with the general assumptions of usage-based cognitive linguistics. This method confidently modelled the differences in the use of near-synonyms in native data and distinguished between native-like and non-native-like uses in learner data. Crucially, it differentiated between the contexts in which one synonym was clearly favoured and those in which either one was acceptable, in accordance with Gries and Deshors (2020). The results suggest that learners of Finnish follow the tendencies of native speakers fairly coherently, but several variables differentiate their use of synonyms from the latter’s. We interpret the differences as reflecting complexity- and prototypicality-related phenomena. On the one hand, learners use more common options more often. On the other, non-native-like adjectives are used only in contexts that are structurally in the most prototypical and least complex form, suggesting that learners employ complexity-related structural alternations (e.g., non-prototypical grammatical subjects or degree modifiers) after lexical alternations.
{"title":"The use of synonymous adjectives by learners of Finnish as a second language","authors":"Niina Kekki, I. Ivaska","doi":"10.1075/ijlcr.21006.kek","DOIUrl":"https://doi.org/10.1075/ijlcr.21006.kek","journal":{"name":"International Journal of Learner Corpus Research"},"PeriodicalIF":1.1,"publicationDate":"2022-03-08","publicationTypes":"Journal Article"}
{"title":"Review of Le Bruyn & Paquot (2021): Learner Corpus Research Meets Second Language Acquisition","authors":"Kevin McManus","doi":"10.1075/ijlcr.00027.mcm","DOIUrl":"https://doi.org/10.1075/ijlcr.00027.mcm","journal":{"name":"International Journal of Learner Corpus Research"},"PeriodicalIF":1.1,"publicationDate":"2022-03-08","publicationTypes":"Journal Article"}
A. Glaznieks, Jennifer-Carmen Frey, Maria Stopfner, L. Zanasi, Lionel Nicolas
This article presents the longitudinal trilingual corpus of young learners of Italian, German and English called LEONIDE. The corpus consists of L1, L2 and L3 learner texts. L1 texts were written in two languages of schooling (i.e. Italian and German), L2 texts in two languages learned as second languages (i.e. German and Italian), and L3 texts in an additional foreign language (i.e. English). All texts were collected from a group of lower secondary school pupils from the multilingual Italian province of South Tyrol whose development in all three languages was observed over a period of three years. Each text comes with rich metadata as well as manual and automatic annotations.
{"title":"Leonide","authors":"A. Glaznieks, Jennifer-Carmen Frey, Maria Stopfner, L. Zanasi, Lionel Nicolas","doi":"10.1075/ijlcr.21004.gla","DOIUrl":"https://doi.org/10.1075/ijlcr.21004.gla","journal":{"name":"International Journal of Learner Corpus Research"},"PeriodicalIF":1.1,"publicationDate":"2022-03-08","publicationTypes":"Journal Article"}
This report introduces the University of Pittsburgh English Language Institute Corpus (PELIC; Juffs et al., 2020), a publicly available 4.2-million-word learner corpus of written texts. Collected over seven years in the University of Pittsburgh’s Intensive English Program, these texts were produced by more than 1,100 students with diverse linguistic backgrounds and proficiency levels. Unlike most learner corpora, which are cross-sectional, PELIC is longitudinal, offering greater opportunities for tracking development in a natural classroom setting. This potential is illustrated in an overview of the research conducted to date with these data. The report also describes PELIC’s creation and contents, including how the texts have been managed to facilitate natural language processing. Overall, the corpus contributes to the field of learner corpus research by adding to the pool of freely and publicly available learner corpora, and it is supplemented by a set of Python tools and tutorials for accessing these data.
{"title":"The University of Pittsburgh English Language Institute Corpus (PELIC)","authors":"Ben Naismith, Na-Rae Han, Alan Juffs","doi":"10.1075/ijlcr.21002.nai","DOIUrl":"https://doi.org/10.1075/ijlcr.21002.nai","journal":{"name":"International Journal of Learner Corpus Research"},"PeriodicalIF":1.1,"publicationDate":"2022-03-08","publicationTypes":"Journal Article"}
{"title":"Review of Lu (2017): A Corpus Study of Collocation in Chinese Learner English","authors":"Luciana Forti","doi":"10.1075/ijlcr.00026.for","DOIUrl":"https://doi.org/10.1075/ijlcr.00026.for","journal":{"name":"International Journal of Learner Corpus Research"},"PeriodicalIF":1.1,"publicationDate":"2022-03-08","publicationTypes":"Journal Article"}
English as a foreign language (EFL) textbooks typically present a prescriptive typology of three or four conditional types. We examine the extent to which this long-established English Language Teaching (ELT) typology is reflected in four varieties of English by comparing the forms and functions of four samples of 620 if-conditionals from French school EFL textbooks (TEC-Fr), French L1 Learner English (OpenCLC-Fr), Web English (EnTenTen15-S) and British English (BNC-S). The ELT typology accounts for considerably less than half of if-sentences in the reference data. Even in the EFL textbooks, only 57% of if-conditionals match the typology explicitly taught in their grammar sections. For many formal and functional features, the learner data sits halfway between the distributions of the textbook and reference data. We conclude that the ELT typology needs to be adapted to provide a more representative account of if-conditionals that focuses on L1 and L2 usage and meaning over form.
{"title":"Testing the pedagogical norm","authors":"Tatjana Winter, Elen Le Foll","doi":"10.1075/ijlcr.20021.win","DOIUrl":"https://doi.org/10.1075/ijlcr.20021.win","journal":{"name":"International Journal of Learner Corpus Research"},"PeriodicalIF":1.1,"publicationDate":"2022-03-08","publicationTypes":"Journal Article"}
{"title":"Review of Leńko-Szymańska (2020): Defining and Assessing Lexical Proficiency","authors":"Philip Durrant","doi":"10.1075/ijlcr.00025.dur","DOIUrl":"https://doi.org/10.1075/ijlcr.00025.dur","journal":{"name":"International Journal of Learner Corpus Research"},"PeriodicalIF":1.1,"publicationDate":"2022-03-08","publicationTypes":"Journal Article"}
{"title":"Review of Abel, Glaznieks, Lyding & Nicolas (2019): Widening the scope of learner corpus research: Selected papers from the fourth Learner Corpus Research Conference","authors":"Agnieszka Leńko-Szymańska","doi":"10.1075/ijlcr.00023.len","DOIUrl":"https://doi.org/10.1075/ijlcr.00023.len","journal":{"name":"International Journal of Learner Corpus Research"},"PeriodicalIF":1.1,"publicationDate":"2021-10-11","publicationTypes":"Journal Article"}