
Corpora: Latest Articles

Introducing the Swedish Learner English Corpus: a corpus that enables investigations of the impact of extramural activities on L2 writing
IF 0.5 Q1 Arts and Humanities Pub Date : 2024-04-01 DOI: 10.3366/cor.2024.0296
Henrik Kaatari, Ying Wang, Tove Larsson
This paper introduces the Swedish Learner English Corpus (SLEC), which consists of argumentative texts in English that are written by Swedish junior and senior high school students. SLEC includes rich metadata, enabling empirical studies of various extra-linguistic variables. Most noteworthy is the inclusion of detailed information on students’ extramural English activities (EE), such as reading, watching, conversing, gaming and engaging in social media in English. In addition, a sub-set of texts from SLEC have been assessed for proficiency using the Common European Framework of Reference for Languages (CEFR). This paper provides an overview of the corpus compilation process, the metadata, and the available versions of SLEC. Researchers, teachers and students can access this resource to investigate various aspects of second language use and development, such as the impact of extramural language activities on linguistic complexity.
Citations: 1
Introducing the Single Player Offline Game Corpus (SPOC): a corpus of seven registers from digital role-playing games
IF 0.5 Q1 Arts and Humanities Pub Date : 2024-04-01 DOI: 10.3366/cor.2024.0300
Daniel H. Dixon
This paper describes the compilation and design of the Single Player Offline Game Corpus (SPOC), which is being made freely available for research and educational purposes. The SPOC was compiled by extracting the localisation files from the digital directories of four popular commercial digital role-playing games: Divinity: Original Sin II, Fallout 4, The Elder Scrolls V: Skyrim, and The Witcher 3: Wild Hunt. The 3.7-million-word corpus contains more than 30,000 texts and is unique compared with other game corpora in that it has the following three characteristics: (1) the texts are categorised into seven registers using Biber and Conrad’s (2019) register framework, (2) texts are systematically parsed into the smallest meaningful units of observation, and (3) all texts were compiled from the data files of the games themselves. Nearly all language use in the four games is accounted for and parsed into register categories based on their underlying situational characteristics – in particular, the communicative purposes and the associated contexts in which the texts appear in the games.
Citations: 0
The Video Game Dialogue Corpus
IF 0.5 Q1 Arts and Humanities Pub Date : 2024-04-01 DOI: 10.3366/cor.2024.0299
Stephanie Rennick, Seán Roberts
This paper presents the Video Game Dialogue Corpus, the first large-scale, consistently coded, open source corpus of dialogue from video games. It contains over 6.2 million words of English dialogue from fifty games in the Role Playing Game (RPG) genre. This includes games produced between 1985 and 2020, rated for children, teenagers and adults, and in both ‘Western’ and ‘Japanese’ sub-genres. The corpus design is described, including custom data formats for representing branching dialogue. We demonstrate the use of the corpus by comparing the dialogue of female and male characters, where we find reflections of gendered language in other media as well as patterns that seem specific to video games. We provide the source code for a ‘self-inflating corpus’ – a pipeline that obtains the data then processes and parses it into a standard format. This makes the corpus available for teaching and research purposes, providing the first such resource for empirical analysis of video game dialogue.
Citations: 2
Developing a multimodal corpus of L2 academic English from an English medium of instruction university in China
IF 0.5 Q1 Arts and Humanities Pub Date : 2024-04-01 DOI: 10.3366/cor.2024.0295
Yu-Hua Chen, Simon Harrison, Michaël Stevens, Qianqian Zhou
This paper describes the rationale for and design of a new multimodal corpus of L2 academic English from a Sino-British university in China: the Corpus of Chinese Academic Written and Spoken English (CAWSE). The unique context for this corpus provides language samples from Chinese students who use English as a second language (L2) in a preliminary-year programme, which prepares students for academic studies at university level, at a campus where English is used as the Medium of Instruction (EMI). Data were collected from a variety of settings, including written (i.e., exam scripts and essays) and spoken assessments (i.e., interviews and presentations), covering the full range of grades awarded to those language samples, as well as from student group interactions during teaching and learning activities. The multimodal nature of the corpus is realised through the availability of selected audio/video recordings accompanied by the orthographically transcribed text. This open-access corpus is designed to help shed light on Chinese students' academic L2 English language use in a variety of written, spoken and multimodal discourses.
Citations: 0
Review: Barth and Schnell. 2022. Understanding Corpus Linguistics. New York: Routledge
IF 0.5 Q1 Arts and Humanities Pub Date : 2024-04-01 DOI: 10.3366/cor.2024.0302
Mohsen Shirazizadeh, Narges Moeini
Citations: 0
Exploring part of speech (POS) tag sequences in a large-scale learner corpus of L2 English: a developmental perspective
IF 0.5 Q1 Arts and Humanities Pub Date : 2024-04-01 DOI: 10.3366/cor.2024.0297
Joyce Dong Ok Lim, Geraldine Mark, P. Pérez-Paredes, Anne O’Keeffe
This research explores the POS tag sequences that shape the transition from upper intermediate (B2 CEFR) to near-native proficiency (C2 CEFR) in a corpus of essays (n=32,410) from the Cambridge Learner Corpus. Gilquin (2018) and others have shown that POS tag sequences offer a holistic approach to extracting the most commonly used patterns without a starting point of an a priori set of words and word sequences. Using corpus linguistics informed by usage-based theories of language learning, this paper examines the frequency and distribution of 4-slot POS-tag sequences in L2 English writing, drawing on the taxonomy of pattern grammar (Francis et al., 1996, 1998; and Hunston and Francis, 2000). Findings point to the presence of both core and emergent POS-tag sequences in learner language in the two proficiency levels analysed. These sequences point to the presence of dynamic language restructuring processes as learners become more proficient and re-evaluate their understanding of frequency and distribution in English. This paper shows evidence of how language competence increases with proficiency. The research offers new evidence in our understanding of the development of L2 writing in EFL contexts.
Citations: 1
Triangulating visual and textual corpus-assisted discourse analysis to study social actor representations: the case of Saudi women in the British and Saudi news media
IF 0.5 Q1 Arts and Humanities Pub Date : 2024-04-01 DOI: 10.3366/cor.2024.0298
Dina Sibai, Sylvia Jaworska
Investigations of social actor representations across media present a large and important body of research in corpus-assisted discourse studies (CADS). However, most studies focus exclusively on one mode, the text, whilst other modes of communication (for example, visuals) are either considered partially or not at all. Whilst insights from textual analyses are invaluable in revealing salient and nuanced patterns of social actor representations in the media, visual accompaniments can reinforce particular ‘angles’, creating lasting perceptions for readers and viewers. Though some approaches exist to study considerable numbers of images, visual media data can be complex, rendering them difficult to study alongside textual CADS. This paper uses a triangulation of visual and textual CADS analysis to explore social actor representations in media texts and images. It does so by focussing on the representations of Saudi women in the UK and Saudi news media within the context of evolving women’s rights in Saudi Arabia. The study shows how such triangulation can be conducted in a doable and systematic way and how it can enrich CADS research on discursive representations of social actors across contexts.
Citations: 0
Review: Islentyeva. 2020. Corpus-based Analysis of Ideological Bias: Migration in the British Press. London: Routledge
IF 0.5 Q1 Arts and Humanities Pub Date : 2023-08-01 DOI: 10.3366/cor.2023.0285
A. Black
Citations: 0
Twenty-first century ideological discourses about US migrant education that transcend registers
IF 0.5 Q1 Arts and Humanities Pub Date : 2023-08-01 DOI: 10.3366/cor.2023.0280
Shannon Fitzsimmons‐Doolan
Widely distributed and often repeated discursive patterns which represent migrants can influence the education of migrant students (Calavita, 1996; Santa Ana, 2002; Cutler, 2017; and Dabach et al., 2017). Ideological discourses (e.g., ‘immigrants are threats’) are particularly potent structures that mediate language, cognition and social life. Whilst there has been a recent increase in studies of texts on the topic of migration generally, there are few that focus on the intersection of migration and education or on discursive patterns that transcend registers. This study introduces a multi-dimensional analysis approach for the identification of ideological discourses from a 9-million-word corpus of twenty-first century US texts about migrant education from multiple registers (online comments, national and regional newspaper texts, and federal and state government webpages) using the distribution of lexical variables that characterise variants of migrant/migration. Eleven ideological discourses (e.g., ‘US immigration policies are problematic, but there is no consensus for solutions’) were found. Of these, several had not been previously identified, one confirmed a previously identified discourse, and several complemented and extended previously identified discursive patterns on this topic. Together, these findings reveal the highly naturalised ideologically discursive landscape that shapes educational opportunities for US migrant students.
Citations: 0
Towards increased reliability and transparency in projects with manual linguistic coding
IF 0.5 Q1 Arts and Humanities Pub Date : 2023-08-01 DOI: 10.3366/cor.2023.0284
Nicole Hober, Tülay Dixon, Tove Larsson
Manually coded data form the basis of many of our analyses in corpus linguistics. It is thus imperative that we work towards increased reliability and enhanced transparency in our coding practices, since failing to do so may ultimately lead us to draw erroneous conclusions about language. Using spoken data from a study on adverb usage for illustration, this methods paper discusses some strategies for identifying threats to the reliability of our coding and offers suggestions for how to mitigate these and ensure that our coding can be assessed and replicated. The paper also includes suggestions for best practices for manual linguistic coding and concludes with a discussion of the benefits of such practices. With this paper, we expand on the ongoing discussions in the field on issues of reliability and transparency as they relate to manual coding. We argue that while tests of inter-rater reliability offer a helpful starting point, further steps are needed to ensure increased reliability and transparency.
Citations: 1