The Video Game Dialogue Corpus

Corpora (IF 0.8, JCR Q3, Linguistics) | Pub Date: 2024-04-01 | DOI: 10.3366/cor.2024.0299
Stephanie Rennick, Seán Roberts
Citations: 2

Abstract

This paper presents the Video Game Dialogue Corpus, the first large-scale, consistently coded, open-source corpus of dialogue from video games. It contains over 6.2 million words of English dialogue from fifty games in the Role Playing Game (RPG) genre. This includes games produced between 1985 and 2020, rated for children, teenagers and adults, and in both ‘Western’ and ‘Japanese’ sub-genres. The corpus design is described, including custom data formats for representing branching dialogue. We demonstrate the use of the corpus by comparing the dialogue of female and male characters, where we find reflections of gendered language in other media as well as patterns that seem specific to video games. We provide the source code for a ‘self-inflating corpus’ – a pipeline that obtains the data, then processes and parses it into a standard format. This makes the corpus available for teaching and research purposes, providing the first such resource for empirical analysis of video game dialogue.
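To make the idea of a data format for branching dialogue concrete, here is a minimal Python sketch. The node layout (the `speaker`, `line` and `choice` keys), the sample dialogue, and the helper names `iter_lines` and `words_per_speaker` are all assumptions made for illustration; they are not the corpus's documented schema or the authors' pipeline. The sketch only shows one way a player choice can be stored as nested branches and then traversed to count words per character.

```python
from collections import Counter

# Hypothetical format (an assumption, not the corpus's actual schema).
# A dialogue is a list of nodes; each node is either
#   {"speaker": str, "line": str}      a single spoken line, or
#   {"choice": [branch, branch, ...]}  a player choice, where each
#                                      branch is itself a list of nodes.
dialogue = [
    {"speaker": "Guard", "line": "Halt! Who goes there?"},
    {"choice": [
        [{"speaker": "Player", "line": "A friend."},
         {"speaker": "Guard", "line": "Then pass, friend."}],
        [{"speaker": "Player", "line": "None of your business."},
         {"speaker": "Guard", "line": "Wrong answer."}],
    ]},
    {"speaker": "Guard", "line": "Move along."},
]


def iter_lines(nodes):
    """Yield (speaker, line) pairs from every branch, depth-first."""
    for node in nodes:
        if "choice" in node:
            for branch in node["choice"]:
                yield from iter_lines(branch)
        else:
            yield node["speaker"], node["line"]


def words_per_speaker(nodes):
    """Count whitespace-separated words spoken by each character."""
    counts = Counter()
    for speaker, line in iter_lines(nodes):
        counts[speaker] += len(line.split())
    return counts


counts = words_per_speaker(dialogue)
for speaker, n in sorted(counts.items()):
    print(speaker, n)
```

Because every branch is visited, the counts describe all dialogue available in the game rather than the lines seen in a single playthrough. A comparison by character gender could be built the same way by grouping the counts through a character-to-gender mapping, which is left out here.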
Source journal: Corpora (Linguistics)
CiteScore: 1.70
Self-citation rate: 0.00%
Articles published: 20