Introducing the Single Player Offline Game Corpus (spoc): a corpus of seven registers from digital role-playing games
Daniel H. Dixon
Corpora (journal article), published 2024-04-01
DOI: 10.3366/cor.2024.0300 (https://doi.org/10.3366/cor.2024.0300)
Citations: 0
Abstract
This paper describes the compilation and design of the Single Player Offline Game Corpus (spoc), which is being made freely available for research and educational purposes. The spoc was compiled by extracting the localisation files from the digital directories of four popular commercial digital role-playing games: Divinity: Original Sin II, Fallout 4, The Elder Scrolls V: Skyrim, and The Witcher 3: Wild Hunt. The 3.7-million-word corpus contains more than 30,000 texts and differs from other game corpora in three respects: (1) the texts are categorised into seven registers using Biber and Conrad’s (2019) register framework, (2) the texts are systematically parsed into the smallest meaningful units of observation, and (3) all texts were compiled from the data files of the games themselves. Nearly all language use in the four games is accounted for and parsed into register categories based on the texts’ underlying situational characteristics – in particular, their communicative purposes and the contexts in which they appear in the games.