Corpus linguistics will benefit from greater adoption of pre-registration: A novice-friendly split-corpus approach to pre-registration

Matthew H.C. Mak
Applied Corpus Linguistics, Volume 4, Issue 3, Article 100111. Published 2024-10-11.
DOI: 10.1016/j.acorp.2024.100111
https://www.sciencedirect.com/science/article/pii/S2666799124000285

Abstract

In this brief article, I contend that the field of corpus linguistics stands to gain significantly from an increased adoption of pre-registration. Pre-registration serves to constrain the almost infinite degree of analytic freedom inherent in corpus analysis, thereby enhancing the transparency, reliability, and potential impact of corpus research. While pre-registration is increasingly popular in fields such as psychology and medicine, its uptake in corpus linguistics remains notably limited. To facilitate the transition toward pre-registration, I describe a straightforward split-corpus approach, ideally suited for corpus linguists new to pre-registration and for both hypothesis-testing and exploratory research. This method involves dividing a corpus into an exploratory set (20–40% of the corpus) and a confirmatory set (the remaining 60–80%). The exploratory set allows researchers to freely generate hypotheses and develop analysis plans, while the confirmatory set is then used for a more structured and objective analysis according to the pre-specified protocols. By employing this approach, corpus linguists can effectively balance exploratory flexibility with the rigour of confirmatory analysis, boosting the reliability of corpus findings. An increased uptake of pre-registration may not only bolster recognition of corpus linguistics as a robust empirical field, but it may also encourage a stronger emphasis on the building of cumulative knowledge.
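
The split described in the abstract could be sketched roughly as follows. This is a minimal illustration and not the author's procedure: the abstract gives only the proportions, so random assignment at the document level, the fixed seed, the function name `split_corpus`, and the placeholder document IDs are all assumptions made for this example.

```python
import random


def split_corpus(doc_ids, exploratory_fraction=0.3, seed=2024):
    """Randomly partition document IDs into exploratory and confirmatory sets.

    Assumption: documents are the unit of assignment. The abstract specifies
    only the proportions (20-40% exploratory, 60-80% confirmatory).
    """
    if not 0.2 <= exploratory_fraction <= 0.4:
        raise ValueError("exploratory_fraction should lie in the 20-40% range")
    ids = sorted(doc_ids)      # sort first so the split does not depend on input order
    rng = random.Random(seed)  # fixed (hypothetical) seed makes the split reproducible
    rng.shuffle(ids)
    n_exploratory = round(len(ids) * exploratory_fraction)
    return ids[:n_exploratory], ids[n_exploratory:]


# Hypothetical corpus of 100 documents, identified by placeholder IDs.
corpus = [f"doc_{i:03d}" for i in range(100)]
exploratory, confirmatory = split_corpus(corpus)
print(len(exploratory), len(confirmatory))  # prints "30 70"
```

Under these assumptions, hypotheses and analysis plans would be developed on `exploratory` only, and `confirmatory` would stay untouched until the plan is registered. Recording the seed and the list of IDs alongside the pre-registration would let others reproduce the same partition.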