
Latest articles in Research Methods in Applied Linguistics

Stop splitting hairs: The problems with dichotomizing continuous data in language research
Pub Date : 2025-10-14 DOI: 10.1016/j.rmal.2025.100272
Shawn Hemelstrand, Tomohiro Inoue
It is common in the language sciences to dichotomize continuous data in order to fit models to data. However, statisticians and methodologists have warned against this practice for years, and many in the language sciences still seem unaware of the problem. Because the language science literature lacks modern, robust, and open data simulations on this issue, this article provides an empirical investigation of the practice. Across three different simulations, our analysis shows that dichotomization almost universally inflates standard errors and consequently produces inaccurate tests of statistical significance. Furthermore, effect sizes such as R2 are often diminished by the reduction of available information in the data. We conclude by providing suggestions and considerations for future empirical studies.
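The core result is easy to reproduce. The sketch below is not the authors' code but a minimal illustration with simulated data: it median-splits a continuous predictor and compares the R2 of a simple least-squares regression before and after the split.

```python
import numpy as np

rng = np.random.default_rng(42)

def r_squared(x, y):
    """R^2 from a least-squares regression of y on x (with intercept)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

n, n_sims = 200, 500
r2_cont, r2_dich = [], []
for _ in range(n_sims):
    x = rng.normal(size=n)                        # continuous predictor
    y = 0.5 * x + rng.normal(size=n)              # true linear relation
    x_split = (x > np.median(x)).astype(float)    # median split
    r2_cont.append(r_squared(x, y))
    r2_dich.append(r_squared(x_split, y))

print(f"mean R2, continuous predictor:   {np.mean(r2_cont):.3f}")
print(f"mean R2, median-split predictor: {np.mean(r2_dich):.3f}")
```

Under this setup the median-split predictor explains noticeably less variance than the continuous one, in line with the loss-of-information argument the abstract summarizes.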
Citations: 0
Applying the confirmatory bifactor modeling and omega-family indices: The case of the foreign language classroom anxiety scale
Pub Date : 2025-10-10 DOI: 10.1016/j.rmal.2025.100265
Xian Zhang
This study demonstrates how to use confirmatory bifactor analysis (CbFA) and omega-family indices to evaluate the dimensionality of the Foreign Language Classroom Anxiety Scale (FLCAS). The FLCAS is generally regarded as having high reliability, supported by the high Cronbach's alpha values often reported in the literature. However, because the FLCAS has repeatedly been shown to be multidimensional, it remains unclear whether the scale can measure a general construct in the presence of that multidimensionality. Confirmatory bifactor modeling can be used to assess whether an instrument measures a single general psychological construct while accounting for multidimensionality. The model posits a general factor that explains the shared variance across all items, along with specific factors that capture the unique variance within subsets of items. With CbFA, the dimensionality of a factor structure can then be closely examined with statistics such as construct replicability, explained common variance, and omega-family indices (e.g., Reise, 2012). In this demonstration, I will show that a smaller subset of FLCAS items effectively measures the general FLA construct, with the general factor explaining the largest portion of the model-wise variance. I will then present recommendations for determining when aggregated scores from a reduced item set can reliably represent the general construct while preserving essential psychometric properties. Finally, I will discuss key considerations for applying and interpreting CbFA in foreign language research.
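The omega-family quantities named here are direct functions of the fitted loadings. The sketch below uses hypothetical standardized loadings (not FLCAS estimates) and assumes an orthogonal bifactor structure, computing omega total, omega hierarchical, and explained common variance (ECV).

```python
import numpy as np

# Hypothetical standardized loadings (NOT FLCAS estimates): 9 items,
# one general factor plus three orthogonal specific factors of 3 items each.
general  = np.array([.6, .6, .6, .5, .5, .5, .4, .4, .4])
specific = np.array([.3, .3, .3, .4, .4, .4, .5, .5, .5])
uniqueness = 1 - general**2 - specific**2

# Model-implied variance of the total score under an orthogonal bifactor model.
total_var = (general.sum()**2
             + sum(specific[i:i + 3].sum()**2 for i in range(0, 9, 3))
             + uniqueness.sum())

omega_total = (total_var - uniqueness.sum()) / total_var  # all common factors
omega_h = general.sum()**2 / total_var                    # general factor only
ecv = (general**2).sum() / ((general**2).sum() + (specific**2).sum())

print(f"omega_total = {omega_total:.3f}, omega_h = {omega_h:.3f}, ECV = {ecv:.3f}")
```

A large gap between omega total and omega hierarchical signals that much of the reliable variance is due to specific factors rather than the general construct.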
Citations: 0
Revisiting the impact of errors in L2/FL writing: A methodological review of five decades of research
Pub Date : 2025-10-09 DOI: 10.1016/j.rmal.2025.100269
Wilson Cheong Hin Hong
Despite five decades of research into error gravity (EG) in writing, the field remains characterized by contradictory findings and limited practical application. This methodological review critically examines the methods used and their impact on findings across this fragmented research landscape. Through a two-phase approach—a selective review of studies from 1970–2015 (n = 21) and a PRISMA review of 2016–2025 research (n = 16)—four key constructs that have shaped the field are identified: reader perceptions, comprehension, awareness/sensitivity, and processing effort. The analyses reveal significant methodological limitations that have hindered progress, including over-reliance on subjective assessments, inconsistent error categorization, limited participant representativeness, and a lack of theoretical grounding. The research trajectory shows a shift in focus from comprehension in early studies to reader preferences in the 2000s, followed by renewed interest in communication effectiveness and the emergence of processing effort as a novel yet critical construct, despite the persistent dominance of subjective ratings. Recommendations for future research are proposed, including potential theories to frame studies, the adoption of direct and validated methods, and a shift of focus to non-teacher participant populations. Only with genuine methodological advancement can this strand of research meaningfully inform L2/FL pedagogy and curriculum, providing evidence-based guidance for prioritizing certain L2 issues in instruction.
Citations: 0
Evaluating LLMs as proxies for humans in psycholinguistic ratings: A comparison of statistical knowledge
Pub Date : 2025-10-08 DOI: 10.1016/j.rmal.2025.100274
Yanlu Zhong, Simon Todd, Nicole Xu, Laurel Brehm
Psycholinguistic research has traditionally relied on human ratings for stimulus norming, but whether large language models (LLMs) can reliably replace human ratings remains uncertain. This study compares human participants and three LLMs—one proprietary model (ChatGPT-4o) and two open-source models (LLaMA-3.3–70B and DeepSeek-V3)—with respect to their statistical knowledge of English binomials. For each binomial, we obtained ratings of frequency, dispersion, forward association strength, and backward association strength from 34 human participants and from 30 output samples per LLM. We examined rating-to-corpus consistency (consistency_rating2corpus), the sensitivity of statistical ratings to corpus data, and the influence of other psycholinguistic factors on ratings. All LLMs’ statistical knowledge broadly mirrored that of humans. Ratings from both groups were sensitive to corpus data but not fully consistent with it. Frequency showed the highest consistency_rating2corpus, whereas dispersion showed the lowest consistency_rating2corpus and the weakest sensitivity. LLM ratings were also influenced by word-level cues. Nonetheless, LLM ratings showed greater consistency_rating2corpus, heightened sensitivity, and stronger reliance on other psycholinguistic cues than human ratings. Overall, while LLMs’ performance generally aligned with that of humans, their internal statistical representations differed significantly from human cognition. The three LLMs also showed variation in their rating behavior. Thus, although multi-LLM ratings can aid pilot studies in psycholinguistics, they should not replace human ratings in formal experiments.
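Rating-to-corpus consistency of this kind is typically operationalized as a rank correlation between mean ratings and a corpus statistic. The sketch below uses hypothetical binomial data (the study's actual items and values are not reproduced here) and a tie-free Spearman correlation built from NumPy alone.

```python
import numpy as np

def spearman(a, b):
    """Spearman rank correlation (no ties) via Pearson correlation of ranks."""
    rank = lambda v: np.argsort(np.argsort(v))
    return np.corrcoef(rank(a), rank(b))[0, 1]

# Hypothetical values: mean human ratings for five binomials and the
# log frequency of each binomial in a reference corpus.
mean_ratings    = np.array([6.1, 5.4, 4.8, 3.9, 2.2])
log_corpus_freq = np.array([9.2, 8.1, 8.4, 6.0, 4.5])

consistency = spearman(mean_ratings, log_corpus_freq)
print(f"consistency_rating2corpus = {consistency:.2f}")
```

With real data one would compute this per rating dimension (frequency, dispersion, association strength) and compare the coefficients across humans and each LLM.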
Citations: 0
Methodological rigor in quantitative L2 research: A focus on interventionist experimental studies
Pub Date : 2025-10-08 DOI: 10.1016/j.rmal.2025.100273
Akbar A. Jahanbakhsh, Zahra Banitalebi, Jenifer Larson-Hall, Aya Shiiba
A growing body of research highlights the importance of robust methodology in ensuring the validity and reliability of research findings; yet concerns remain about the quality of quantitative studies within second language (L2) research. To address this gap, this study systematically analyzed a corpus of quantitative articles to assess the extent to which they adhere to established methodological best practices. The corpus comprises 791 interventionist quantitative articles published over 12 years in 8 journals selected for their high impact factor and relevance to key areas within applied linguistics. A detailed coding scheme was developed to evaluate the articles along several crucial methodological dimensions, including sampling and design issues, types of statistical analyses, the statistical assumptions that need to be checked, reporting practices, and visual presentation of data. The findings reveal that while improvements were evident in some areas, such as design-related issues and reporting practices, attention to sampling issues such as power analysis, to data sharing, and to data-rich, accountable visuals remains scarce, with no significant improvement over time. These results highlight areas where greater methodological rigor is needed to enhance the credibility and generalizability of quantitative research in applied linguistics. The study promotes best practices in research design and reporting, informs the development of guidelines for future research, and fosters a more critical and reflective approach to interpreting quantitative findings in the field.
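One of the neglected practices flagged here, a priori power analysis, can be sketched in a few lines. The function below uses a normal approximation to the two-sided, two-sample t-test (an assumption made for brevity; dedicated tools such as G*Power compute exact noncentral-t power).

```python
from statistics import NormalDist

def power_two_sample(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample t-test for effect size
    Cohen's d, using the normal approximation to the t distribution."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    ncp = d * (n_per_group / 2) ** 0.5   # noncentrality parameter
    return (1 - nd.cdf(z_crit - ncp)) + nd.cdf(-z_crit - ncp)

# A "medium" effect (d = 0.5) with 20 learners per group is badly
# underpowered; roughly 64 per group reach the conventional 0.80.
print(round(power_two_sample(0.5, 20), 2))
print(round(power_two_sample(0.5, 64), 2))
```

Running such a check before data collection is exactly the kind of sampling-stage rigor the corpus analysis found to be rare.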
Citations: 0
Bibliographic analysis in solving pitch doubling issues
Pub Date : 2025-09-30 DOI: 10.1016/j.rmal.2025.100271
Xiuming Wang, Shan Chen, Yuanzhao Ding
Pitch doubling is a pitch detection phenomenon in which an algorithm incorrectly identifies the frequency of a note as either double or half of its actual value, representing one of the major pitfalls for pitch detection accuracy. To review the literature on pitch doubling, this study searched the Web of Science Core Collection and systematically filtered relevant studies. Using VOSviewer bibliometric visualization, the research examined trends based on keywords, institutions, and countries or regions in the pitch doubling research field. Drawing on seminal contributions, the paper describes the underlying causes of pitch doubling (e.g., harmonic interference) and reviews existing mitigation methods (e.g., improved pitch detection algorithms). The weaknesses of current approaches are identified, and conclusions are provided to inform the development of more effective solutions to pitch doubling.
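A common family of mitigations the review points to can be illustrated with a simple post-processing pass: frames whose estimated f0 sits near twice (or half) the track's median are corrected by an octave. This is a generic sketch of the idea, not a method from any specific paper surveyed.

```python
import numpy as np

def correct_octave_errors(f0, tolerance=0.12):
    """Halve frames near 2x the track median (pitch doubling) and double
    frames near 0.5x (pitch halving). Unvoiced frames (f0 <= 0) are kept."""
    f0 = np.asarray(f0, dtype=float).copy()
    voiced = f0 > 0
    med = np.median(f0[voiced])
    for i in np.where(voiced)[0]:
        if abs(f0[i] / (2 * med) - 1) < tolerance:
            f0[i] /= 2          # octave-up error: pitch doubling
        elif abs(f0[i] / (med / 2) - 1) < tolerance:
            f0[i] *= 2          # octave-down error: pitch halving
    return f0

# One doubled frame (398 Hz) and one halved frame (100 Hz) in a ~200 Hz track:
track = [200.0, 201.0, 398.0, 199.0, 100.0, 202.0]
print(correct_octave_errors(track))
```

Median-based correction is robust for fairly flat pitch tracks; for wide-range speech or singing, dynamic-programming smoothing over candidate octaves is the usual refinement.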
Citations: 0
Validating the Japanese version of the short-form foreign language enjoyment scale
Pub Date : 2025-09-30 DOI: 10.1016/j.rmal.2025.100270
Larry Xethakis, Michael Rupp, Oliver Edwards, Mark Howarth, Toshikazu Kawagoe
The study of positive emotions and their influence on language learning has gained considerable attention recently, with foreign language enjoyment (FLE) being one of the most-studied emotions. The Short-form Foreign Language Enjoyment Scale (S-FLES) is a popular measure of enjoyment; however, it has yet to be validated for use in the Japanese context. This study aimed to address this gap by comparing hierarchical and bifactor confirmatory factor analysis (CFA) models with analogous models employing the innovative technique of exploratory structural equation modeling (ESEM). Responses from 536 undergraduate EFL learners were used in the analysis, with results indicating that the fit of the ESEM models was superior to that of the CFA models. The bifactor ESEM was chosen as the most suitable model of the S-FLES on the basis of its better convergent validity, divergent validity, and reliability, as well as its measurement quality. Invariance testing supported the bifactor model’s configural invariance, as well as its partial metric and scalar invariance across gender. The relationship between the bifactor model and social-behavioral engagement was evaluated as a measure of the S-FLES’s concurrent validity. The model exhibited a very strong degree of predictive power, with the general factor accounting for the greatest share of variance in social-behavioral engagement. The bifactor model of the S-FLES was shown to be a valid and reliable measure of FLE among Japanese undergraduate EFL learners, providing further support for the use of ESEM in evaluating positive psychology instruments.
Citations: 0
Designing and validating an AI-supported tool for enhancing critical inquiry in EFL education
Pub Date : 2025-09-30 DOI: 10.1016/j.rmal.2025.100266
Wan Yee Winsy Lai, Paul Kim, Ju Seong Lee
As Generative AI (GenAI) technologies advance rapidly, educational settings face an urgent need for targeted interventions to cultivate learners’ critical, higher-order inquiry skills, so they can effectively navigate, assess, and apply AI-generated content. The urgency of this imperative is magnified for EFL learners in test-driven educational contexts that foster passive learning behaviors, discourage questioning, and inhibit critical thinking. To address these issues, we developed an AI-powered tool designed to evaluate questions based on Bloom’s Taxonomy, a six-level framework of cognitive processes, ranging from basic recall questions (Level 1) to advanced questions that trigger creative and evaluative thinking (Level 5). In study 1, the reliability of the tool was confirmed through multiple inter-rater tests with strong agreement. In study 2, we implemented an intervention program that integrated Bloom’s Taxonomy, targeted readings, group discussions, and sharing to enhance inquiry skills among EFL undergraduate students. Four statistical analyses in SPSS 29.0—including ICC for inter-rater reliability, Pearson correlation, and regression—were conducted to validate the AI-powered inquiry evaluation tool. Across 174 questions, students’ average inquiry level improved from 3.3 to 4.1 (on a five-level scale), showing a significant 0.8-level increase and meaningful enhancement in question quality. The study provides solid evidence of the reliability and validity of the AI-powered inquiry evaluation tool as an objective, real-time method that enhances the efficiency, consistency, and scalability of assessments, offering valuable guidance for EFL practitioners, curriculum designers, researchers, educators, and institutions in integrating evidence-based, inquiry-driven tools into EFL programs.
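The inter-rater ICC reported here can be computed directly from a two-way ANOVA decomposition. The sketch below implements ICC(2,1) (two-way random effects, absolute agreement, single rater) on a small hypothetical rating matrix; the abstract does not specify which ICC form the authors used, so the choice is illustrative.

```python
import numpy as np

def icc2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.
    `ratings` is an (n_subjects, k_raters) matrix of scores."""
    r = np.asarray(ratings, dtype=float)
    n, k = r.shape
    grand = r.mean()
    ss_rows = k * ((r.mean(axis=1) - grand) ** 2).sum()   # subjects (questions)
    ss_cols = n * ((r.mean(axis=0) - grand) ** 2).sum()   # raters
    ss_err = ((r - grand) ** 2).sum() - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Two hypothetical raters scoring six questions on Bloom levels 1-5:
scores = [[1, 2], [3, 3], [4, 5], [5, 5], [2, 2], [4, 4]]
print(round(icc2_1(scores), 3))
```

Perfect agreement yields an ICC of 1.0; values above roughly 0.75 are conventionally read as good agreement between the AI tool and human raters.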
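The abstract above reports validating the tool with inter-rater reliability statistics (ICC) and Pearson correlation against human ratings of Bloom levels. As a minimal sketch of the latter check, the snippet below computes Pearson's r and exact agreement between a human rater and an automated rater from scratch; the ratings are invented for illustration and do not come from the study.

```python
# Illustrative sketch (hypothetical data): how closely an automated rater
# tracks a human rater on Bloom's Taxonomy levels (1-5).
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Invented ratings of ten student questions on the five-level scale.
human = [3, 4, 2, 5, 3, 4, 1, 5, 4, 3]
tool  = [3, 4, 3, 5, 3, 4, 2, 4, 4, 3]

r = pearson_r(human, tool)
exact_agreement = sum(h == t for h, t in zip(human, tool)) / len(human)
print(f"Pearson r = {r:.2f}, exact agreement = {exact_agreement:.0%}")
```

In practice one would report an ICC alongside r, since ICC penalises systematic over- or under-rating that a correlation coefficient ignores.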
Research Methods in Applied Linguistics, 4(3), Article 100266.
Citations: 0
Semantic prosody, categorisation and inter-rater reliability
Pub Date : 2025-09-29 DOI: 10.1016/j.rmal.2025.100264
Mathias Russnes
This article investigates the inter-rater reliability of established methods of categorising semantic prosody. Semantic prosody is a concept associated with corpus linguistics, which describes the tendency of seemingly neutral items to occur in particular evaluative contexts. In previous research on semantic prosody, there has been a heavy reliance on manual analysis of smaller samples, and because of this, questions have been raised about the stability of the established methods for categorisation. Furthermore, there is also a lack of consensus regarding how such categorisations should be operationalised. Traditionally, it has often been viewed in binary terms, distinguishing between positive and negative prosodies. However, this restricted system has also received criticism, and certain researchers have adopted a more comprehensive (or fine-grained) categorisation, more connected to a unit’s semantic preference. This paper aims to evaluate the inter-analyst consistency of these systems through two experimental studies, in which four researchers independently analyse the same set of random concordance lines of the items habit and views from BNC2014, applying both methods of categorisation. The results indicate that a binary distinction between positive and negative offers a higher inter-analyst consistency than a more fine-grained categorisation. Additionally, this more comprehensive system was also found to obscure the borders between semantic preference and semantic prosody. However, because neither system achieved satisfactory inter-rater agreement, both studies highlight the need for more objective methods of analysing and categorising semantic prosody.
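The study's core comparison — whether a binary positive/negative scheme yields higher inter-analyst consistency than a fine-grained one — can be sketched with Cohen's kappa, a standard chance-corrected agreement statistic. The annotations below are invented to illustrate the mechanics (raters who disagree only about intensity, not polarity), not taken from the article's data.

```python
# Illustrative sketch (invented annotations): agreement under a binary
# positive/negative prosody scheme versus a finer four-way scheme,
# using Cohen's kappa implemented from scratch.

def cohen_kappa(a, b):
    """Cohen's kappa for two raters' labels over the same items."""
    n = len(a)
    labels = set(a) | set(b)
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    p_exp = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    return 1.0 if p_exp == 1 else (p_obs - p_exp) / (1 - p_exp)

FINE_TO_BINARY = {
    "strongly_negative": "negative", "mildly_negative": "negative",
    "mildly_positive": "positive", "strongly_positive": "positive",
}

# Two hypothetical raters labelling ten concordance lines.
rater1 = ["strongly_negative", "mildly_negative", "mildly_positive",
          "strongly_positive", "mildly_negative", "strongly_negative",
          "mildly_positive", "strongly_positive", "mildly_negative",
          "strongly_positive"]
rater2 = ["mildly_negative", "mildly_negative", "strongly_positive",
          "strongly_positive", "strongly_negative", "strongly_negative",
          "mildly_positive", "mildly_positive", "mildly_negative",
          "strongly_positive"]

kappa_fine = cohen_kappa(rater1, rater2)
kappa_binary = cohen_kappa([FINE_TO_BINARY[x] for x in rater1],
                           [FINE_TO_BINARY[x] for x in rater2])
# These raters disagree only about intensity, so collapsing to binary
# polarity yields perfect agreement (kappa = 1.0), while the four-way
# scheme does not.
print(f"fine-grained kappa = {kappa_fine:.2f}, binary kappa = {kappa_binary:.2f}")
```

This mirrors the article's finding in miniature: collapsing categories removes the disagreements that live at the fine-grained boundaries, which is exactly why the authors caution that higher binary agreement may come at the cost of conflating semantic preference with semantic prosody.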
Research Methods in Applied Linguistics, 4(3), Article 100264.
Citations: 0
From monologic to dialogic: Conceptual and methodological issues in metadiscourse studies
Pub Date : 2025-09-26 DOI: 10.1016/j.rmal.2025.100268
Wen Xin, Lei Jiang
Metadiscourse has predominantly been studied in monologic written academic discourse, especially in research articles and student writing, largely due to the influence of two widely adopted models of metadiscourse developed by Hyland (2005) and Ädel (2006). In this article, we illustrate several conceptual and methodological challenges involved in implementing the two influential models into the genre of written feedback, a non-traditional, dialogic academic genre that both depends on and responds to another genre (student writing). We conclude by proposing potential pathways for addressing these conceptual and methodological challenges. The pathways may also be applicable to other written dialogic genres.
Research Methods in Applied Linguistics, 4(3), Article 100268.
Citations: 0