Exploring examinees' responses to constructed response items with a supervised topic model

IF 1.5 3区 心理学 Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS British Journal of Mathematical & Statistical Psychology Pub Date : 2023-09-13 DOI:10.1111/bmsp.12319
Seohyun Kim, Zhenqiu Lu, Allan S. Cohen
{"title":"Exploring examinees' responses to constructed response items with a supervised topic model","authors":"Seohyun Kim,&nbsp;Zhenqiu Lu,&nbsp;Allan S. Cohen","doi":"10.1111/bmsp.12319","DOIUrl":null,"url":null,"abstract":"<p>Textual data are increasingly common in test data as many assessments include constructed response (CR) items as indicators of participants' understanding. The development of techniques based on natural language processing has made it possible for researchers to rapidly analyse large sets of textual data. One family of statistical techniques for this purpose are probabilistic topic models. Topic modelling is a technique for detecting the latent topic structure in a collection of documents and has been widely used to analyse texts in a variety of areas. The detected topics can reveal primary themes in the documents, and the relative use of topics can be useful in investigating the variability of the documents. Supervised latent Dirichlet allocation (SLDA) is a popular topic model in that family that jointly models textual data and paired responses such as could occur with participants' textual answers to CR items and their rubric-based scores. SLDA has an assumption of a homogeneous relationship between textual data and paired responses across all documents. This approach, while useful for some purposes, may not be satisfied for situations in which a population has subgroups that have different relationships. In this study, we introduce a new supervised topic model that incorporates finite-mixture modelling into the SLDA. This new model can detect latent groups of participants that have different relationships between their textual responses and associated scores. The model is illustrated with an example from an analysis of a set of textual responses and paired scores from a middle grades assessment of science inquiry knowledge. A simulation study is presented to investigate the performance of the proposed model under practical testing conditions.</p>","PeriodicalId":55322,"journal":{"name":"British Journal of Mathematical & Statistical Psychology","volume":null,"pages":null},"PeriodicalIF":1.5000,"publicationDate":"2023-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/bmsp.12319","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"British Journal of Mathematical & Statistical Psychology","FirstCategoryId":"102","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/bmsp.12319","RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"MATHEMATICS, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
引用次数: 0

Abstract

Textual data are increasingly common in test data as many assessments include constructed response (CR) items as indicators of participants' understanding. The development of techniques based on natural language processing has made it possible for researchers to rapidly analyse large sets of textual data. One family of statistical techniques for this purpose are probabilistic topic models. Topic modelling is a technique for detecting the latent topic structure in a collection of documents and has been widely used to analyse texts in a variety of areas. The detected topics can reveal primary themes in the documents, and the relative use of topics can be useful in investigating the variability of the documents. Supervised latent Dirichlet allocation (SLDA) is a popular topic model in that family that jointly models textual data and paired responses such as could occur with participants' textual answers to CR items and their rubric-based scores. SLDA has an assumption of a homogeneous relationship between textual data and paired responses across all documents. This approach, while useful for some purposes, may not be satisfied for situations in which a population has subgroups that have different relationships. In this study, we introduce a new supervised topic model that incorporates finite-mixture modelling into the SLDA. This new model can detect latent groups of participants that have different relationships between their textual responses and associated scores. The model is illustrated with an example from an analysis of a set of textual responses and paired scores from a middle grades assessment of science inquiry knowledge. A simulation study is presented to investigate the performance of the proposed model under practical testing conditions.

Abstract Image

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
用监督话题模型探索考生对构建的回答项目的反应。
文本数据在测试数据中越来越普遍,因为许多评估包括构建反应(CR)项目作为参与者理解的指标。基于自然语言处理技术的发展使研究人员能够快速分析大量文本数据。用于此目的的一类统计技术是概率主题模型。主题建模是一种检测文档集合中潜在主题结构的技术,已被广泛用于分析各个领域的文本。检测到的主题可以揭示文档中的主要主题,并且主题的相对使用可以用于调查文档的可变性。监督潜狄利克雷分配(SLDA)是该家族中流行的主题模型,它联合建模文本数据和配对反应,例如参与者对CR项目的文本答案及其基于规则的分数。SLDA假设所有文档中的文本数据和成对响应之间存在同构关系。这种方法虽然对某些目的有用,但可能不适用于总体中具有不同关系的子组的情况。在本研究中,我们引入了一种新的监督主题模型,该模型将有限混合模型引入到SLDA中。这个新模型可以检测潜在的参与者群体,他们的文本回复和相关分数之间有不同的关系。该模型是由一个例子,从一组文本回应的分析和配对分数从科学探究知识的中级评估。通过仿真研究,验证了该模型在实际测试条件下的性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
CiteScore
5.00
自引率
3.80%
发文量
34
审稿时长
>12 weeks
期刊介绍: The British Journal of Mathematical and Statistical Psychology publishes articles relating to areas of psychology which have a greater mathematical or statistical aspect of their argument than is usually acceptable to other journals including: • mathematical psychology • statistics • psychometrics • decision making • psychophysics • classification • relevant areas of mathematics, computing and computer software These include articles that address substantitive psychological issues or that develop and extend techniques useful to psychologists. New models for psychological processes, new approaches to existing data, critiques of existing models and improved algorithms for estimating the parameters of a model are examples of articles which may be favoured.
期刊最新文献
Average treatment effects on binary outcomes with stochastic covariates. Are alternative variables in a set differently associated with a target variable? Statistical tests and practical advice for dealing with dependent correlations. Determining the number of attributes in the GDINA model. Nonparametric CD-CAT for multiple-choice items: Item selection method and Q-optimality. Incorporating calibration errors in oral reading fluency scoring.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1