A note on the use of rank-ordered logit models for ordered response categories
Timothy R. Johnson
British Journal of Mathematical & Statistical Psychology, 76(1), 236–256. Published 2022-11-03. DOI: 10.1111/bmsp.12292

Models for rankings have been shown to produce more efficient estimators than comparable models for first/top choices. Discussions and applications of these models typically consider only unordered alternatives, but they can be usefully adapted to the case where a respondent ranks a set of alternatives that are ordered response categories. This paper proposes eliciting a rank order that is consistent with the ordering of the response categories, and then modelling the observed rankings using a variant of the rank-ordered logit model in which the distribution of rankings is truncated to the set of admissible rankings. This yields lower standard errors than when respondents select only a single top category, and the restrictions on the set of admissible rankings reduce the number of decisions respondents must make compared with ranking a set of unordered alternatives. Simulation studies and application examples featuring models based on a stereotype regression model and a rating scale item response model demonstrate the utility of this approach.
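The truncated rank-ordered logit at the heart of this proposal can be sketched numerically. The snippet below computes exploded-logit probabilities for full rankings and renormalizes them over an admissible set. The "unimodal" restriction used here (each newly ranked category must be adjacent to the block of categories already ranked) is an illustrative assumption, not necessarily the paper's exact admissibility condition, and the utilities `u` are arbitrary toy values.

```python
import itertools
import math

def exploded_logit_prob(ranking, utilities):
    """Rank-ordered (exploded) logit probability of a full ranking.
    `ranking` lists category indices from most to least preferred."""
    p = 1.0
    remaining = list(ranking)
    while len(remaining) > 1:
        denom = sum(math.exp(utilities[c]) for c in remaining)
        p *= math.exp(utilities[remaining[0]]) / denom
        remaining.pop(0)
    return p

def truncated_prob(ranking, utilities, admissible):
    """Probability of `ranking` after truncating the exploded-logit
    distribution to the admissible set."""
    norm = sum(exploded_logit_prob(r, utilities) for r in admissible)
    return exploded_logit_prob(tuple(ranking), utilities) / norm

def is_unimodal(ranking):
    """Illustrative admissibility rule: each successive choice extends
    the contiguous block of already-ranked ordered categories."""
    lo = hi = ranking[0]
    for c in ranking[1:]:
        if c == lo - 1:
            lo = c
        elif c == hi + 1:
            hi = c
        else:
            return False
    return True

J = 4  # number of ordered response categories
admissible = [r for r in itertools.permutations(range(J)) if is_unimodal(r)]
u = [0.2, 0.8, 0.5, -0.3]
probs = [truncated_prob(r, u, admissible) for r in admissible]
print(len(admissible))            # 2^(J-1) = 8 unimodal rankings
print(abs(sum(probs) - 1.0) < 1e-12)
```

Note how truncation shrinks the support from 4! = 24 rankings to 8, which is what reduces the respondent's decision burden relative to ranking unordered alternatives.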
Subtask analysis of process data through a predictive model
Zhi Wang, Xueying Tang, Jingchen Liu, Zhiliang Ying
British Journal of Mathematical & Statistical Psychology, 76(1), 211–235. Published 2022-11-01. DOI: 10.1111/bmsp.12290

Response process data collected from human–computer interactive items contain detailed information about respondents' behavioural patterns and cognitive processes, and are valuable sources for analysing respondents' problem-solving strategies. However, the irregular data format and complex structure make standard statistical tools difficult to apply. This article develops a computationally efficient method for exploratory analysis of such process data. The new approach segments a lengthy individual process into a sequence of short subprocesses to achieve complexity reduction, easy clustering and meaningful interpretation. Each subprocess is considered a subtask. The segmentation is based on sequential action predictability, using a parsimonious predictive model combined with the Shannon entropy. Simulation studies are conducted to assess the performance of the new method, and a case study of PIAAC 2012 demonstrates how exploratory analysis of process data can be carried out with the new approach.
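The segmentation idea, cutting an action sequence wherever the next action is hard to predict, can be illustrated with a deliberately simple stand-in for the paper's predictive model: a first-order Markov (bigram) model whose next-action Shannon entropy is thresholded. The bigram model, the threshold value and the toy sequences are all assumptions for illustration, not the paper's actual parsimonious model.

```python
import math
from collections import Counter, defaultdict

def fit_bigram(sequences):
    """First-order Markov (bigram) model of next-action probabilities --
    a simple stand-in for a parsimonious predictive model."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return {a: {b: n / sum(c.values()) for b, n in c.items()}
            for a, c in counts.items()}

def entropy(dist):
    """Shannon entropy (bits) of a predictive distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def segment(seq, model, threshold):
    """Cut the action sequence wherever the next action was hard to
    predict (entropy above `threshold`), yielding candidate subtasks."""
    pieces, start = [], 0
    for i in range(len(seq) - 1):
        if entropy(model.get(seq[i], {})) > threshold:
            pieces.append(seq[start:i + 1])
            start = i + 1
    pieces.append(seq[start:])
    return pieces

# Toy training processes: 'abc' is always followed by one of two routines.
train = [list("abcxyz"), list("abcpqr"), list("abcxyz")]
model = fit_bigram(train)
print(segment(list("abcxyz"), model, 0.5))  # [['a','b','c'], ['x','y','z']]
```

The cut falls after `c` because the model is uncertain whether `x` or `p` comes next (entropy ≈ 0.92 bits), exactly the kind of low-predictability boundary the method treats as a subtask break.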
Two efficient selection methods for high-dimensional CD-CAT utilizing max-marginals factor from MAP query and ensemble learning approach
Fen Luo, Xiaoqing Wang, Yan Cai, Dongbo Tu
British Journal of Mathematical & Statistical Psychology, 76(2), 283–311. Published 2022-10-26. DOI: 10.1111/bmsp.12288

Computerized adaptive testing for cognitive diagnosis (CD-CAT) needs to be efficient and responsive in real time to meet the requirements of practical applications. For high-dimensional data, the number of categories to be recognized in a test grows exponentially as the number of attributes increases, which can make system reaction time so long that it adversely affects examinees and seriously impairs measurement efficiency. More importantly, the lengthy CPU operations and heavy memory usage of item selection in CD-CAT due to intensive computation are impractical and cannot fully meet practical needs. This paper proposes two new efficient selection strategies (HIA and CEL) for high-dimensional CD-CAT to address this issue, by incorporating the max-marginals from the maximum a posteriori (MAP) query and by integrating an ensemble learning approach into previous efficient selection methods, respectively. The performance of the proposed selection methods was compared with conventional selection methods using simulated and real item pools. The results show that the proposed methods can significantly improve measurement efficiency, requiring about 1/2–1/200 of the conventional methods' computation time while retaining similar measurement accuracy. As the number of attributes and the size of the item pool increase, the computation-time advantage of the proposed methods becomes more pronounced.
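A minimal sketch of the max-marginal idea underlying a MAP query: assuming a (log-)posterior over all 2^K binary attribute patterns, the max-marginal for attribute k and value v is the largest joint posterior among patterns with alpha_k = v, and when the maximizer is unique the per-attribute argmax recovers the MAP pattern. The toy log-posterior below is random; the paper's actual item-selection machinery built on these factors is not reproduced here.

```python
import itertools
import numpy as np

def max_marginals(log_post, patterns):
    """Max-marginal of each attribute: for attribute k and value v, the
    largest joint log-posterior among patterns with alpha_k = v."""
    K = patterns.shape[1]
    mm = np.empty((K, 2))
    for k in range(K):
        for v in (0, 1):
            mm[k, v] = log_post[patterns[:, k] == v].max()
    return mm

K = 3
patterns = np.array(list(itertools.product([0, 1], repeat=K)))
rng = np.random.default_rng(1)
log_post = rng.normal(size=2 ** K)   # toy unnormalized log-posterior
mm = max_marginals(log_post, patterns)

# With a unique maximizer, the per-attribute argmax of the max-marginals
# recovers the MAP attribute pattern without rescanning all 2^K patterns.
map_from_mm = mm.argmax(axis=1)
print(np.array_equal(map_from_mm, patterns[log_post.argmax()]))  # True
```

This factorization is what lets a selection rule consult K small tables instead of the full 2^K-cell posterior at each step, which is where the computation-time savings come from as K grows.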
Dungang Liu, Xiaorui Zhu, Brandon Greenwell, Zewei Lin

Probit models are used extensively for inferential purposes in the social sciences, as discrete data are prevalent in a vast body of social studies. Among the many accompanying model-inference problems, a critical question remains unsettled: how to develop a goodness-of-fit measure that resembles the ordinary least squares (OLS) R2 used for linear models. Such a measure has long been sought to achieve 'comparability' of different empirical models across multiple samples addressing similar social questions. To this end, we propose a novel R2 measure for probit models using the notion of surrogacy – simulating a continuous variable