Weighting Content Specifications for the National Medical Licensing Examination via Group Analytic Hierarchy Process
Xiaomei Hong, Zhehan Jiang, Hanyu Liu, Fen Cai
Educational Measurement: Issues and Practice, 44(1), 7-17. https://doi.org/10.1111/emip.12620

Job and practice analysis is a commonly used method for determining examination content specifications. However, difficulties arise when many domains are present, as mainstream approaches do not fully adhere to the essence of the weighting process, namely a "comparison-evaluation-decision" framework for assigning percentage values to the content. Built on the principle of comparing multiple criteria when making decisions, the Analytic Hierarchy Process (AHP) offers a solution that circumvents this obstacle. We propose using an extended version of AHP, Group AHP (GAHP), to weight content specifications for standardized medical education assessment. Specifically, GAHP is integrated with the Delphi method and is expected to help exam developers incorporate feedback from diverse experienced physicians when determining content specifications for the National Medical Licensing Examination (NMLE) in China. We demonstrate the complete workflow of the proposed approach with an application to the NMLE.
Improving Instructional Decision-Making Using Diagnostic Classification Models
W. Jake Thompson, Amy K. Clark
Educational Measurement: Issues and Practice, 43(4), 146-156. https://doi.org/10.1111/emip.12619

In recent years, educators, administrators, policymakers, and measurement experts have called for assessments that help educators make better instructional decisions. One promising measurement approach for supporting instructional decision-making is the use of diagnostic classification models (DCMs). DCMs are flexible psychometric models that enable fine-grained reporting on the skills students have mastered. In this article, we describe how DCMs can be leveraged to support better decision-making. We first provide a high-level overview of DCMs. We then describe methods for reporting results from DCM-based assessments that support decision-making for different stakeholder groups. We close with considerations for implementing DCMs in an operational setting, including how they can inform decision-making at the state and local levels, and share future directions for research.
Item Response Theory Models for Polytomous Multidimensional Forced-Choice Items to Measure Construct Differentiation
Xuelan Qiu, Jimmy de la Torre, You-Gan Wang, Jinran Wu
Educational Measurement: Issues and Practice, 43(4), 157-168. https://doi.org/10.1111/emip.12621

Multidimensional forced-choice (MFC) items have been found useful for reducing response biases in personality assessments. However, conventional scoring methods for MFC items yield ipsative data, hindering wider application of the MFC format. Over the last decade, a number of item response theory (IRT) models have been developed, the majority of which address MFC items with binary responses. Yet MFC items with polytomous responses are more informative and have many applications. This paper develops a polytomous Rasch ipsative model (pRIM) that can handle ipsative data and yield estimates of construct differentiation, a latent trait describing the degree to which personality constructs (e.g., interests) are distinguished from one another. The pRIM and a simpler variant are applied to a career interest assessment containing four-category MFC items, and the resulting measures of interest differentiation are used for both intra- and interpersonal comparisons. Simulations examine parameter recovery under various conditions. The results show that the pRIM parameters can be recovered well, particularly when a complete linking design and a large sample are used. Implications and applications of the pRIM for personality assessment with MFC items are discussed.