Pub Date: 2023-03-21. DOI: 10.1027/1015-5759/a000749
J. Veerbeek, B. Vogelaar
Abstract: The study investigated the value of process data obtained from a group-administered computerized dynamic test of analogical reasoning, consisting of a pretest-training-posttest design. We sought to evaluate the effects of training on processes and performance, and the relationships between process measures and performance on the dynamic test. Participants were N = 86 primary school children (Mage = 8.11 years, SD = 0.63). The test consisted of constructed-response geometrical analogy items, requiring several actions to construct an answer. Process data enabled scoring of the total time, the time taken for initial planning of the task, the time taken for checking the answer that was provided, and variation in solving time. Training led to improved performance compared to repeated practice, but this improvement was not reflected in task-solving processes. Almost all process measures were related to performance, but the effects of training or repeated practice on this relationship differed widely between measures. In conclusion, the findings seemed to indicate that investigating process indicators within computerized dynamic testing of analogical reasoning ability provided information about children’s learning processes, but that not all processes were affected in the same way by training.
Title: Computerized Process-Oriented Dynamic Testing of Children’s Ability to Reason by Analogy Using Log Data (European Journal of Psychological Assessment)
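The four timing measures named in the abstract can be sketched directly from an item's action log. A minimal illustration, assuming a hypothetical log of action timestamps (the paper does not publish its log schema); "variation in solving time" is computed here across a child's items, one plausible operationalization:

```python
from statistics import pstdev

def item_measures(action_times, submit_time):
    """Timing indicators for one constructed-response analogy item.

    `action_times` are seconds from item onset at which the child placed or
    changed an answer element; `submit_time` is when the answer was
    confirmed. The log schema is hypothetical.
    """
    return {
        "total": submit_time,                        # onset to submission
        "planning": action_times[0],                 # latency before the first action
        "checking": submit_time - action_times[-1],  # pause after the last action
    }

def solving_time_variation(per_item_totals):
    """Variation in solving time across a child's items."""
    return pstdev(per_item_totals)

print(item_measures([4.2, 9.0, 15.1], 21.0))
```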
Pub Date: 2023-03-21. DOI: 10.1027/1015-5759/a000748
B. Zumbo, B. Maddox, Naomi M. Care
Abstract: There is no consensus among assessment researchers about many of the central problems of response process data, including what it is and what it comprises. The Standards for Educational and Psychological Testing (American Educational Research Association et al., 2014) locate process data within their five sources of validity evidence. However, we rarely see a conceptualization of response processes; rather, the focus is on the techniques and methods of assembling response process indices or statistical models. The method often overrides clear definitions, and, as a field, we may therefore conflate method and methodology, much as we have conflated validity and validation (Zumbo, 2007). In this paper, we aim to clear the conceptual ground to explore the scope of a holistic framework for the validation of process and product. We review prominent conceptualizations of response processes and their sources and explore some fundamental questions: Should we make a theoretical and practical distinction between response processes and response data? To what extent do the uses of process data reflect the principles of deliberate, educational, and psychological measurement? To answer these questions, we consider the case of item response times and the potential for variation associated with disability and neurodiversity.
Title: Process and Product in Computer-Based Assessments
Pub Date: 2023-03-21. DOI: 10.1027/1015-5759/a000758
A. Pokropek, Tomasz Żółtak, M. Muszyński
Abstract: Web surveys offer new research possibilities, but they also have specific problems. One of them is a higher risk of careless, inattentive, or otherwise invalid responses. Using paradata, that is, data collected apart from the response data itself, is one potential tool for screening out problematic responses in web-based surveys. One of the most promising forms of paradata is the movement, or trajectory, of the cursor in making a response. This study constructed indicators from such data, presented correlations between them, and provided an interpretation and validation of these components by correlating them with previously known indices of careless responding. Finally, it tested cursor movement indices during different motivational states induced by experimental instructions. Cursor movement indices proved to be moderately related to classical careless responding indices, but some of them (horizontal distance traveled as well as speed and acceleration in the vertical dimension) were as responsive to the manipulation conditions as the classical indices. The potential role of cursor movement indices in survey practice and future studies in this area are discussed.
Title: Mouse Chase
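Trajectory indices of the kind the abstract mentions can be derived from timestamped cursor samples. A sketch under stated assumptions: the index definitions below (per-axis distance, peak vertical speed and acceleration) approximate the kinds of measures the study describes, not the authors' exact formulas:

```python
def cursor_indices(samples):
    """Cursor-movement indices from (t, x, y) samples for one survey page.

    `samples` are (time_seconds, x_pixels, y_pixels) tuples in temporal
    order. Definitions are illustrative, not the study's published ones.
    """
    dx = [abs(b[1] - a[1]) for a, b in zip(samples, samples[1:])]
    dy = [abs(b[2] - a[2]) for a, b in zip(samples, samples[1:])]
    dt = [b[0] - a[0] for a, b in zip(samples, samples[1:])]
    speeds_y = [d / t for d, t in zip(dy, dt)]  # vertical speed per step
    accel_y = [abs(s2 - s1) / t for s1, s2, t in zip(speeds_y, speeds_y[1:], dt[1:])]
    return {
        "horizontal_distance": sum(dx),
        "vertical_distance": sum(dy),
        "peak_vertical_speed": max(speeds_y),
        "peak_vertical_accel": max(accel_y) if accel_y else 0.0,
    }
```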
Pub Date: 2023-03-17. DOI: 10.1027/1015-5759/a000763
Sinja Müser, J. Fleischer, Olga Kunina-Habenicht, D. Leutner
Abstract: Teacher students’ professional educational knowledge is of great importance in academic teacher education. In response to the need to continuously optimize and improve teacher education, we developed a standards-based test instrument designed along the Standards of Teacher Education of the German education administration. The ESBW (Essen Test for the Assessment of Standards-Based Educational Knowledge) is intended to assess educational knowledge as it is defined in these standards. This Brief Report investigates whether the ESBW, as an exclusively standards-based test, can be empirically distinguished from a similar test not originally developed from the standards, the BilWiss 2.0 test, which also partially covers the standards. Competing structural equation models based on a study with 216 teacher students revealed that the ESBW short scale can be empirically distinguished from the BilWiss 2.0 short version, indicating that the two instruments partly measure different aspects of educational knowledge. In addition, an examination of measurement invariance revealed that the ESBW performed similarly well for beginning and advanced teacher students. Thus, our results further underline the usefulness of the ESBW for the assessment and evaluation of the German Standards of Teacher Education.
Title: The ESBW Short Scale
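A standard way to adjudicate between the competing structural equation models the report mentions is a chi-square difference (likelihood-ratio) test between nested specifications, e.g., a single educational-knowledge factor versus separate correlated ESBW and BilWiss 2.0 factors. A sketch of the common 1-df case, with made-up fit values rather than the report's actual statistics:

```python
import math

def chi2_sf_df1(x):
    """Survival function P(X > x) of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

def chi_square_difference(chi2_restricted, df_restricted, chi2_free, df_free):
    """Likelihood-ratio comparison of nested SEMs (1-df case only).

    The restricted model (one common factor) is nested in the freer model
    (two correlated factors); a small p-value favors keeping the two
    knowledge factors distinct.
    """
    delta = chi2_restricted - chi2_free
    ddf = df_restricted - df_free
    if ddf != 1:
        raise ValueError("this sketch only handles a 1-df comparison")
    return delta, ddf, chi2_sf_df1(delta)
```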
Pub Date: 2023-03-01. DOI: 10.1027/1015-5759/a000764
Matthias Ziegler, D. Iliescu
Title: Measurement Does Not Take Place in a Legal Vacuum
Pub Date: 2023-03-01. DOI: 10.1027/1015-5759/a000691
Natalie Förster, Jörg-Tobias Kuhn
Abstract: To monitor students’ progress and adapt instruction to students’ needs, teachers increasingly use repeated assessments with equivalent tests. The present study investigates whether equivalent reading tests can be successfully developed via rule-based item design. Based on theoretical considerations, we identified three item features each for reading comprehension at the word, sentence, and text levels, which should influence the difficulty and time intensity of reading processes. Using optimal design algorithms, a design matrix was calculated, and four equivalent test forms of the German reading test series for second graders (quop-L2) were developed. A total of N = 7,751 students completed the tests. We estimated item difficulty and time intensity parameters as well as person ability and speed parameters using bivariate item response theory (IRT) models, and we investigated the influence of item features on item parameters. Results indicate that all item properties significantly affected either item difficulty or response time. Moreover, as indicated by the IRT-based test information functions and analyses of variance, the four test forms showed similar levels of difficulty and time intensity at the word, sentence, and text levels (all η² < .002). Results were successfully cross-validated using a sample of N = 5,654 students.
Title: Ice Is Hot and Water Is Dry
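The claimed link between coded item features and item parameters can be checked with an LLTM-style decomposition: regress the estimated IRT difficulties on the rule-based design matrix. A sketch with invented feature codes and difficulty values (the actual quop-L2 features and parameters are not reproduced here):

```python
import numpy as np

# Hypothetical design matrix: rows = items; columns = intercept plus three
# 0/1 feature codes standing in for word-, sentence-, and text-level rules.
X = np.array([
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 1],
])
# IRT difficulty estimates for the six items (made-up numbers).
b = np.array([-1.0, -0.4, -0.2, 0.4, 0.1, 0.7])

# Least-squares feature weights: how much each feature adds to difficulty.
weights, *_ = np.linalg.lstsq(X, b, rcond=None)
print(weights)
```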
Pub Date: 2023-02-01. DOI: 10.1027/1015-5759/a000687
M. Bäckström, F. Björklund, R. Maddux, M. Lindén
Abstract. Personality is usually measured by means of self-ratings. Despite some drawbacks, the method is here to stay, and improving on it, particularly regarding social desirability, is essential. One way to do this is evaluative neutralization, that is, to rephrase items such that it is less obvious to the respondent what would be a desirable response. We present a 120-item evaluatively neutralized five-factor inventory and compare it to the IPIP-NEO (Goldberg et al., 2006). Psychometric analyses revealed that the new inventory has high factor homogeneity, relatively independent facets with acceptable homogeneity and normally distributed ratings, and relatively evaluatively neutral ratings (as indicated by the level of item popularity). In sum, this new inventory captures the same personality variance as other five-factor inventories but with less influence from individual differences in evaluative responding, resulting in less correlation between factors and a factor structure more in line with the simple structure model than many other five-factor inventories. Evaluatively neutralized inventories should be particularly useful when the factor structure is central to the research question and focuses on discriminant validity, such as identifying theoretically valid relationships between personality traits and other concepts.
Title: The NB5I
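The item-popularity indicator the abstract mentions can be screened mechanically: an item whose mean self-rating sits far from the scale midpoint is likely evaluatively loaded. A minimal sketch; the cutoff value is an arbitrary illustration, not the authors' criterion:

```python
def flag_evaluative_items(item_means, scale_min=1, scale_max=5, cutoff=1.0):
    """Return indices of items whose mean rating drifts far from the midpoint.

    Item popularity (the mean self-rating) serves as a rough proxy for
    evaluative loading: strongly desirable items get endorsed well above
    the scale midpoint, undesirable ones well below it.
    """
    midpoint = (scale_min + scale_max) / 2
    return [i for i, m in enumerate(item_means) if abs(m - midpoint) > cutoff]
```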
Pub Date: 2023-01-31. DOI: 10.1027/1015-5759/a000753
C. J. Anthony, Pui‐wa Lei, S. Elliott, J. DiPerna, C. Cefai, P. Bartolo, L. Camilleri, M. O’Riordan, I. Grazzani, V. Cavioni, E. Conte, V. Ornaghi, S. Tatalović Vorkapić, M. Poulou, B. Martinsone, C. Simões, A. A. Colomeischi
Abstract. Although children use social and emotional learning (SEL) skills across the world, the expression of these skills may vary across cultures and developmental levels. Such variability complicates the process of assessing SEL competencies, with consequences for understanding differences in SEL skills and developing interventions. To address these challenges, the current study examined the measurement invariance of translated versions of a brief, multi-informant (Teacher, Parent, Student) measure of SEL skills developed in the US with data from six European countries (Croatia, Greece, Italy, Latvia, Portugal, and Romania; n = 10,602; 8,520; 6,611 for the SSIS SEL b Teacher, Parent, and Student versions, respectively). In addition to cross-country invariance testing, we conducted measurement invariance testing across ages (Primary and Secondary students) for the Teacher and Student forms of the measure. Results revealed a high degree of measurement invariance across countries (Scalar for the Teacher form and Partial Scalar for the Parent and Student forms) and developmental levels (Scalar for the Teacher form and Partial Scalar for the Student form), supporting the use of translated versions of the SSIS SEL b for international research across these countries and developmental levels. Implications are discussed for assessment and promoting children’s SEL competencies globally.
Title: Measurement Invariance of Children’s SEL Competencies
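The configural-metric-scalar sequence reported above is usually adjudicated with fit-difference heuristics. One common rule (Cheung and Rensvold's ΔCFI criterion) treats a CFI drop larger than .01 between adjacent models as a meaningful loss of fit; a sketch with illustrative CFI values rather than the study's:

```python
def invariance_level(cfi_configural, cfi_metric, cfi_scalar, max_drop=0.01):
    """Highest invariance level supported under the delta-CFI heuristic.

    A CFI drop larger than `max_drop` between adjacent nested models
    (configural -> metric -> scalar) is taken as meaningful. This is one
    widely used heuristic, not the study's full decision procedure.
    """
    if cfi_configural - cfi_metric > max_drop:
        return "configural"
    if cfi_metric - cfi_scalar > max_drop:
        return "metric"
    return "scalar"
```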
Pub Date: 2023-01-31. DOI: 10.1027/1015-5759/a000751
Franz L. Classe, R. Steyer
Abstract. A probit multistate Item Response Theory (IRT) model for ordinal response variables is introduced. It comprises a reference latent state variable for each occasion of measurement and a latent item effect variable for each item except for one reference item. The latent item effect variable is defined as the difference between the latent state variable pertaining to the non-reference item and the latent state variable pertaining to the reference item. These item effects are assumed to be identical across all occasions of measurement. The new model is applied to a real data example. Including the item effect variables improves model fit considerably. Hence, the items are not strictly unidimensional within each occasion of measurement.
Title: A Probit Multistate IRT Model With Latent Item Effect Variables for Graded Responses
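The structure described in the abstract can be written out in notation of our own choosing (the authors' symbols may differ). With latent state variable $\theta_t$ at occasion $t$ and item $i = 1$ as the reference item:

```latex
% Reference item measures the latent state directly:
\tau_{1t} = \theta_t
% Every other item carries a latent item effect variable \delta_i,
% defined as a difference of latent state variables and assumed
% constant over occasions t:
\tau_{it} = \theta_t + \delta_i, \qquad
\delta_i := \tau_{it} - \tau_{1t}, \quad i \neq 1
% Probit link for ordered response categories c, with item-category
% thresholds \kappa_{ic} and standard normal CDF \Phi:
P(Y_{it} \geq c) = \Phi(\tau_{it} - \kappa_{ic})
```

On this reading, a nonzero $\delta_i$ is exactly the departure from strict unidimensionality within an occasion that the abstract's final sentence refers to.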
Pub Date: 2023-01-31. DOI: 10.1027/1015-5759/a000754
Bowen Xiao, Xiaolong Xie, Wanfen Chen, D. Law, Hezron Z. Onditi, Junsheng Liu, J. Shapka
Abstract. The current study aimed to test for measurement invariance of the Resistance to Peer Influence scale across Chinese, Canadian, and Tanzanian samples. Participants included N = 3,771 students from four public schools in China (N = 2,073, Mage = 16.36 years, SD = 1.14 years; 925 boys), sixteen public schools in Canada (N = 642, Mage = 12.13 years, SD = 0.78 years; 321 boys), and four public schools in Tanzania (N = 1,056, Mage = 15.87 years, SD = 2.02 years; 558 boys). Students provided self-reports of resistance to peer influence. The results from multigroup confirmatory factor analysis and the alignment optimization method demonstrated that configural, metric, and partial scalar invariance of resistance to peer influence held across gender and all three countries. Chinese boys had the highest factor mean levels and Canadian boys the lowest. The findings help us understand resistance to peer influence across cultures and genders.
Title: Measurement Invariance of the Resistance to Peer Influence Scale Across Culture and Gender