This systematic review examines the use of learning analytics (LA) in formative assessment (FA). LA is a powerful tool that can support FA by providing real-time feedback to students and teachers. The review analyzes studies indexed in the Web of Science and Scopus databases between 2011 and 2022, providing an overview of the current state of published research on the use of LA for FA in diverse learning environments and through different delivery modes. The review also explores the significant potential of LA for FA practices in digital learning. A total of 63 studies met all selection criteria and were fully reviewed through multiple analyses, including selected bibliometrics, a categorical meta-trends analysis, and inductive content analysis. The results indicate that the number of studies on LA in FA has surged over the past decade. They also map the current state of research on LA in FA across a range of disciplines, journals, research methods, learning environments, and delivery modes. This review can help inform the implementation of LA in educational contexts to support effective FA practices, while also highlighting the need for further research.
Ke ZHANG, Ramazan YILMAZ, Ahmet Berk USTUN & Fatma Gizem KARAOĞLAN YILMAZ. (2023-10-08). Learning analytics in formative assessment: A systematic literature review. Journal of Measurement and Evaluation in Education and Psychology-EPOD. https://doi.org/10.21031/epod.1272054
This study aimed to compare differential item functioning (DIF) and differential step functioning (DSF) detection methods in polytomously scored items under various conditions. In this context, the study examined Kazakhstan, Turkey, and USA data obtained from items on the frequency of using digital devices at school in the PISA 2018 students' "ICT Familiarity Questionnaire". The Mantel test, Liu-Agresti statistic, Cox's β, and poly-SIBTEST were used for polytomous DIF analysis, while the Adjacent Category Logistic Regression Model and the Cumulative Category Log Odds Ratio methods were used for DSF analysis. The study employed a correlational survey model, with conditions defined by differential category combining, focal group sample size, focal-to-reference group sample ratio, and DIF/DSF detection method. SAS and R were used to create the conditions; the SIBTEST program was used for the poly-SIBTEST analyses, and the DIFAS program was used for the other methods. Analyses demonstrated that the number of items/steps exhibiting a high level of DIF/DSF was higher in the small samples according to the polytomous DIF methods and in the large samples according to the DSF methods. Across steps, DIF values were lower for items containing DSF with opposite signs; therefore, omitting DSF analysis for an item showing no DIF may yield erroneous results. Although the differential category combining conditions created within the scope of the research did not have a systematic effect on the results, it is suggested that future studies examine this situation, given that the frequency with which the combined categories were endorsed differentiated the results.
Yasemin KUZU & Selahattin GELBAL. (2023-09-30). Investigation of differential item and step functioning procedures in polytomously scored items. Journal of Measurement and Evaluation in Education and Psychology-EPOD. https://doi.org/10.21031/epod.1221823
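For orientation, the Mantel test named above generalizes the Mantel-Haenszel procedure to ordered item scores. In its standard form (given here as general background, not as the authors' notation), the focal group's observed sum of item scores in each matching stratum $k$ is compared with its conditional expectation under the hypergeometric distribution:

$$F_k=\sum_j y_j\, n_{Fjk}, \qquad \chi^2_{\text{Mantel}}=\frac{\Bigl(\sum_k F_k-\sum_k E(F_k)\Bigr)^{2}}{\sum_k \operatorname{Var}(F_k)},$$

where $y_j$ are the item's score levels and $n_{Fjk}$ is the number of focal-group examinees at score level $j$ in stratum $k$; under the null hypothesis of no DIF, the statistic is approximately $\chi^2$-distributed with one degree of freedom.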
The aim of this study is to examine how ability estimates change under different conditions in computerized multistage tests composed of polytomously scored items. The research is a simulation study examining 108 (3x3x6x2) conditions: three numbers of response categories (3, 4, and 5), three test lengths (10, 20, and 30 items), six panel designs (1-2, 1-2-2, 1-3, 1-3-3, 1-4, and 1-4-4), and two routing methods (Maximum Fisher Information (MFI) and random). Simulations and analyses were carried out with the mstR package in R, using a pool of 200 items, 1000 examinees, and 100 replications. Mean absolute bias, RMSE, and correlation values were calculated as outcome measures. It was found that as the number of categories and the test length increase, mean absolute bias and RMSE decrease while correlations increase. Regarding routing methods, MFI and random routing show similar tendencies, but MFI gives better results. Results were similar across panel designs.
Hasibe YAHSİ SARI & Hülya KELECİOĞLU. (2023-09-30). Ability Estimation with Polytomous Items in Computerized Multistage Tests. Journal of Measurement and Evaluation in Education and Psychology-EPOD. https://doi.org/10.21031/epod.1056079
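The outcome measures reported above are standard in recovery simulations. As a minimal sketch in R (the environment the study itself used), assuming a matrix of ability estimates with one column per replication, here is how they could be computed; the object names and data-generating step are illustrative, not the study's code:

```r
# Minimal sketch: recovery measures for an ability-estimation simulation.
# theta_true: true abilities; theta_hat: estimates, one column per replication.
set.seed(1)
n_persons <- 1000
n_reps    <- 100
theta_true <- rnorm(n_persons)                                       # generating abilities (illustrative)
theta_hat  <- replicate(n_reps, theta_true + rnorm(n_persons, 0, 0.3))  # stand-in estimates

# Mean absolute bias: average over persons of the replication-averaged error.
mean_abs_bias <- mean(abs(rowMeans(theta_hat) - theta_true))

# RMSE: root mean squared error over all persons and replications.
rmse <- sqrt(mean((theta_hat - theta_true)^2))   # recycling matches rows

# Correlation: per-replication correlation of true and estimated abilities, averaged.
corr <- mean(apply(theta_hat, 2, cor, y = theta_true))

c(mean_abs_bias = mean_abs_bias, RMSE = rmse, correlation = corr)
```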
The present study aimed to examine the development process of rubrics in theses indexed in the national thesis database and to identify any misconceptions presented in these rubrics. A qualitative research approach utilizing document analysis was employed. The sample of theses was selected based on literature review and criteria established by expert opinions, resulting in a total of 395 theses being included in the study using criterion sampling. Data were collected through a "thesis review form" developed by the researchers. Descriptive analysis was employed for data analysis. Findings indicated that approximately 27% of the 395 theses contained misconceptions, with a disproportionate percentage of these misconceptions being found in master's theses. Regarding the field of the thesis, the highest rate of misconceptions was observed in health, social sciences, special education, and fine arts, while the lowest rate was found in education and linguistics. Additionally, theses with misconceptions tended to possess a lower degree of validity and reliability evidence compared to those without misconceptions. This difference was found to be statistically significant for both validity evidence and reliability evidence. In theses without misconceptions, the most frequently presented validity evidence was expert opinion, while the reliability evidence was found to be the percentage of agreement. The findings were discussed in relation to the existing literature, and recommendations were proposed.
Fuat ELKONCA, Görkem CEYHAN & Mehmet ŞATA. (2023-09-30). Rubrics in Terms of Development Processes and Misconceptions. Journal of Measurement and Evaluation in Education and Psychology-EPOD. https://doi.org/10.21031/epod.1251470
The aim of this study is to analyze peer and self-assessments of higher education students' oral presentation skills with the many-facet Rasch measurement model and to determine students' opinions on peer and self-assessment. The study used the convergent parallel design, one of the mixed-methods research approaches. The study group consisted of 11 university students studying at a state university in the 2022-2023 academic year. The FACETS program was used to analyze the data. The three facets identified in the study were the assessee (11 students), the assessor (11 students), and the items (16 items); thus, 11 participants scored peer and self-assessments on a 16-item assessment form. In addition, students' opinions on peer and self-assessment were obtained through three open-ended interview questions prepared by the researcher. According to the results, there were statistically significant differences between students in oral presentation skill, between assessors in scoring severity/leniency, and between the criteria (items) in difficulty. The participant opinions obtained from each interview question were analyzed through themes and sub-themes formed around general thoughts on peer and self-assessment, experiences, and whether participants considered themselves reliable raters. In terms of practice, it is suggested that students receive detailed, clarifying information before peer and/or self-assessment in the classroom and that quick feedback be given to those who have not carried out the assessment appropriately. The reasons for the biases identified in the peer and self-assessments of the current study can be investigated in future research.
Seda DEMİR. (2023-09-30). Analysis of Peer and Self-Assessments Using the Many-facet Rasch Measurement Model and Student Opinions. Journal of Measurement and Evaluation in Education and Psychology-EPOD. https://doi.org/10.21031/epod.1344196
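As background to the three-facet design above (assessees, assessors, items), the many-facet Rasch measurement model is conventionally written, in Linacre's formulation, as

$$\ln\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right)=\theta_n-\delta_i-\alpha_j-\tau_k,$$

where $P_{nijk}$ is the probability that assessee $n$ receives category $k$ from assessor $j$ on item $i$, $\theta_n$ is the assessee's ability, $\delta_i$ the item's difficulty, $\alpha_j$ the assessor's severity, and $\tau_k$ the difficulty of the step from category $k-1$ to $k$. The significant differences reported above correspond to spread in the $\theta$, $\alpha$, and $\delta$ parameters, respectively.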
The primary purpose of this study was to provide researchers with a theoretical framework for studies on power analysis conducted in the fields of education, psychology, and statistics. To that end, the bibliometric characteristics of publications related to power analysis in the Web of Science database were analyzed using the Biblioshiny interface in R. The study identified influential studies on power analysis in education, psychology, and statistics. It also determined which concepts were associated with power analysis over the years and the authors and countries that contributed to the advancement of research on this concept. The research was conducted on 515 studies that met specific inclusion criteria. The studies, published between 1970 and 2023, were obtained from 183 sources and involved a total of 1246 authors. There were 98 single-authored studies, and the number of co-authors per study was 2.88 on average. According to Bradford's Law, Behavior Research Methods, Psychological Methods, and Multivariate Behavioral Research were the most productive journals concerning power analysis, taking up a larger proportion within the core sources than other journals. These journals were among the top three in terms of number of publications, h-index, total citations, and publication rankings. They were followed by Structural Equation Modeling-A Multidisciplinary Journal, Frontiers in Psychology, and Educational and Psychological Measurement. An examination of studies on power analysis in education, psychology, and statistics according to Lotka's Law indicated that the relevant literature is insufficient and needs further development.
Gül GÜLER. (2023-09-30). A Bibliometric Analysis on Power Analysis Studies. Journal of Measurement and Evaluation in Education and Psychology-EPOD. https://doi.org/10.21031/epod.1343984
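For readers unfamiliar with the laws invoked above: Lotka's Law models scientific productivity, stating that the number of authors who publish $x$ papers decays as an inverse power of $x$. In its classical form,

$$f(x)=\frac{C}{x^{n}},\qquad n\approx 2,$$

so roughly a quarter as many authors publish two papers as publish one. A literature whose author productivity falls off much faster than this is typically read, as here, as young or underdeveloped.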
This research examines whether the affective scales administered with the TIMSS 2019 mathematics assessment in Turkey exhibit measurement invariance across gender. The sample consists of 4048 8th-grade students who participated in TIMSS 2019; the data were downloaded from the international TIMSS website. The data collection tools are the "Sense of School Belonging", "Students Confident in Mathematics", "Students Like Learning Mathematics", and "Students Value Mathematics" scales. Exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) were performed as validity analyses prior to examining measurement invariance, and Cronbach's alpha internal consistency coefficients were calculated for reliability. Of the four scales, only the "Students Confident in Mathematics" scale could not be confirmed in CFA; it was therefore excluded, and the other three scales were examined for measurement invariance. Measurement invariance was tested with multiple-group confirmatory factor analysis (MG-CFA), one of the structural equation modeling (SEM) techniques. The analyses supported the strict invariance model for the "Students Like Learning Mathematics" and "Students Value Mathematics" scales and the strong (scalar) invariance model for the "Sense of School Belonging" scale. It was concluded that the three scales tested with MG-CFA show no gender bias and that their mean scores are comparable across gender. In this context, the "Sense of School Belonging", "Students Like Learning Mathematics", and "Students Value Mathematics" scales can be considered valid for examining differences by gender.
Mehmet ATILGAN & Kaan Zulfikar DENİZ. (2023-09-30). Investigation of the Measurement Invariance of Affective Characteristics Related to TIMSS 2019 Mathematics Achievement by Gender. Journal of Measurement and Evaluation in Education and Psychology-EPOD. https://doi.org/10.21031/epod.1221365
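The invariance sequence tested above (configural, then metric, then strong/scalar, then strict) is routinely fit as a series of nested multi-group CFA models. A minimal sketch, assuming the lavaan R package and hypothetical item and data-frame names (the abstract does not name the SEM software used):

```r
library(lavaan)

# Hypothetical one-factor model for one of the affective scales;
# timss: data frame with the items and a gender column (also hypothetical).
model <- 'belonging =~ item1 + item2 + item3 + item4'

configural <- cfa(model, data = timss, group = "gender")
metric     <- cfa(model, data = timss, group = "gender",
                  group.equal = "loadings")
scalar     <- cfa(model, data = timss, group = "gender",
                  group.equal = c("loadings", "intercepts"))
strict     <- cfa(model, data = timss, group = "gender",
                  group.equal = c("loadings", "intercepts", "residuals"))

# Likelihood-ratio tests between successive models; a non-significant
# difference supports the stricter invariance level.
lavTestLRT(configural, metric, scalar, strict)
```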
This study aimed to investigate the impact of different effect-size weighting methods on the outcomes of meta-analyses examining the effects of the 5E teaching method on academic achievement in science education. Two weighting methods were explored: one based on the inverse of the sampling error variance and the other based on the reliability of the measures in the primary studies. The study also assessed the influence of including gray literature on the meta-analysis results, considering factors such as high heterogeneity and publication bias. The research followed a basic research design and drew data from 112 studies, encompassing a total of 149 effect sizes. An exhaustive search of databases and archives, including Google Scholar, Dergipark, HEI Thesis Center, ProQuest, Science Direct, ERIC, Taylor & Francis, EBSCOhost, Web of Science, and five journals, was conducted to gather these studies. Analyses were performed with the CMA v2 software under the random-effects model. The findings demonstrated divergent outcomes between the two weighting methods: weighting by reliability coefficient yielded higher overall effect sizes and standard errors than weighting by inverse variance. The inclusion of gray literature did not significantly affect the results under either weighting method.
Yıldız YILDIRIM & Şeref TAN. (2023-09-30). A New Weighting Method in Meta-Analysis: The Weighting with Reliability Coefficient. Journal of Measurement and Evaluation in Education and Psychology-EPOD. https://doi.org/10.21031/epod.1351485
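To make the contrast above concrete: under a random-effects model, the conventional weight for study $i$ is the inverse of its total variance, while the alternative examined here weights by the reliability of the measure. A plausible rendering (the abstract does not state the authors' exact formula, so the reliability weight below is an assumption) is

$$w_i^{\mathrm{IV}}=\frac{1}{v_i+\tau^2},\qquad w_i^{\mathrm{rel}}\propto r_{xx,i},\qquad \bar d=\frac{\sum_i w_i d_i}{\sum_i w_i},$$

where $d_i$ is the effect size of study $i$, $v_i$ its sampling variance, $\tau^2$ the between-study variance, and $r_{xx,i}$ the reliability coefficient reported in the primary study. Because a reliability-based weight ignores sample size, it would be consistent with the reported pattern of larger standard errors under reliability weighting.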
A case study with 26 English language teacher candidates was designed to develop an evidence-based measurement curriculum in Turkey, examine the teacher candidates' experiences of the newly developed course, and take remedial action to update the syllabus where needed. Data were collected from multiple sources: a pre-course survey, a weekly discussion board on Edmodo, and a post-course survey. Survey data obtained from rating-scale items were analyzed using descriptive statistics and data visualization packages in R; open-ended survey data and discussion board data were content-analyzed using MAXQDA software. The results revealed that, at the beginning of the course, students had limited awareness of the concept of assessment for learning and of digital tools that could be used for assessment-for-learning purposes. Course content, in-class activities, and projects helped them develop hands-on skills in designing sound language assessments and raised their awareness of the importance of computer-based language assessment.
Burcu ŞENTÜRK, Beyza AKSU DÜNYA & Mehmet Can DEMİR. (2023-09-29). Training 21st Century English Language Teachers in Turkish Context: Development of a Technology-Enhanced Measurement Curriculum. Journal of Measurement and Evaluation in Education and Psychology-EPOD. https://doi.org/10.21031/epod.1261763
The aim of this research is to examine the effect on 10th-grade (secondary education) students' achievement of two applications used within an online (web-based) formative assessment of the second-degree (quadratic) equations unit of the mathematics course: the system providing resources for learning deficiencies, and the teacher providing feedback on learning deficiencies. The research used a quasi-experimental design with pre-test and post-test achievement tests and monitoring tests. It was conducted in the 2022-2023 academic year with a total of 302 students selected from 4 schools and 12 classes in the Göksun and Ağrın districts using stratified random cluster sampling. The data were analyzed with one-way analysis of variance (ANOVA) and analysis of covariance (ANCOVA). According to the results, there was no statistically significant difference between the groups' pre-test means, but a statistically significant difference emerged in the post-test. In the study, the Experiment-2 group received system-provided resources for learning deficiencies together with detailed teacher feedback based on Cognitive Diagnostic Modeling (CDM); the Experiment-1 group received system-provided resources only; and the Control group received regular instruction. Both experimental applications were found to be effective relative to regular instruction. Moreover, over the course of the experimental procedure, Experiment-2 showed greater improvement from pre-test to post-test means than Experiment-1, and Experiment-1 greater improvement than the Control group.
Bayram ÇETİN & Şeref AKPINAR. (2023-09-10). Investigation of the Effect of Online (Web-Based) Formative Assessment Applications on Students' Academic Achievement. Journal of Measurement and Evaluation in Education and Psychology-EPOD. https://doi.org/10.21031/epod.1320182
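The ANOVA/ANCOVA pair reported above has a standard form. A minimal R sketch, assuming a data frame with hypothetical column names (illustrative, not the study's code):

```r
# df: one row per student, with columns group (Control, Exp1, Exp2),
# pretest and posttest achievement scores; all names are hypothetical.
df$group <- factor(df$group)

# One-way ANOVA: were the groups equivalent at pre-test?
summary(aov(pretest ~ group, data = df))

# ANCOVA: post-test differences among groups, adjusting for pre-test.
ancova <- aov(posttest ~ pretest + group, data = df)
summary(ancova)

# Adjusted (covariate-corrected) group means, e.g. via the emmeans package:
# emmeans::emmeans(ancova, "group")
```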