This systematic review examines the use of learning analytics (LA) in formative assessment (FA). LA is a powerful tool that can support FA by providing real-time feedback to students and teachers. The review analyzes studies indexed in the Web of Science and Scopus databases between 2011 and 2022, providing an overview of the current state of published research on the use of LA for FA in diverse learning environments and through different delivery modes. The review also explores the significant potential of LA for FA practices in digital learning. A total of 63 studies met all selection criteria and were fully reviewed through multiple analyses, including selected bibliometrics, a categorical meta-trends analysis, and inductive content analysis. The results indicate that the number of studies on LA in FA has surged over the past decade. They also map the current state of research on LA in FA across a range of disciplines, journals, research methods, learning environments, and delivery modes. This review can help inform the implementation of LA in educational contexts to support effective FA practices, while also highlighting the need for further research.
Ke ZHANG, Ramazan YILMAZ, Ahmet Berk USTUN & Fatma Gizem KARAOĞLAN YILMAZ. (2023-10-08). Learning analytics in formative assessment: A systematic literature review. Journal of Measurement and Evaluation in Education and Psychology-EPOD. https://doi.org/10.21031/epod.1272054
This study aimed to compare differential item functioning (DIF) and differential step functioning (DSF) detection methods in polytomously scored items under various conditions. In this context, the study examined Kazakhstan, Turkey, and USA data obtained from items on the frequency of using digital devices at school in the PISA 2018 students' "ICT Familiarity Questionnaire". The Mantel test, Liu-Agresti statistic, Cox's β, and poly-SIBTEST were used for polytomous DIF analysis, while the Adjacent Category Logistic Regression Model and the Cumulative Category Log Odds Ratio methods were used for DSF analysis. The study employed a correlational survey model, with conditions defined by differential category combining, focal group sample size, focal-to-reference group sample ratio, and DIF/DSF detection method. SAS and R were used to create the conditions; the SIBTEST program was used for the poly-SIBTEST analyses, and the DIFAS program was used for the other methods. Analyses demonstrated that the number of items/steps exhibiting a high level of DIF/DSF was higher in the small samples according to the polytomous DIF methods and in the large samples according to the DSF methods. Across steps, DIF values were lower for items containing DSF with opposite signs; therefore, omitting DSF analysis for an item showing no DIF may yield erroneous results. Although the differential category combining conditions created within the scope of the research did not have a systematic effect on the results, it is suggested that future studies examine this situation, given that the frequency with which the combined categories were endorsed differentiated the results.
Yasemin KUZU & Selahattin GELBAL. (2023-09-30). Investigation of differential item and step functioning procedures in polytomously scored items. Journal of Measurement and Evaluation in Education and Psychology-EPOD. https://doi.org/10.21031/epod.1221823
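For orientation, the Mantel test named above generalizes the Mantel-Haenszel procedure to ordered item scores. In its standard form (given here as general background, not as the authors' notation), the focal group's observed sum of item scores in each matching stratum $k$ is compared with its conditional expectation under the hypergeometric distribution:

$$F_k=\sum_j y_j\, n_{Fjk}, \qquad \chi^2_{\text{Mantel}}=\frac{\Bigl(\sum_k F_k-\sum_k E(F_k)\Bigr)^{2}}{\sum_k \operatorname{Var}(F_k)},$$

where $y_j$ are the item's score levels and $n_{Fjk}$ is the number of focal-group examinees at score level $j$ in stratum $k$; under the null hypothesis of no DIF, the statistic is approximately $\chi^2$-distributed with one degree of freedom.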
The aim of this study is to examine how ability estimates change under different conditions in computerized multistage tests composed of polytomously scored items. The research is a simulation study examining 108 (3x3x6x2) conditions: three numbers of response categories (3, 4, and 5), three test lengths (10, 20, and 30 items), six panel designs (1-2, 1-2-2, 1-3, 1-3-3, 1-4, and 1-4-4), and two routing methods (Maximum Fisher Information (MFI) and random). Simulations and analyses were carried out with the mstR package in R, using a pool of 200 items, 1000 examinees, and 100 replications. Mean absolute bias, RMSE, and correlation values were calculated as outcome measures. It was found that as the number of categories and the test length increase, mean absolute bias and RMSE decrease while correlations increase. Regarding routing methods, MFI and random routing show similar tendencies, but MFI gives better results. Results were similar across panel designs.
Hasibe YAHSİ SARI & Hülya KELECİOĞLU. (2023-09-30). Ability Estimation with Polytomous Items in Computerized Multistage Tests. Journal of Measurement and Evaluation in Education and Psychology-EPOD. https://doi.org/10.21031/epod.1056079
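The outcome measures reported above are standard in recovery simulations. As a minimal sketch in R (the environment the study itself used), assuming a matrix of ability estimates with one column per replication, here is how they could be computed; the object names and data-generating step are illustrative, not the study's code:

```r
# Minimal sketch: recovery measures for an ability-estimation simulation.
# theta_true: true abilities; theta_hat: estimates, one column per replication.
set.seed(1)
n_persons <- 1000
n_reps    <- 100
theta_true <- rnorm(n_persons)                                       # generating abilities (illustrative)
theta_hat  <- replicate(n_reps, theta_true + rnorm(n_persons, 0, 0.3))  # stand-in estimates

# Mean absolute bias: average over persons of the replication-averaged error.
mean_abs_bias <- mean(abs(rowMeans(theta_hat) - theta_true))

# RMSE: root mean squared error over all persons and replications.
rmse <- sqrt(mean((theta_hat - theta_true)^2))   # recycling matches rows

# Correlation: per-replication correlation of true and estimated abilities, averaged.
corr <- mean(apply(theta_hat, 2, cor, y = theta_true))

c(mean_abs_bias = mean_abs_bias, RMSE = rmse, correlation = corr)
```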
The present study aimed to examine the development process of rubrics in theses indexed in the national thesis database and to identify any misconceptions presented in these rubrics. A qualitative research approach utilizing document analysis was employed. The sample of theses was selected based on literature review and criteria established by expert opinions, resulting in a total of 395 theses being included in the study using criterion sampling. Data were collected through a "thesis review form" developed by the researchers. Descriptive analysis was employed for data analysis. Findings indicated that approximately 27% of the 395 theses contained misconceptions, with a disproportionate percentage of these misconceptions being found in master's theses. Regarding the field of the thesis, the highest rate of misconceptions was observed in health, social sciences, special education, and fine arts, while the lowest rate was found in education and linguistics. Additionally, theses with misconceptions tended to possess a lower degree of validity and reliability evidence compared to those without misconceptions. This difference was found to be statistically significant for both validity evidence and reliability evidence. In theses without misconceptions, the most frequently presented validity evidence was expert opinion, while the reliability evidence was found to be the percentage of agreement. The findings were discussed in relation to the existing literature, and recommendations were proposed.
Fuat ELKONCA, Görkem CEYHAN & Mehmet ŞATA. (2023-09-30). Rubrics in Terms of Development Processes and Misconceptions. Journal of Measurement and Evaluation in Education and Psychology-EPOD. https://doi.org/10.21031/epod.1251470
The aim of this study is to analyze peer and self-assessments of higher education students' oral presentation skills with the many-facet Rasch measurement model and to determine students' opinions on peer and self-assessment. The study used the convergent parallel design, one of the mixed-methods research approaches. The study group consisted of 11 university students studying at a state university in the 2022-2023 academic year. The FACETS program was used to analyze the data. The three facets identified in the study were the assessee (11 students), the assessor (11 students), and the items (16 items); thus, 11 participants scored peer and self-assessments on a 16-item assessment form. In addition, students' opinions on peer and self-assessment were obtained through three open-ended interview questions prepared by the researcher. According to the results, there were statistically significant differences between students in oral presentation skill, between assessors in scoring severity/leniency, and between the criteria (items) in difficulty. The participant opinions obtained from each interview question were analyzed through themes and sub-themes formed around general thoughts on peer and self-assessment, experiences, and whether participants considered themselves reliable raters. In terms of practice, it is suggested that students receive detailed, clarifying information before peer and/or self-assessment in the classroom and that quick feedback be given to those who have not carried out the assessment appropriately. The reasons for the biases identified in the peer and self-assessments of the current study can be investigated in future research.
Seda DEMİR. (2023-09-30). Analysis of Peer and Self-Assessments Using the Many-facet Rasch Measurement Model and Student Opinions. Journal of Measurement and Evaluation in Education and Psychology-EPOD. https://doi.org/10.21031/epod.1344196
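As background to the three-facet design above (assessees, assessors, items), the many-facet Rasch measurement model is conventionally written, in Linacre's formulation, as

$$\ln\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right)=\theta_n-\delta_i-\alpha_j-\tau_k,$$

where $P_{nijk}$ is the probability that assessee $n$ receives category $k$ from assessor $j$ on item $i$, $\theta_n$ is the assessee's ability, $\delta_i$ the item's difficulty, $\alpha_j$ the assessor's severity, and $\tau_k$ the difficulty of the step from category $k-1$ to $k$. The significant differences reported above correspond to spread in the $\theta$, $\alpha$, and $\delta$ parameters, respectively.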
The primary purpose of this study was to provide researchers with a theoretical framework for studies on power analysis conducted in the fields of education, psychology, and statistics. To that end, the bibliometric characteristics of publications related to power analysis in the Web of Science database were analyzed using the Biblioshiny interface in R. The study identified influential studies on power analysis in education, psychology, and statistics. It also determined which concepts were associated with power analysis over the years and the authors and countries that contributed to the advancement of research on this concept. The research was conducted on 515 studies that met specific inclusion criteria. The studies, published between 1970 and 2023, were obtained from 183 sources and involved a total of 1246 authors. There were 98 single-authored studies, and the number of co-authors per study was 2.88 on average. According to Bradford's Law, Behavior Research Methods, Psychological Methods, and Multivariate Behavioral Research were the most productive journals concerning power analysis, taking up a larger proportion within the core sources than other journals. These journals were among the top three in terms of number of publications, h-index, total citations, and publication rankings. They were followed by Structural Equation Modeling-A Multidisciplinary Journal, Frontiers in Psychology, and Educational and Psychological Measurement. An examination of studies on power analysis in education, psychology, and statistics according to Lotka's Law indicated that the relevant literature is insufficient and needs further development.
Gül GÜLER. (2023-09-30). A Bibliometric Analysis on Power Analysis Studies. Journal of Measurement and Evaluation in Education and Psychology-EPOD. https://doi.org/10.21031/epod.1343984
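For readers unfamiliar with the laws invoked above: Lotka's Law models scientific productivity, stating that the number of authors who publish $x$ papers decays as an inverse power of $x$. In its classical form,

$$f(x)=\frac{C}{x^{n}},\qquad n\approx 2,$$

so roughly a quarter as many authors publish two papers as publish one. A literature whose author productivity falls off much faster than this is typically read, as here, as young or underdeveloped.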
This research examines whether the affective scales administered with the TIMSS 2019 mathematics assessment in Turkey exhibit measurement invariance across gender. The sample consists of 4048 8th-grade students who participated in TIMSS 2019; the data were downloaded from the international TIMSS website. The data collection tools are the "Sense of School Belonging", "Students Confident in Mathematics", "Students Like Learning Mathematics", and "Students Value Mathematics" scales. Exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) were performed as validity analyses prior to examining measurement invariance, and Cronbach's alpha internal consistency coefficients were calculated for reliability. Of the four scales, only the "Students Confident in Mathematics" scale could not be confirmed in CFA; it was therefore excluded, and the other three scales were examined for measurement invariance. Measurement invariance was tested with multiple-group confirmatory factor analysis (MG-CFA), one of the structural equation modeling (SEM) techniques. The analyses supported the strict invariance model for the "Students Like Learning Mathematics" and "Students Value Mathematics" scales and the strong (scalar) invariance model for the "Sense of School Belonging" scale. It was concluded that the three scales tested with MG-CFA show no gender bias and that their mean scores are comparable across gender. In this context, the "Sense of School Belonging", "Students Like Learning Mathematics", and "Students Value Mathematics" scales can be considered valid for examining differences by gender.
Mehmet ATILGAN & Kaan Zulfikar DENİZ. (2023-09-30). Investigation of the Measurement Invariance of Affective Characteristics Related to TIMSS 2019 Mathematics Achievement by Gender. Journal of Measurement and Evaluation in Education and Psychology-EPOD. https://doi.org/10.21031/epod.1221365
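The invariance sequence tested above (configural, then metric, then strong/scalar, then strict) is routinely fit as a series of nested multi-group CFA models. A minimal sketch, assuming the lavaan R package and hypothetical item and data-frame names (the abstract does not name the SEM software used):

```r
library(lavaan)

# Hypothetical one-factor model for one of the affective scales;
# timss: data frame with the items and a gender column (also hypothetical).
model <- 'belonging =~ item1 + item2 + item3 + item4'

configural <- cfa(model, data = timss, group = "gender")
metric     <- cfa(model, data = timss, group = "gender",
                  group.equal = "loadings")
scalar     <- cfa(model, data = timss, group = "gender",
                  group.equal = c("loadings", "intercepts"))
strict     <- cfa(model, data = timss, group = "gender",
                  group.equal = c("loadings", "intercepts", "residuals"))

# Likelihood-ratio tests between successive models; a non-significant
# difference supports the stricter invariance level.
lavTestLRT(configural, metric, scalar, strict)
```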
This study aimed to investigate the impact of different effect-size weighting methods on the outcomes of meta-analyses examining the effects of the 5E teaching method on academic achievement in science education. Two weighting methods were explored: one based on the inverse of the sampling error variance and the other based on the reliability of the measures in the primary studies. The study also assessed the influence of including gray literature on the meta-analysis results, considering factors such as high heterogeneity and publication bias. The research followed a basic research design and drew data from 112 studies, encompassing a total of 149 effect sizes. An exhaustive search of databases and archives, including Google Scholar, Dergipark, HEI Thesis Center, ProQuest, Science Direct, ERIC, Taylor & Francis, EBSCOhost, Web of Science, and five journals, was conducted to gather these studies. Analyses were performed with the CMA v2 software under the random-effects model. The findings demonstrated divergent outcomes between the two weighting methods: weighting by reliability coefficient yielded higher overall effect sizes and standard errors than weighting by inverse variance. The inclusion of gray literature did not significantly affect the results under either weighting method.
Yıldız YILDIRIM & Şeref TAN. (2023-09-30). A New Weighting Method in Meta-Analysis: The Weighting with Reliability Coefficient. Journal of Measurement and Evaluation in Education and Psychology-EPOD. https://doi.org/10.21031/epod.1351485
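To make the contrast above concrete: under a random-effects model, the conventional weight for study $i$ is the inverse of its total variance, while the alternative examined here weights by the reliability of the measure. A plausible rendering (the abstract does not state the authors' exact formula, so the reliability weight below is an assumption) is

$$w_i^{\mathrm{IV}}=\frac{1}{v_i+\tau^2},\qquad w_i^{\mathrm{rel}}\propto r_{xx,i},\qquad \bar d=\frac{\sum_i w_i d_i}{\sum_i w_i},$$

where $d_i$ is the effect size of study $i$, $v_i$ its sampling variance, $\tau^2$ the between-study variance, and $r_{xx,i}$ the reliability coefficient reported in the primary study. Because a reliability-based weight ignores sample size, it would be consistent with the reported pattern of larger standard errors under reliability weighting.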
A case study with 26 English language teacher candidates was designed to develop an evidence-based measurement curriculum in Turkey, examine the teacher candidates' experiences of the newly developed course, and take remedial action to update the syllabus where needed. Data were collected from multiple sources: a pre-course survey, a weekly discussion board on Edmodo, and a post-course survey. Survey data obtained from rating-scale items were analyzed using descriptive statistics and data visualization packages in R; open-ended survey data and discussion board data were content-analyzed using MAXQDA software. The results revealed that, at the beginning of the course, students had limited awareness of the concept of assessment for learning and of digital tools that could be used for assessment-for-learning purposes. Course content, in-class activities, and projects helped them develop hands-on skills in designing sound language assessments and raised their awareness of the importance of computer-based language assessment.
Burcu ŞENTÜRK, Beyza AKSU DÜNYA & Mehmet Can DEMİR. (2023-09-29). Training 21st Century English Language Teachers in Turkish Context: Development of a Technology-Enhanced Measurement Curriculum. Journal of Measurement and Evaluation in Education and Psychology-EPOD. https://doi.org/10.21031/epod.1261763
The aim of this research is to examine the effect on 10th-grade (secondary education) students' achievement of two applications used within an online (web-based) formative assessment of the second-degree (quadratic) equations unit of the mathematics course: the system providing resources for learning deficiencies, and the teacher providing feedback on learning deficiencies. The research used a quasi-experimental design with pre-test and post-test achievement tests and monitoring tests. It was conducted in the 2022-2023 academic year with a total of 302 students selected from 4 schools and 12 classes in the Göksun and Ağrın districts using stratified random cluster sampling. The data were analyzed with one-way analysis of variance (ANOVA) and analysis of covariance (ANCOVA). According to the results, there was no statistically significant difference between the groups' pre-test means, but a statistically significant difference emerged in the post-test. In the study, the Experiment-2 group received system-provided resources for learning deficiencies together with detailed teacher feedback based on Cognitive Diagnostic Modeling (CDM); the Experiment-1 group received system-provided resources only; and the Control group received regular instruction. Both experimental applications were found to be effective relative to regular instruction. Moreover, over the course of the experimental procedure, Experiment-2 showed greater improvement from pre-test to post-test means than Experiment-1, and Experiment-1 greater improvement than the Control group.
Bayram ÇETİN & Şeref AKPINAR. (2023-09-10). Investigation of the Effect of Online (Web-Based) Formative Assessment Applications on Students' Academic Achievement. Journal of Measurement and Evaluation in Education and Psychology-EPOD. https://doi.org/10.21031/epod.1320182
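The ANOVA/ANCOVA pair reported above has a standard form. A minimal R sketch, assuming a data frame with hypothetical column names (illustrative, not the study's code):

```r
# df: one row per student, with columns group (Control, Exp1, Exp2),
# pretest and posttest achievement scores; all names are hypothetical.
df$group <- factor(df$group)

# One-way ANOVA: were the groups equivalent at pre-test?
summary(aov(pretest ~ group, data = df))

# ANCOVA: post-test differences among groups, adjusting for pre-test.
ancova <- aov(posttest ~ pretest + group, data = df)
summary(ancova)

# Adjusted (covariate-corrected) group means, e.g. via the emmeans package:
# emmeans::emmeans(ancova, "group")
```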