Examining the Accuracy of a Conversation-Based Assessment in Interpreting English Learners' Written Responses
Alexis A. Lopez, Danielle Guzman-Orth, Diego Zapata-Rivera, Carolyn M. Forsyth, Christine Luce
ETS Research Report Series, 2021(1), 1–15. Published 2021-03-05. DOI: 10.1002/ets2.12315

Substantial progress has been made toward applying technology-enhanced conversation-based assessments (CBAs) to measure the English-language proficiency of English learners (ELs). CBAs are systems built around conversations between computer-animated agents and a test taker. We expanded the design and capability of prior conversation-based instructional and assessment systems and developed a CBA to measure the English language skills and mathematics knowledge of middle school ELs. The prototype CBA simulates an authentic, engaging mathematics classroom in which the test taker interacts with two virtual agents to solve math problems. Embedded feedback and supports are triggered by how the CBA interprets students' written responses. In this study, we administered the CBA to middle school ELs (N = 82) residing in the United States and examined the extent to which the system consistently interpreted the students' responses (722 responses from the 82 students). The findings help clarify the factors that affect the accuracy of the CBA system's interpretations and shed light on how to improve CBA systems that incorporate scaffolding.
EEG Correlates of Engagement During Assessment
Laura K. Halderman, Bridgid Finn, J. R. Lockwood, Nicole M. Long, Michael J. Kahana
ETS Research Report Series, 2021(1), 1–17. Published 2021-01-09. DOI: 10.1002/ets2.12312

In educational assessment, low engagement is problematic when tests are low stakes for students but have significant consequences for teachers or schools. In the current study, we sought to establish the electroencephalographic (EEG) correlates of engagement and to distinguish engagement from mental effort. Forty university students participated in a simulated GRE® General Test session while scalp EEG was recorded from 128 channels. Participants completed two verbal and two quantitative GRE test blocks for a total of 40 items each and, after each item, rated either their engagement or mental effort on a scale of 1–6. We computed power for seven frequency bands (delta, theta, alpha, beta, and low, medium, and high gamma) across six regions of interest: left-hemisphere (LH) and right-hemisphere (RH) frontal, temporal, and parietal. Preliminary results suggested that gamma power (30–150 hertz [Hz]) indexed differences between high- and low-engagement ratings. This pattern was similar but weaker for mental effort. A cumulative logit model with cross-classified random effects determined that high gamma (90–150 Hz) over the LH temporal cortex predicted engagement ratings while controlling for reaction time and accuracy. For effort ratings, however, reaction time was the sole significant predictor. These results suggest that high gamma may be a correlate of engagement during complex cognitive tasks, but not a correlate of effort. The findings are a promising step toward the goal of objectively measuring engagement during assessment tasks.
Effect of Statistically Matching Equating Samples for Common-Item Equating
Ru Lu, Sooyeon Kim
ETS Research Report Series, 2021(1), 1–14. Published 2021-01-09. DOI: 10.1002/ets2.12313

This study evaluated the impact of subgroup weighting for equating through a common-item anchor. We used data from a single test form to create two research forms for which the equating relationship was known. The results showed that equating was most accurate when the new form and reference form samples were weighted to be similar to the target population. When the target population was a combination of the two equating samples and one sample was weighted to be similar to the other, the equating was less accurate but still much more accurate than equating with unweighted samples.
The Development of a Content Assessment of Basic Electronics Knowledge
Jonathan Steinberg, Jessica Andrews-Todd, Carol M. Forsyth, J. Chamberlain, P. Horwitz, Al Koon, A. Rupp, Laura McCulla
ETS Research Report Series. Published 2020-12-01. DOI: 10.1002/ets2.12311
{"title":"Perspectives on Social and Emotional Learning in Tertiary Education","authors":"Catherine M. Millett","doi":"10.1002/ets2.12303","DOIUrl":"https://doi.org/10.1002/ets2.12303","url":null,"abstract":"","PeriodicalId":11972,"journal":{"name":"ETS Research Report Series","volume":"2020 1","pages":"1-14"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1002/ets2.12303","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48886765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Preliminary Evidence on Measurement Characteristics for the Foundational Assessment of Competencies for Teaching Performance Tasks
Geoffrey Phelps, B. Bridgeman, Fred Yan, Jonathan Steinberg, B. Weren, Jiawen Zhou
ETS Research Report Series. Published 2020-11-08. DOI: 10.1002/ets2.12310
{"title":"The State Kindergarten Entry Assessment Digital Technology Landscape","authors":"D. Ackerman","doi":"10.1002/ets2.12296","DOIUrl":"https://doi.org/10.1002/ets2.12296","url":null,"abstract":"","PeriodicalId":11972,"journal":{"name":"ETS Research Report Series","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1002/ets2.12296","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46435153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Examining How English Learners Interact With\u0000 \u0000 WINSIGHT\u0000 \u0000 ® Summative Assessment Items: An Exploratory Study","authors":"Alexis A. López, Florencia Tolentino","doi":"10.1002/ETS2.12309","DOIUrl":"https://doi.org/10.1002/ETS2.12309","url":null,"abstract":"","PeriodicalId":11972,"journal":{"name":"ETS Research Report Series","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1002/ETS2.12309","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47069945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}