This study examines whether analytic writing traits function equivalently across diverse thematic prompt categories for English language learners (ELLs). We used the ELLIPSE Corpus, which contains 6,482 essays written by ELLs in response to 44 prompts during standardized annual testing in the United States; the essays were organized into six thematic prompt categories. To estimate students’ underlying writing ability, we fit a unidimensional Item Response Theory (IRT) model, and we then conducted Differential Feature Functioning (DFF) analysis within an IRT-based stepwise ordinal logistic regression framework. The DFF analysis revealed that although Vocabulary and Grammar showed statistically detectable category-related variation, effect sizes were negligible, indicating no practical impact on score interpretation. A more focused, category-by-category DFF analysis identified minor DFF in Cohesion, Vocabulary, and Grammar for the Education, Personal Development, and Society and Social Life categories, yet these effects also remained practically negligible. Diagnostic plots further confirmed the stability of trait functioning across prompt categories, and comprehensive sensitivity analyses supported the robustness of these findings. Taken together, the results support the fairness and comparability of analytic trait-based scoring for ELL writing assessments. The study contributes to equitable writing assessment practice by offering evidence-based guidance for fair prompt design, targeted rater training, and rubric refinement.
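To make the DFF procedure concrete, the sketch below illustrates the general form of an IRT-based stepwise ordinal logistic regression analysis of the kind named in the abstract: three nested proportional-odds models (ability only; plus a prompt-category indicator; plus a category-by-ability interaction) are compared with likelihood-ratio tests. This is a minimal illustration on synthetic data, assuming statsmodels’ OrderedModel; the variable names theta, group, and score are placeholders rather than the study’s actual pipeline, and the study’s effect-size criteria are not reproduced here.

```python
import numpy as np
from scipy.stats import chi2
from statsmodels.miscmodels.ordinal_model import OrderedModel

# Illustrative synthetic data standing in for the study's variables:
#   theta -- IRT ability estimates (the matching variable)
#   group -- focal vs. reference prompt category (e.g., Education vs. rest)
#   score -- an ordinal analytic-trait score such as Vocabulary (1-5)
rng = np.random.default_rng(42)
n = 2000
theta = rng.normal(size=n)
group = rng.integers(0, 2, size=n).astype(float)
latent = theta + rng.logistic(scale=1.0, size=n)
score = np.digitize(latent, [-2.0, -0.7, 0.7, 2.0]) + 1  # ordinal 1..5

def loglik(exog):
    """Fit a proportional-odds (ordinal logistic) model; return its log-likelihood."""
    res = OrderedModel(score, exog, distr="logit").fit(method="bfgs", disp=False)
    return res.llf

# Nested models in the stepwise (Zumbo-style) framework:
#   M1: ability only (baseline)
#   M2: + group main effect          -> tests uniform DFF
#   M3: + group x ability interaction -> tests non-uniform DFF
ll1 = loglik(np.column_stack([theta]))
ll2 = loglik(np.column_stack([theta, group]))
ll3 = loglik(np.column_stack([theta, group, theta * group]))

def lr_test(ll_small, ll_big, df):
    """Likelihood-ratio chi-square test between nested models."""
    stat = 2.0 * (ll_big - ll_small)
    return stat, chi2.sf(stat, df)

for label, args in [("uniform", (ll1, ll2, 1)),
                    ("non-uniform", (ll2, ll3, 1)),
                    ("overall", (ll1, ll3, 2))]:
    stat, p = lr_test(*args)
    print(f"{label:>11} DFF: chi2 = {stat:6.2f}, p = {p:.4f}")
```

In this framework, a significant improvement from M1 to M2 signals uniform DFF and from M2 to M3 signals non-uniform DFF; practical importance is then judged by an effect-size criterion (e.g., change in pseudo-R²), which is how a pattern of statistically detectable but practically negligible DFF, as reported above, can arise.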