Authors: S. Crossley, Ran Liu, D. McNamara
Venue: Proceedings of the Seventh International Learning Analytics & Knowledge Conference
Published: 2017-03-13
DOI: 10.1145/3027385.3027399 (https://doi.org/10.1145/3027385.3027399)
Predicting math performance using natural language processing tools
A number of studies have demonstrated links between linguistic knowledge and performance in math. Studies examining these links in first-language speakers of English have traditionally relied on correlational analyses between linguistic knowledge tests and standardized math tests. For second-language (L2) speakers, the majority of studies have compared math performance between proficient and non-proficient speakers of English. In this study, we take a novel approach and examine the linguistic features of students' language while the students are engaged in collaborative problem solving within an online math tutoring system. We transcribe the students' speech and use natural language processing tools to extract linguistic information related to text cohesion, lexical sophistication, and sentiment. Our criterion variables are individuals' pretest and posttest math performance scores. In addition to examining relations between linguistic features of student language production and math scores, we also control for a number of non-linguistic factors including gender, age, grade, school, and content focus (procedural versus conceptual). Linear mixed-effects modeling indicates that non-linguistic factors are not predictive of math scores. However, linguistic features related to cohesion, affect, and lexical proficiency explained approximately 30% of the variance (R² = .303) in the math scores.
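To make the "variance explained" figure concrete, the following is a minimal sketch of how an ordinary coefficient of determination is computed from observed and model-fitted scores. It is an illustration only: the abstract does not say which R² variant was reported for the mixed-effects model, the function name `r_squared` is hypothetical, and the toy numbers are invented for the example and are not data from the study.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Ordinary coefficient of determination: 1 - SS_res / SS_tot.

    Assumption: this is the plain R^2 definition, not necessarily the
    variant reported for the mixed-effects model in the paper.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    grand_mean = mean(observed)
    # Total variation of the observed scores around their mean.
    ss_tot = sum((y - grand_mean) ** 2 for y in observed)
    # Variation left unexplained by the model's fitted values.
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    if ss_tot == 0:
        raise ValueError("observed scores have zero variance")
    return 1 - ss_res / ss_tot


# Hypothetical toy scores, NOT data from the study.
scores = [0.2, 0.4, 0.6, 0.8]
fitted = [0.3, 0.4, 0.6, 0.7]
print(round(r_squared(scores, fitted), 3))
```

Under this definition, a value of .303 would mean the fitted values leave roughly 70% of the variation in math scores unexplained.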