Automated Scoring of Clinical Expressive Language Evaluation Tasks.

Proceedings of the conference. Association for Computational Linguistics. Meeting. Pub Date: 2020-07-01. DOI: 10.18653/v1/2020.bea-1.18

Yiyi Wang, Emily Prud'hommeaux, Meysam Asgari, Jill Dolata

Pages: 177-185. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7556318/pdf/nihms-1636235.pdf


Abstract

Many clinical assessment instruments used to diagnose language impairments in children include a task in which the subject must formulate a sentence to describe an image using a specific target word. Because producing sentences in this way requires the speaker to integrate syntactic and semantic knowledge in a complex manner, responses are typically evaluated on several different dimensions of appropriateness, yielding a single composite score for each response. In this paper, we present a dataset consisting of non-clinically elicited responses for three related sentence formulation tasks, and we propose an approach for automatically evaluating their appropriateness. Using neural machine translation, we generate correct-incorrect sentence pairs to serve as synthetic data, increasing the amount and diversity of training data for our scoring model. Our scoring model uses transfer learning to facilitate automatic sentence appropriateness evaluation. We further compare custom word embeddings with pre-trained contextualized embeddings as features for our scoring model. We find that transfer learning improves scoring accuracy, particularly when using pre-trained contextualized embeddings.
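
The abstract does not give implementation details, so the following is only a toy sketch of the general idea of training on synthetic correct-incorrect pairs and then adapting to a small set of real responses. Everything in it is an assumption made for illustration: the bigram logistic-regression scorer stands in for the paper's embedding-based model, the sentence pairs are hand-written rather than generated by neural machine translation, and the names `bigram_features`, `predict`, and `train` are hypothetical.

```python
import math
from collections import Counter


def bigram_features(sentence):
    """Count word bigrams, with sentence-boundary markers (illustrative features only)."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    return Counter(zip(tokens, tokens[1:]))


def predict(weights, sentence):
    """Return the modeled probability that the sentence is appropriate."""
    feats = bigram_features(sentence)
    z = sum(weights.get(f, 0.0) * v for f, v in feats.items())
    return 1.0 / (1.0 + math.exp(-z))


def train(weights, data, epochs=100, lr=0.5):
    """Stochastic gradient descent on logistic loss, starting from `weights`."""
    weights = dict(weights)  # copy so the caller's weights are left untouched
    for _ in range(epochs):
        for sentence, label in data:
            err = predict(weights, sentence) - label
            for f, v in bigram_features(sentence).items():
                weights[f] = weights.get(f, 0.0) - lr * err * v
    return weights


# Hypothetical synthetic correct (1) / incorrect (0) pairs.
synthetic = [
    ("the girl is reading a book", 1),
    ("the girl are reading a book", 0),
    ("the boys are playing outside", 1),
    ("the boys is playing outside", 0),
]

# Hypothetical "real" responses for a target-word task (target word: because).
real = [
    ("she is smiling because she won", 1),
    ("because she are smiling", 0),
]

# Step 1: train on the larger synthetic set.
pretrained = train({}, synthetic)

# Step 2: continue training from those weights on the small real set.
finetuned = train(pretrained, real, epochs=50)

for sentence, label in synthetic + real:
    print(f"{label}  {predict(finetuned, sentence):.2f}  {sentence}")
```

Starting the second call to `train` from `pretrained` rather than from an empty dictionary is what makes this a (very simplified) form of transfer: whatever was learned from the synthetic pairs is carried over and only adjusted by the real responses.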
