Using natural conversations to classify autism with limited data: Age matters

Proceedings of the Sixth Workshop on Computational Linguistics and Clinical Psychology | Pub Date: 2019-06-01 | DOI: 10.18653/v1/W19-3006

M. Hauser, E. Sariyanidi, B. Tunç, C. Zampella, E. Brodkin, R. Schultz, J. Parish-Morris


Citations: 6

Abstract

Spoken language ability is highly heterogeneous in Autism Spectrum Disorder (ASD), which complicates efforts to identify linguistic markers for use in diagnostic classification, clinical characterization, and for research and clinical outcome measurement. Machine learning techniques that harness the power of multivariate statistics and non-linear data analysis hold promise for modeling this heterogeneity, but many models require enormous datasets, which are unavailable for most psychiatric conditions (including ASD). In lieu of such datasets, good models can still be built by leveraging domain knowledge. In this study, we compare two machine learning approaches: the first approach incorporates prior knowledge about language variation across middle childhood, adolescence, and adulthood to classify 6-minute naturalistic conversation samples from 140 age- and IQ-matched participants (81 with ASD), while the other approach treats all ages the same. We found that individual age-informed models were significantly more accurate than a single model tasked with building a common algorithm across age groups. Furthermore, predictive linguistic features differed significantly by age group, confirming the importance of considering age-related changes in language use when classifying ASD. Our results suggest that limitations imposed by heterogeneity inherent to ASD and from developmental change with age can be (at least partially) overcome using domain knowledge, such as understanding spoken language development from childhood through adulthood.
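The comparison described in the abstract (one model per age group versus a single model across all ages) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the single feature and its age-dependent shifts are hypothetical, and a nearest-centroid classifier with leave-one-out evaluation stands in for whatever features, classifiers, and validation scheme the authors actually used, which the abstract does not specify.

```python
import random
from statistics import mean

# Age bands named in the abstract; group sizes and effects below are toy values.
AGE_GROUPS = ("middle_childhood", "adolescence", "adulthood")


def make_synthetic(n_per_cell=30, seed=0):
    """Build toy rows (age_group, feature, label); label 1 = ASD, 0 = control.

    Assumption: the feature shifts in a different direction for adolescents,
    mimicking a linguistic marker whose meaning changes with age.
    """
    rng = random.Random(seed)
    shift = {"middle_childhood": 3.0, "adolescence": -3.0, "adulthood": 3.0}
    rows = []
    for group in AGE_GROUPS:
        for label in (0, 1):
            for _ in range(n_per_cell):
                rows.append((group, rng.gauss(shift[group] * label, 1.0), label))
    return rows


def fit_centroids(train):
    """Nearest-centroid 'model': the mean feature value of each class."""
    return {lab: mean(x for _, x, y in train if y == lab) for lab in (0, 1)}


def predict(centroids, x):
    """Assign the class whose centroid is closest to x."""
    return min(centroids, key=lambda lab: abs(x - centroids[lab]))


def loo_accuracy(rows, by_age):
    """Leave-one-out accuracy.

    by_age=True trains only on samples from the held-out sample's age group
    (age-informed models); by_age=False trains on all remaining samples
    (single pooled model).
    """
    correct = 0
    for i, (group, x, label) in enumerate(rows):
        train = [
            r for j, r in enumerate(rows)
            if j != i and (not by_age or r[0] == group)
        ]
        correct += predict(fit_centroids(train), x) == label
    return correct / len(rows)


if __name__ == "__main__":
    data = make_synthetic()
    print("pooled model accuracy:", loo_accuracy(data, by_age=False))
    print("age-informed accuracy:", loo_accuracy(data, by_age=True))
```

Because the toy feature is constructed to change direction across age groups, the pooled model is at a disadvantage by design. The sketch shows the mechanics of the comparison only and says nothing about the effect sizes reported in the paper.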



