Using natural conversations to classify autism with limited data: Age matters

M. Hauser, E. Sariyanidi, B. Tunç, C. Zampella, E. Brodkin, R. Schultz, J. Parish-Morris
Published in: Proceedings of the Sixth Workshop on Computational Linguistics and Clinical Psychology, 2019-06-01
DOI: 10.18653/v1/W19-3006 (https://doi.org/10.18653/v1/W19-3006)
Citations: 6

Abstract

Spoken language ability is highly heterogeneous in Autism Spectrum Disorder (ASD), which complicates efforts to identify linguistic markers for use in diagnostic classification, clinical characterization, and research and clinical outcome measurement. Machine learning techniques that harness the power of multivariate statistics and non-linear data analysis hold promise for modeling this heterogeneity, but many models require enormous datasets, which are unavailable for most psychiatric conditions (including ASD). In lieu of such datasets, good models can still be built by leveraging domain knowledge. In this study, we compare two machine learning approaches: the first incorporates prior knowledge about language variation across middle childhood, adolescence, and adulthood to classify 6-minute naturalistic conversation samples from 140 age- and IQ-matched participants (81 with ASD), while the second treats all ages the same. We found that individual age-informed models were significantly more accurate than a single model tasked with building a common algorithm across age groups. Furthermore, predictive linguistic features differed significantly by age group, confirming the importance of considering age-related changes in language use when classifying ASD. Our results suggest that the limitations imposed by the heterogeneity inherent to ASD and by developmental change with age can be (at least partially) overcome using domain knowledge, such as an understanding of spoken language development from childhood through adulthood.
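The contrast between age-informed models and a single pooled model can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic toy values (not from the study), the single feature stands in for some hypothetical linguistic measure, the group names and the helper functions `fit_centroids`, `predict`, and `loo_accuracy` are invented here, and a nearest-centroid classifier with leave-one-out evaluation is a stand-in, since the abstract does not name the classifier or validation scheme used.

```python
from collections import defaultdict
from statistics import mean

# Synthetic toy rows: (age_group, feature_value, label). Hypothetical values,
# built so that the feature points in opposite directions in different groups.
ROWS = [
    ("child", 0.8, "ASD"), ("child", 0.9, "ASD"), ("child", 1.0, "ASD"),
    ("child", 0.1, "TD"), ("child", 0.2, "TD"), ("child", 0.3, "TD"),
    ("adolescent", 0.6, "ASD"), ("adolescent", 0.7, "ASD"), ("adolescent", 0.8, "ASD"),
    ("adolescent", 0.3, "TD"), ("adolescent", 0.4, "TD"), ("adolescent", 0.5, "TD"),
    ("adult", 0.1, "ASD"), ("adult", 0.2, "ASD"), ("adult", 0.3, "ASD"),
    ("adult", 0.8, "TD"), ("adult", 0.9, "TD"), ("adult", 1.0, "TD"),
]


def fit_centroids(samples):
    """Return {label: mean feature value} for a list of (feature, label) pairs."""
    by_label = defaultdict(list)
    for x, y in samples:
        by_label[y].append(x)
    return {y: mean(xs) for y, xs in by_label.items()}


def predict(centroids, x):
    """Assign x to the label whose centroid is closest."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))


def loo_accuracy(samples):
    """Leave-one-out accuracy of the nearest-centroid classifier."""
    hits = 0
    for i, (x, y) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        hits += predict(fit_centroids(train), x) == y
    return hits / len(samples)


# Approach 1: one model per age group (uses age as domain knowledge).
by_group = defaultdict(list)
for group, x, y in ROWS:
    by_group[group].append((x, y))
age_informed_acc = mean(loo_accuracy(samples) for samples in by_group.values())

# Approach 2: a single model that ignores age.
pooled_acc = loo_accuracy([(x, y) for _, x, y in ROWS])

print(f"age-informed: {age_informed_acc:.2f}, pooled: {pooled_acc:.2f}")
```

In this contrived setup the pooled model struggles because the class centroids nearly coincide once the age groups are mixed, while each per-group model sees a clean separation. Real conversational data would be far noisier, and the size of any gap would depend on the features and classifier chosen.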