Subject Harmonization of Digital Biomarkers: Improved Detection of Mild Cognitive Impairment from Language Markers.

Bao Hoang, Yijiang Pang, Hiroko H Dodge, Jiayu Zhou
Pacific Symposium on Biocomputing, published 2024-01-01. Journal article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11017207/pdf/

Abstract

Mild cognitive impairment (MCI) is an early stage of dementia, including Alzheimer's disease (AD), and a crucial window for therapeutic intervention. Early detection of MCI creates opportunities for early intervention and significantly benefits cohort enrichment for clinical trials. Imaging markers and in vivo biomarkers in plasma and cerebrospinal fluid achieve high detection performance, yet their prohibitive cost and intrusiveness call for more affordable and accessible alternatives. Recent advances in digital biomarkers, especially language markers, have shown great potential: variables informative of MCI are derived from linguistic and/or speech data and then used for predictive modeling. A major challenge in modeling language markers is the variability in how each person speaks. Because cohorts in language studies are usually small, owing to the extensive effort required for data collection, this between-person variability makes language markers hard to generalize to unseen subjects. In this paper, we propose a novel subject harmonization tool that addresses distributional differences in language markers across subjects, thereby enhancing the generalization performance of machine learning models. Our empirical results show that machine learning models built on our harmonized features achieve improved prediction performance on unseen data. The source code and experiment scripts are available at https://github.com/illidanlab/subject_harmonization.
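
The abstract does not describe how the harmonization tool works, so the following is only a minimal sketch of the general idea of reducing between-subject distributional differences. It swaps in a plain per-subject z-score baseline, which is an assumption for illustration and not the authors' method; the function name `harmonize_per_subject` and the synthetic data are hypothetical.

```python
import numpy as np


def harmonize_per_subject(features, subject_ids, eps=1e-8):
    """Z-score every feature within each subject.

    Illustrative baseline only (NOT the paper's method): it removes
    subject-specific offsets and scales from the feature matrix.
    """
    features = np.asarray(features, dtype=float)
    subject_ids = np.asarray(subject_ids)
    out = np.empty_like(features)
    for sid in np.unique(subject_ids):
        mask = subject_ids == sid          # rows belonging to this subject
        block = features[mask]
        out[mask] = (block - block.mean(axis=0)) / (block.std(axis=0) + eps)
    return out


# Synthetic example: two subjects, 50 samples each, 3 features.
# Subject "s2" has a large constant offset mimicking speaker variability.
rng = np.random.default_rng(0)
subject_ids = np.repeat(["s1", "s2"], 50)
features = rng.normal(size=(100, 3))
features[subject_ids == "s2"] += 5.0

harmonized = harmonize_per_subject(features, subject_ids)
```

After this step, each subject's features have roughly zero mean and unit variance, so the offset between the two synthetic subjects disappears. Note that a baseline like this also discards any diagnostic signal carried by subject-level means, which is one reason a dedicated harmonization method may be preferable when labels are assigned per subject.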
