
AMIA Joint Summits on Translational Science proceedings: latest articles

Avoiding Biased Clinical Machine Learning Model Performance Estimates in the Presence of Label Selection.
Conor K Corbin, Michael Baiocchi, Jonathan H Chen

When evaluating the performance of clinical machine learning models, one must consider the deployment population. When the population of patients with observed labels is only a subset of the deployment population (label selection), standard model performance estimates on the observed population may be misleading. In this study we describe three classes of label selection and simulate five causally distinct scenarios to assess how particular selection mechanisms bias a suite of commonly reported binary machine learning model performance metrics. Simulations reveal that when selection is affected by observed features, naive estimates of model discrimination may be misleading. When selection is affected by labels, naive estimates of calibration fail to reflect reality. We borrow traditional weighting estimators from the causal inference literature and find that when selection probabilities are properly specified, they recover full population estimates. We then tackle the real-world task of monitoring the performance of deployed machine learning models whose interactions with clinicians feed back into and affect the selection mechanism of the labels. We train three machine learning models to flag low-yield laboratory diagnostics, and simulate their intended consequence of reducing wasteful laboratory utilization. We find that naive estimates of AUROC on the observed population undershoot actual performance by up to 20%. Such a disparity could be large enough to lead to the wrongful termination of a successful clinical decision support tool. We propose an altered deployment procedure, one that combines injected randomization with traditional weighted estimates, and find it recovers true model performance.
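The weighting idea in this abstract can be illustrated with a small sketch. It is not the authors' code: it assumes the selection probability of each labelled patient is known exactly, uses a hand-written pairwise AUROC, and runs on a made-up eight-patient population. The names `weighted_auroc`, `counts`, and `p_select` are hypothetical.

```python
import numpy as np


def weighted_auroc(y_true, scores, weights=None):
    """Pairwise AUROC in which each (positive, negative) pair counts w_pos * w_neg.

    Assumes at least one positive and one negative label are present.
    """
    y = np.asarray(y_true)
    s = np.asarray(scores, dtype=float)
    w = np.ones_like(s) if weights is None else np.asarray(weights, dtype=float)
    pos, neg = y == 1, y == 0
    # Score difference for every positive/negative pair.
    diff = s[pos][:, None] - s[neg][None, :]
    # A pair is concordant if the positive scores higher; ties count half.
    concordant = (diff > 0) + 0.5 * (diff == 0)
    pair_w = w[pos][:, None] * w[neg][None, :]
    return float((concordant * pair_w).sum() / pair_w.sum())


# Toy data (assumption): four labelled patients, each standing in for
# `counts` patients of the full deployment population.
y_obs = np.array([1, 1, 0, 0])
s_obs = np.array([0.9, 0.4, 0.6, 0.2])
counts = np.array([1, 2, 1, 4])
p_select = 1.0 / counts  # assumed known probability that a label is observed

naive = weighted_auroc(y_obs, s_obs)                        # ignores selection
ipw = weighted_auroc(y_obs, s_obs, weights=1.0 / p_select)  # inverse probability weights
full = weighted_auroc(np.repeat(y_obs, counts), np.repeat(s_obs, counts))

print(f"naive={naive:.3f}  weighted={ipw:.3f}  full population={full:.3f}")
```

In this toy population the naive estimate is 0.75, while the weighted estimate and the full-population value are both 13/15 (about 0.867). The agreement relies on the selection probabilities being correctly specified, which is the condition the abstract itself states.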

Published: 2023-06-16 | Type: Journal Article | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10283136/pdf/2405.pdf | Citations: 0
Linking Ambient NO2 Pollution Measures with Electronic Health Record Data to Study Asthma Exacerbations.
Alana Schreibman, Sherrie Xie, Rebecca A Hubbard, Blanca E Himes

Electronic health record (EHR)-derived data can be linked to geospatially distributed socioeconomic and environmental factors to conduct large-scale epidemiologic studies. Ambient NO2 is a known environmental risk factor for asthma. However, health exposure studies often rely on data from geographically sparse regulatory monitors that may not reflect true individual exposure. We contrasted use of interpolated NO2 regulatory monitor data with raw satellite measurements and satellite-derived ground estimates, building on previous work which has computed improved exposure estimates from remotely sensed data. Raw satellite and satellite-derived ground measurements captured spatial variation missed by interpolated ground monitor measurements. Multivariable analyses comparing these three NO2 measurement approaches (interpolated monitor, raw satellite, and satellite-derived) revealed a positive relationship between exposure and asthma exacerbations for both satellite measurements. Exposure-outcome relationships using the interpolated monitor NO2 were inconsistent with known relationships to asthma, suggesting that interpolated monitor data might yield misleading results in small region studies.

Published: 2023-06-16 | Type: Journal Article | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10283087/pdf/2145.pdf | Citations: 0
TRESTLE: Toolkit for Reproducible Execution of Speech, Text and Language Experiments.
Changye Li, Weizhe Xu, Trevor Cohen, Martin Michalowski, Serguei Pakhomov

The evidence is growing that machine and deep learning methods can learn the subtle differences between the language produced by people with various forms of cognitive impairment such as dementia and cognitively healthy individuals. Valuable public data repositories such as TalkBank have made it possible for researchers in the computational community to join forces and learn from each other to make significant advances in this area. However, due to variability in approaches and data selection strategies used by various researchers, results obtained by different groups have been difficult to compare directly. In this paper, we present TRESTLE (Toolkit for Reproducible Execution of Speech Text and Language Experiments), an open source platform that focuses on two datasets from the TalkBank repository with dementia detection as an illustrative domain. Successfully deployed in the hackallenge (Hackathon/Challenge) of the International Workshop on Health Intelligence at AAAI 2022, TRESTLE provides a precise digital blueprint of the data pre-processing and selection strategies that can be reused via TRESTLE by other researchers seeking comparable results with their peers and current state-of-the-art (SOTA) approaches.

Published: 2023-06-16 | Type: Journal Article | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10283131/pdf/2277.pdf | Citations: 0
Principal Investigators' Perceptions on Factors Associated with Successful Recruitment in Clinical Trials.
Betina Idnay, Alex Butler, Yilu Fang, Ziran Li, Junghwan Lee, Casey Ta, Cong Liu, Brenda Ruotolo, Chi Yuan, Huanyao Chen, George Hripcsak, Elaine Larson, Chunhua Weng

Participant recruitment continues to be a challenge to the success of randomized controlled trials, resulting in increased costs, extended trial timelines and delayed treatment availability. Literature provides evidence that study design features (e.g., trial phase, study site involvement) and trial sponsor are significantly associated with recruitment success. Principal investigators oversee the conduct of clinical trials, including recruitment. Through a cross-sectional survey and a thematic analysis of free-text responses, we assessed the perceptions of sixteen principal investigators regarding success factors for participant recruitment. Study site involvement and funding source do not necessarily make recruitment easier or more challenging from the perspective of the principal investigators. The most commonly used recruitment strategies are also the least effort-efficient (e.g., in-person recruitment, reviewing the electronic medical records for prescreening). Finally, we recommend actionable steps, such as improving staff support and leveraging informatics-driven approaches, to allow clinical researchers to enhance participant recruitment.

Published: 2023-06-16 | Type: Journal Article | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10283115/pdf/2207.pdf | Citations: 0
Efficient Federated Kinship Relationship Identification.
Xinyue Wang, Leonard Dervishi, Wentao Li, Xiaoqian Jiang, Erman Ayday, Jaideep Vaidya

Kinship relationship estimation plays a significant role in today's genome studies. Since genetic data are mostly stored and protected in different silos, retrieving the desirable kinship relationships across federated data warehouses is a non-trivial problem. The ability to identify and connect related individuals is important for both research and clinical applications. In this work, we propose a new privacy-preserving kinship relationship estimation framework: Incremental Update Kinship Identification (INK). The proposed framework includes three key components that allow us to control the balance between privacy and accuracy (of kinship estimation): an incremental process coupled with the use of auxiliary information and informative scores. Our empirical evaluation shows that INK can achieve higher kinship identification correctness while exposing fewer genetic markers.

Published: 2023-06-16 | Type: Journal Article | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10283133/pdf/2171.pdf | Citations: 0
Using Explainable AI to Cross-Validate Socio-economic Disparities Among Covid-19 Patient Mortality
Linlin Shi, Redoan Rahman, E. Melamed, J. Gwizdka, Justin F. Rousseau, Ying Ding
This paper applies eXplainable Artificial Intelligence (XAI) methods to investigate the socioeconomic disparities in COVID-19 patient mortality. An Extreme Gradient Boosting (XGBoost) prediction model is built based on a de-identified Austin area hospital dataset to predict the mortality of COVID-19 patients. We apply two XAI methods, SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME), to compare the global and local interpretation of feature importance. This paper demonstrates the advantages of using XAI, which shows feature importance and decisive capability. Furthermore, we use the XAI methods to cross-validate their interpretations for individual patients. The XAI models reveal that Medicare financial class, older age, and gender have a high impact on the mortality prediction. We find that LIME's local interpretation does not show significant differences in feature importance compared with SHAP, which suggests pattern confirmation. This paper demonstrates the importance of XAI methods in cross-validation of feature attributions.
Published: 2023-02-16 | Type: Journal Article | DOI: https://doi.org/10.48550/arXiv.2302.08605 | Citations: 0
TRESTLE: Toolkit for Reproducible Execution of Speech, Text and Language Experiments
Changye Li, T. Cohen, Martin Michalowski, Serguei V. S. Pakhomov
The evidence is growing that machine and deep learning methods can learn the subtle differences between the language produced by people with various forms of cognitive impairment such as dementia and cognitively healthy individuals. Valuable public data repositories such as TalkBank have made it possible for researchers in the computational community to join forces and learn from each other to make significant advances in this area. However, due to variability in approaches and data selection strategies used by various researchers, results obtained by different groups have been difficult to compare directly. In this paper, we present TRESTLE (Toolkit for Reproducible Execution of Speech Text and Language Experiments), an open source platform that focuses on two datasets from the TalkBank repository with dementia detection as an illustrative domain. Successfully deployed in the hackallenge (Hackathon/Challenge) of the International Workshop on Health Intelligence at AAAI 2022, TRESTLE provides a precise digital blueprint of the data pre-processing and selection strategies that can be reused via TRESTLE by other researchers seeking comparable results with their peers and current state-of-the-art (SOTA) approaches.
Published: 2023-02-14 | Type: Journal Article | DOI: https://doi.org/10.48550/arXiv.2302.07322 | Citations: 0
Evaluate underdiagnosis and overdiagnosis bias of deep learning model on primary open-angle glaucoma diagnosis in under-served patient populations
Mingquan Lin, Yuyun Xiao, Bojian Hou, Tingyi Wanyan, M. Sharma, Zhangyang Wang, Fei Wang, S. V. Tassel, Yifan Peng
In the United States, primary open-angle glaucoma (POAG) is the leading cause of blindness, especially among African American and Hispanic individuals. Deep learning has been widely used to detect POAG using fundus images as its performance is comparable to or even surpasses diagnosis by clinicians. However, human bias in clinical diagnosis may be reflected and amplified in the widely-used deep learning models, thus impacting their performance. Biases may cause (1) underdiagnosis, increasing the risks of delayed or inadequate treatment, and (2) overdiagnosis, which may increase individuals' stress and fear, reduce their well-being, and lead to unnecessary or costly treatment. In this study, we examined the underdiagnosis and overdiagnosis when applying deep learning in POAG detection based on the Ocular Hypertension Treatment Study (OHTS) from 22 centers across 16 states in the United States. Our results show that the widely-used deep learning model can underdiagnose or overdiagnose under-served populations. The most underdiagnosed group is the younger (< 60 yrs) female group, and the most overdiagnosed group is the older (≥ 60 yrs) Black group. Biased diagnosis through traditional deep learning methods may delay disease detection and treatment and create burdens among under-served populations, thereby raising ethical concerns about using deep learning models in ophthalmology clinics.
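A subgroup comparison of this kind can be sketched as follows. The abstract does not define its metrics, so this sketch assumes that underdiagnosis is measured as the false negative rate and overdiagnosis as the false positive rate within each subgroup. The function name `subgroup_error_rates`, the group labels, and the toy data are all hypothetical.

```python
from collections import defaultdict


def subgroup_error_rates(y_true, y_pred, groups):
    """Per-subgroup false negative rate and false positive rate.

    Assumption: underdiagnosis = FN / (all true positives in the group),
    overdiagnosis = FP / (all true negatives in the group).
    """
    counts = defaultdict(lambda: {"fn": 0, "pos": 0, "fp": 0, "neg": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        c = counts[group]
        if truth == 1:
            c["pos"] += 1
            c["fn"] += int(pred == 0)  # disease present, model says absent
        else:
            c["neg"] += 1
            c["fp"] += int(pred == 1)  # disease absent, model says present
    return {
        group: {
            "underdiagnosis_rate": c["fn"] / c["pos"] if c["pos"] else float("nan"),
            "overdiagnosis_rate": c["fp"] / c["neg"] if c["neg"] else float("nan"),
        }
        for group, c in counts.items()
    }


# Toy data (assumption): two subgroups, four patients each.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [0, 1, 0, 0, 1, 1, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = subgroup_error_rates(y_true, y_pred, groups)
for group, r in rates.items():
    print(group, r)
```

With this toy data, group A has an underdiagnosis rate of 0.5 and no overdiagnosis, while group B shows the reverse. A gap between subgroups on either rate is the kind of disparity the abstract reports.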
在美国,原发性开角型青光眼(POAG)是致盲的主要原因,尤其是在非洲裔美国人和西班牙裔美国人中。深度学习已被广泛用于使用眼底图像检测POAG,因为其性能可与临床医生的诊断相媲美甚至超过临床医生的诊断。然而,临床诊断中的人为偏见可能会在广泛使用的深度学习模型中得到反映和放大,从而影响其性能。偏见可能导致(1)诊断不足,增加延迟或不充分治疗的风险;(2)过度诊断,这可能增加个人的压力、恐惧、幸福感和不必要/昂贵的治疗。在这项研究中,我们基于美国16个州22个中心的高眼压治疗研究(OHTS),研究了深度学习在POAG检测中的诊断不足和过度诊断。我们的研究结果表明,广泛使用的深度学习模型可能会对服务不足的人群诊断不足或过度诊断。漏诊率最高的是女性青年(< 60岁)组,漏诊率最高的是黑人老年(≥60岁)组。通过传统的深度学习方法进行的有偏见的诊断可能会延迟疾病的检测和治疗,并给服务不足的人群带来负担,从而引发了在眼科诊所使用深度学习模型的伦理问题。
DOI: https://doi.org/10.48550/arXiv.2301.11315 · Published: 2023-01-26 · Type: Journal Article
Citations: 1
Discovering Precision AD Biomarkers with Varying Prognosis Effects in Genetics Driven Subpopulations.
Brian N Lee, Junwen Wang, Kwangsik Nho, Andrew J Saykin, Li Shen

Alzheimer's Disease (AD) is a highly heritable neurodegenerative disorder characterized by memory impairment. Understanding how genetic factors contribute to AD pathology may inform interventions to slow or prevent the progression of AD. We performed stratified genetic analyses of 1,574 Alzheimer's Disease Neuroimaging Initiative (ADNI) participants to examine associations between levels of quantitative traits (QTs) and future diagnosis. The Chow test was employed to determine whether an individual's genetic profile affects the identified predictive relationships between QTs and future diagnosis. Our Chow test analysis found that cognitive and PET-based biomarkers differentially predicted future diagnosis when stratifying on allelic dosage of AD loci. Post-hoc bootstrapped and association analyses of biomarkers confirmed the differential effects, emphasizing the necessity of stratified models for individualized AD diagnosis prediction. This novel application of the Chow test allows for the quantification and direct comparison of genetics-based differences. Our findings, as well as the identified relationships between QTs and future diagnosis, warrant further investigation in a biological context.
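For readers unfamiliar with the Chow test named in the abstract, the sketch below computes the classical Chow F statistic for a simple linear regression fitted separately in two strata versus pooled. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not specify the model form, the function names `_rss_simple_ols` and `chow_f_statistic` are hypothetical, and the data are made up. With k parameters per model, the statistic is F = ((RSS_pooled - (RSS_1 + RSS_2)) / k) / ((RSS_1 + RSS_2) / (n_1 + n_2 - 2k)).

```python
from typing import Sequence, Tuple


def _rss_simple_ols(x: Sequence[float], y: Sequence[float]) -> float:
    """Residual sum of squares for y = a + b*x fitted by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


def chow_f_statistic(
    x1: Sequence[float], y1: Sequence[float],
    x2: Sequence[float], y2: Sequence[float],
) -> Tuple[float, int, int]:
    """Return (F, numerator df, denominator df) for the two-group Chow test."""
    k = 2  # parameters per model: intercept and slope
    rss_pooled = _rss_simple_ols(list(x1) + list(x2), list(y1) + list(y2))
    rss_split = _rss_simple_ols(x1, y1) + _rss_simple_ols(x2, y2)
    df_den = len(x1) + len(x2) - 2 * k
    f_stat = ((rss_pooled - rss_split) / k) / (rss_split / df_den)
    return f_stat, k, df_den


# Hypothetical strata (e.g. two allelic-dosage groups) with different slopes.
x_a, y_a = [1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.9, 5.1]  # trait rises with x
x_b, y_b = [1, 2, 3, 4, 5], [3.0, 2.9, 3.1, 3.0, 2.9]  # trait roughly flat

f_stat, df_num, df_den = chow_f_statistic(x_a, y_a, x_b, y_b)
print(f"Chow F = {f_stat:.1f} on ({df_num}, {df_den}) degrees of freedom")
```

A large F relative to the F(k, n_1 + n_2 - 2k) distribution suggests that the two strata follow different regression lines. A p-value can be obtained from that distribution, for example with `scipy.stats.f.sf(f_stat, df_num, df_den)` if SciPy is available.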

Published: 2023-01-01 · Type: Journal Article · Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10283147/pdf/2152.pdf
Citations: 0
Assessing the Predictive and Analytics Capability of Electronic Clinical Data for High-Cost Patients.
Saathvika Diviti, Adam Wilcox

Hotspotting may prevent the high healthcare costs concentrated among a minority of patients, provided that the information in electronic health records (EHRs) is available, complete, and accessible. We performed a descriptive study of Barnes-Jewish Hospital patients to assess the availability and accessibility of information that can predict negative outcomes. Manual electronic chart review produced descriptive statistics for a sample of 100 High Resource and 100 Control patient records. The majority of cases were not predictive. Predictive information and its sources were inconsistent. Certain types of patients were more predictive than others, although they made up a small percentage of the total. One of the largest and most predictive groups, "Other," was also the most difficult to classify. These findings were expected and are consistent with previous studies, but they contrast with approaches that attempt prediction, such as hotspotting. Further studies may provide solutions to the problems and limitations identified in this study.

Published: 2023-01-01 · Type: Journal Article · Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10283137/pdf/2098.pdf
Citations: 0