Introducing Attribute Association Graphs to Facilitate Medical Data Exploration: Development and Evaluation Using Epidemiological Study Data.

IF 3.1; Zone 3 (Medicine); Q2 (Medical Informatics); JMIR Medical Informatics; Pub Date: 2024-07-24; DOI: 10.2196/49865
Louis Bellmann, Alexander Johannes Wiederhold, Leona Trübe, Raphael Twerenbold, Frank Ückert, Karl Gottfried
Citations: 0

Abstract

Background: Interpretability and intuitive visualization facilitate medical knowledge generation through big data. In addition, robustness to high-dimensional and missing data is a requirement for statistical approaches in the medical domain. A method tailored to the needs of physicians must meet all the abovementioned criteria.

Objective: This study aims to develop an accessible tool for visual data exploration that requires no programming knowledge, no adjustment of complex parameterizations, and no manual handling of missing data. We sought to base the statistical analysis on the setting of disease and control cohorts, which is familiar to clinical researchers. We aimed to guide the user by identifying and highlighting data patterns associated with disease and by revealing relations between attributes within the data set.

Methods: We introduce the attribute association graph, a novel graph structure designed for visual data exploration using robust statistical metrics. The nodes capture frequencies of participant attributes in disease and control cohorts as well as deviations between groups. The edges represent conditional relations between attributes. The graph is visualized using the Neo4j (Neo4j, Inc) data platform and can be interactively explored without the need for technical knowledge. Nodes with high deviations between cohorts and edges with noticeable conditional relationships are highlighted to guide the user during the exploration. The graph is accompanied by a dashboard visualizing variable distributions. For evaluation, we applied the graph and dashboard to the Hamburg City Health Study data set, a large cohort study conducted in the city of Hamburg, Germany. All data structures can be accessed freely by researchers, physicians, and patients. In addition, we developed a user test, conducted with physicians, that incorporates the System Usability Scale, individual questions, and user tasks.
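
The node and edge definitions in the Methods can be illustrated with a minimal sketch. The abstract does not state the exact metrics, so everything concrete here is an assumption made for illustration: the function name `build_attribute_graph`, the input layout, the deviation measure (difference in relative frequency between cohorts), and the edge weight (conditional frequency of attribute b among participants with attribute a). Missing values are simply left out of a participant's attribute set, which is also an assumption and not necessarily how the authors handle them.

```python
from collections import Counter
from itertools import permutations


def build_attribute_graph(participants):
    """Build a toy attribute association graph (illustrative only).

    participants: list of dicts with a boolean "disease" flag and a set
    of observed "attributes". Returns (nodes, edges).
    """
    n_disease = sum(1 for p in participants if p["disease"])
    n_control = len(participants) - n_disease

    count_disease, count_control = Counter(), Counter()
    pair_counts = Counter()
    for p in participants:
        attrs = p["attributes"]
        (count_disease if p["disease"] else count_control).update(attrs)
        # Ordered pairs (a, b): both attributes observed in one participant.
        pair_counts.update(permutations(sorted(attrs), 2))

    # Nodes: relative frequency per cohort and the deviation between them
    # (assumed here to be a plain difference).
    nodes = {}
    for a in set(count_disease) | set(count_control):
        f_d = count_disease[a] / n_disease if n_disease else 0.0
        f_c = count_control[a] / n_control if n_control else 0.0
        nodes[a] = {"freq_disease": f_d, "freq_control": f_c,
                    "deviation": f_d - f_c}

    # Edges: share of participants with attribute a who also have b
    # (assumed stand-in for the paper's "conditional relation").
    edges = {}
    for (a, b), n_ab in pair_counts.items():
        n_a = count_disease[a] + count_control[a]
        edges[(a, b)] = n_ab / n_a
    return nodes, edges


# Hypothetical toy cohort, not study data.
toy = [
    {"disease": True, "attributes": {"smoker", "hypertension"}},
    {"disease": True, "attributes": {"hypertension"}},
    {"disease": False, "attributes": {"smoker"}},
    {"disease": False, "attributes": set()},
]
nodes, edges = build_attribute_graph(toy)
```

In a tool of this kind, nodes with a large absolute `deviation` and edges with a high weight would be the ones highlighted for the user; the thresholds for highlighting are not given in the abstract.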

Results: We evaluated the attribute association graph and dashboard through an exemplary data analysis of participants with a general cardiovascular disease in the Hamburg City Health Study data set. All results extracted from the graph structure and dashboard are in accordance with findings from the literature, except for unusually low cholesterol levels in participants with cardiovascular disease, which could be induced by medication. In addition, 95% CIs of Pearson correlation coefficients were calculated for all associations identified during the data analysis, confirming the results. Furthermore, a user test with 10 physicians was conducted to assess the usability of the proposed methods. A System Usability Scale score of 70.5% and an average successful task completion rate of 81.4% were reported.
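
The abstract does not say how the 95% CIs of the Pearson correlation coefficients were obtained. One common approach is the Fisher z-transformation, sketched below under that assumption; the function names `pearson_r` and `pearson_ci95` are hypothetical and not taken from the paper.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def pearson_ci95(r, n):
    """Approximate 95% CI for a correlation r from n pairs (Fisher z).

    Requires n > 3 and |r| < 1; assumes roughly bivariate normal data.
    """
    if n <= 3:
        raise ValueError("need more than 3 observations")
    z = math.atanh(r)                 # Fisher z-transform
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    z_crit = 1.959963984540054        # 97.5th percentile of N(0, 1)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)
```

An association would typically be read as supported when the interval excludes zero, for example `pearson_ci95(0.3, 500)` yields an interval lying entirely above zero.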

Conclusions: The proposed attribute association graph and dashboard enable intuitive visual data exploration. They are robust to high-dimensional as well as missing data and require no parameterization. The usability for clinicians was confirmed via a user test, and the validity of the statistical results was confirmed by associations known from literature and standard statistical inference.

Source journal: JMIR Medical Informatics (Medicine-Health Informatics)
CiteScore: 7.90
Self-citation rate: 3.10%
Articles published: 173
Review time: 12 weeks
Journal description: JMIR Medical Informatics (JMI, ISSN 2291-9694) is a top-rated, tier A journal which focuses on clinical informatics, big data in health and health care, decision support for health professionals, electronic health records, ehealth infrastructures and implementation. It has a focus on applied, translational research, with a broad readership including clinicians, CIOs, engineers, industry and health informatics professionals. Published by JMIR Publications, publisher of the Journal of Medical Internet Research (JMIR), the leading eHealth/mHealth journal (Impact Factor 2016: 5.175), JMIR Med Inform has a slightly different scope (placing more emphasis on applications for clinicians and health professionals than on consumers/citizens, who are the focus of JMIR), publishes even faster, and also allows papers which are more technical or more formative than what would be published in the Journal of Medical Internet Research.