HERALD: A domain-specific query language for longitudinal health data analytics

IF 3.7 | Zone 2, Medicine | Q2, Computer Science, Information Systems | International Journal of Medical Informatics | Pub Date: 2024-10-05 | DOI: 10.1016/j.ijmedinf.2024.105646
Lena Baum, Marco Johns, Armin Müller, Hammam Abu Attieh, Fabian Prasser

Background

Large-scale health data has significant potential for research and innovation, especially with longitudinal data offering insights into prevention, disease progression, and treatment effects. Yet, analyzing this data type is complex, as data points are repeatedly documented along the timeline. As a consequence, extracting cross-sectional tabular data suitable for statistical analysis and machine learning can be challenging for medical researchers and data scientists alike, with existing tools lacking a balance between ease of use and comprehensiveness.

Objective

This paper introduces HERALD, a novel domain-specific query language designed to support the transformation of longitudinal health data into cross-sectional tables. We describe the basic concepts, the query syntax, a graphical user interface for constructing and executing HERALD queries, as well as an integration into Informatics for Integrating Biology and the Bedside (i2b2).

Methods

The syntax of HERALD mimics natural language and supports different query types for selection, aggregation, analysis of relationships, and searching for data points based on filter expressions and temporal constraints. Using a hierarchical concept model, queries are executed individually for the data of each patient, while constructing tabular output. HERALD is closed, meaning that queries process data points and generate data points. Queries can refer to data points that have been produced by previous queries, providing a simple, but powerful nesting mechanism.
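
To make the ideas in this paragraph concrete, here is a minimal Python sketch of per-patient query execution with closure. It is not HERALD syntax and not the authors' implementation; the abstract does not show either. All names (`DataPoint`, `select_first`, `average_after`, `build_table`), the slash-separated concept paths, and the sample values are hypothetical and chosen only for illustration. The sketch assumes that each query maps a patient's timeline to at most one data point, and that this result is appended to the timeline so later queries can refer to it.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean
from typing import Callable, Dict, List, Optional, Union


@dataclass(frozen=True)
class DataPoint:
    concept: str               # path in a (hypothetical) hierarchical concept model
    start: date                # when the observation was documented
    value: Union[float, str]


Timeline = List[DataPoint]
Query = Callable[[Timeline], Optional[DataPoint]]


def select_first(prefix: str) -> Query:
    """Selection: earliest data point whose concept lies under `prefix`."""
    def run(timeline: Timeline) -> Optional[DataPoint]:
        hits = sorted((p for p in timeline if p.concept.startswith(prefix)),
                      key=lambda p: p.start)
        return hits[0] if hits else None
    return run


def average_after(prefix: str, anchor: str) -> Query:
    """Aggregation with a temporal constraint: mean of numeric values under
    `prefix` recorded on or after the data point named `anchor`, which is
    expected to come from an earlier query."""
    def run(timeline: Timeline) -> Optional[DataPoint]:
        anchors = [p for p in timeline if p.concept == anchor]
        if not anchors:
            return None
        t0 = anchors[0].start
        values = [p.value for p in timeline
                  if p.concept.startswith(prefix)
                  and p.start >= t0
                  and isinstance(p.value, (int, float))]
        return DataPoint(prefix, t0, mean(values)) if values else None
    return run


def build_table(patients: Dict[str, Timeline],
                queries: Dict[str, Query]) -> List[Dict[str, object]]:
    """Run every query separately per patient; one output row per patient."""
    rows: List[Dict[str, object]] = []
    for patient_id, timeline in patients.items():
        working = list(timeline)           # per-patient copy of the timeline
        row: Dict[str, object] = {"patient": patient_id}
        for name, query in queries.items():
            result = query(working)
            if result is not None:
                # Closure: the result is itself a data point, stored under the
                # query's name so that later queries can reference it.
                working.append(DataPoint(name, result.start, result.value))
            row[name] = result.value if result is not None else None
        rows.append(row)
    return rows


patients = {
    "p1": [
        DataPoint("lab/hba1c", date(2020, 1, 10), 8.0),
        DataPoint("diagnosis/diabetes", date(2020, 3, 1), "E11"),
        DataPoint("lab/hba1c", date(2020, 6, 1), 7.0),
        DataPoint("lab/hba1c", date(2021, 1, 5), 6.0),
    ],
    "p2": [DataPoint("lab/hba1c", date(2020, 2, 2), 5.5)],
}

queries = {
    "first_dx": select_first("diagnosis/"),
    "hba1c_after_dx": average_after("lab/hba1c", "first_dx"),
}

table = build_table(patients, queries)
for row in table:
    print(row)
```

In this sketch the second query only yields a value for patients where the first query found a diagnosis, which is one way the nesting described above could behave. Missing results are kept as empty cells so that every patient still contributes exactly one row to the cross-sectional table.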

Results

The open-source implementation consists of a HERALD query parser, an execution engine, as well as a web-based user interface for query construction and statistical analysis. The implementation can be deployed as a standalone component and integrated into self-service data analytics environments like i2b2 as a plugin. HERALD can be a valuable tool for data scientists and machine learning experts, as it simplifies the process of transforming longitudinal health data into tables and data matrices.

Conclusion

The construction of cross-sectional tables from longitudinal data can be supported through dedicated query languages that strike a reasonable balance between language complexity and transformation capabilities.