Management and analytic of biomedical big data with cloud-based in-memory database and dynamic querying: a hands-on experience with real-world data

Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining. Pub Date: 2014-08-24. DOI: 10.1145/2623330.2630806

M. Feng, M. Ghassemi, Thomas Brennan, John Ellenberger, I. Hussain, R. Mark



Abstract

Analyzing Biomedical Big Data (BBD) is computationally expensive due to high dimensionality and large data volume. Performance and scalability issues of traditional database management systems (DBMS) often limit the use of more sophisticated data queries and analytic models. Moreover, in the conventional setting, data management and analysis use separate software platforms. Exporting and importing large amounts of data across platforms requires significant computational and I/O resources and can expose sensitive data to security risks. In this tutorial, participants will learn the difference between in-memory and traditional DBMS through hands-on exercises using SAP's cloud-based HANA in-memory DBMS in conjunction with the Multi-parameter Intelligent Monitoring in Intensive Care (MIMIC) dataset. MIMIC is an open-access critical care EHR archive (over 4TB in size) consisting of structured, unstructured, and waveform data. The tutorial also aims to show participants how a combination of dynamic querying and in-memory DBMS may enhance the management and analysis of complex clinical data.


