{"title":"医疗大数据仓库集成的新流程","authors":"Nouha Arfaoui","doi":"10.1504/ijdmmm.2023.132974","DOIUrl":null,"url":null,"abstract":"Healthcare domain generates huge amount of data from different and heterogynous clinical data sources using different devices to ensure a good managing hospital performance. Because of the quantity and complexity structure of the data, we use big healthcare data warehouse for the storage first and the decision making later. To achieve our goal, we propose a new process that deals with this type of data. It starts by unifying the different data, then it extracts it, loads it into big healthcare data warehouse and finally it makes the necessary transformations. For the first step, the ontology is used. It is the best solution to solve the problem of data sources heterogeneity. We use, also, Hadoop and its ecosystem including Hive, MapReduce and HDFS to accelerate the treatment through the parallelism exploiting the performance of ELT to ensure the 'schema-on-read' where the data is stored before performing the transformation tasks.","PeriodicalId":43061,"journal":{"name":"International Journal of Data Mining Modelling and Management","volume":"56 1","pages":"0"},"PeriodicalIF":0.4000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A new process for healthcare big data warehouse integration\",\"authors\":\"Nouha Arfaoui\",\"doi\":\"10.1504/ijdmmm.2023.132974\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Healthcare domain generates huge amount of data from different and heterogynous clinical data sources using different devices to ensure a good managing hospital performance. Because of the quantity and complexity structure of the data, we use big healthcare data warehouse for the storage first and the decision making later. To achieve our goal, we propose a new process that deals with this type of data. It starts by unifying the different data, then it extracts it, loads it into big healthcare data warehouse and finally it makes the necessary transformations. For the first step, the ontology is used. It is the best solution to solve the problem of data sources heterogeneity. 
We use, also, Hadoop and its ecosystem including Hive, MapReduce and HDFS to accelerate the treatment through the parallelism exploiting the performance of ELT to ensure the 'schema-on-read' where the data is stored before performing the transformation tasks.\",\"PeriodicalId\":43061,\"journal\":{\"name\":\"International Journal of Data Mining Modelling and Management\",\"volume\":\"56 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.4000,\"publicationDate\":\"2023-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Data Mining Modelling and Management\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1504/ijdmmm.2023.132974\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Data Mining Modelling and Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1504/ijdmmm.2023.132974","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
A new process for healthcare big data warehouse integration
The healthcare domain generates huge amounts of data from different, heterogeneous clinical data sources and devices, and this data must be managed well to ensure good hospital performance. Because of the volume and structural complexity of the data, we use a healthcare big data warehouse, first for storage and later for decision making. To achieve this goal, we propose a new process for handling this type of data: it unifies the different data sources, then extracts the data, loads it into the healthcare big data warehouse, and finally applies the necessary transformations. For the unification step, an ontology is used; it is the best solution to the problem of data-source heterogeneity. We also use Hadoop and its ecosystem, including Hive, MapReduce and HDFS, to accelerate processing through parallelism, exploiting the performance of ELT to ensure 'schema-on-read', whereby the data is stored before the transformation tasks are performed.
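The core of the described process is the ELT pattern: raw extracts are loaded into HDFS first and transformed afterwards, with Hive providing 'schema-on-read'. Below is a minimal sketch of that pattern in Python via PyHive, not the paper's actual pipeline: it assumes a reachable HiveServer2 and raw CSV extracts already landed in HDFS, and the database, table, and column names are hypothetical.

```python
# Minimal ELT / schema-on-read sketch (illustrative, not the paper's code).
# Assumptions: HiveServer2 on localhost:10000; raw CSV extracts already sit
# in HDFS under /data/raw/encounters; all names below are hypothetical.
from pyhive import hive  # pip install 'pyhive[hive]'

conn = hive.Connection(host="localhost", port=10000, database="default")
cur = conn.cursor()

# Load: declare an EXTERNAL table over the raw files. Hive applies the schema
# only when the data is read (schema-on-read), so "loading" is just pointing
# the metastore at an HDFS directory -- no transformation happens yet.
cur.execute("""
    CREATE EXTERNAL TABLE IF NOT EXISTS raw_encounters (
        patient_id     STRING,
        admission_ts   STRING,
        diagnosis_code STRING,
        source_system  STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    STORED AS TEXTFILE
    LOCATION '/data/raw/encounters'
""")

# Transform: runs after the load, as in ELT. Hive compiles the query into
# parallel jobs over HDFS and materialises a cleaned, typed table.
cur.execute("""
    CREATE TABLE IF NOT EXISTS dw_encounters STORED AS ORC AS
    SELECT patient_id,
           CAST(admission_ts AS TIMESTAMP) AS admission_ts,
           upper(trim(diagnosis_code))     AS diagnosis_code,
           source_system
    FROM raw_encounters
    WHERE patient_id IS NOT NULL
""")
```

Because the schema is applied at read time, new heterogeneous extracts can land in HDFS without blocking ingestion; the cleansing logic lives entirely in the transformation queries that run afterwards.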
Journal description:
Facilitating the transformation from data to information to knowledge is paramount for organisations. Companies are flooded with data and conflicting information, but have limited real, usable knowledge. A process, however, should rarely be looked at from limited angles or in parts: isolated islands of data mining, modelling and management (DMMM) should be connected. IJDMMM highlights the integration of DMMM with statistics, machine learning and databases; of each element of data chain management; of types of information; and of algorithms in software, from data pre-processing to post-processing and between theory and applications. Topics covered include:
- Artificial intelligence
- Biomedical science
- Business analytics/intelligence, process modelling
- Computer science, database management systems
- Data management, mining, modelling, warehousing
- Engineering
- Environmental science, environment (ecoinformatics)
- Information systems/technology, telecommunications/networking
- Management science, operations research, mathematics/statistics
- Social sciences
- Business/economics, (computational) finance
- Healthcare, medicine, pharmaceuticals
- (Computational) chemistry, biology (bioinformatics)
- Sustainable mobility systems, intelligent transportation systems
- National security