Towards the Development of Talend Open Studio Components for the Support of Semantic Sources

Morad Hajji, Mohammed Qbadou, K. Mansouri
Published in: 2019 1st International Conference on Smart Systems and Data Science (ICSSD)
Publication date: 2019-10-01
DOI: 10.1109/ICSSD47982.2019.9002820 (https://doi.org/10.1109/ICSSD47982.2019.9002820)
Citations: 1

Abstract

Towards the Development of Talend Open Studio Components for the Support of Semantic Sources
The Extract-Transform-Load (ETL) process is the most widely used mechanism for keeping a data warehouse loaded with data extracted from a variety of sources. Tools that offer graphical interfaces for manipulating ETL processes have become very popular and have reached an advanced level of maturity. Talend Open Studio for Data Integration is one of the most popular and comprehensive of these tools in terms of functionality and performance, and it provides a large number of components for different data sources. However, the advent of the Semantic Web introduces the ontology as a new kind of data source, one whose structure is complex because of the expressiveness of knowledge representation languages. This poses a new challenge: to our knowledge, Talend Open Studio for Data Integration does not have any components intended to support ontological sources. In this contribution, we present our approach to developing Talend Open Studio for Data Integration components so that Semantic Web data can be used in ETL processes created with this tool. Because it relies on a strategy that promotes the abstraction of ontological sources, the approach can be adapted to different knowledge representation languages such as RDF and OWL. To assess the usefulness of our approach, we evaluated it on a hypothetical example built around a simplistic ontology.
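The abstract does not describe the components' internals, so the following is only an illustrative sketch of the general idea of abstracting an ontological source before it enters an ETL flow. It is written in plain Python and does not use Talend's component API. Every name in it (OntologySource, InMemorySource, rows_for_class, the example.org IRIs) is hypothetical and introduced for illustration. The assumption is that any RDF or OWL reader can be hidden behind a single interface that yields (subject, predicate, object) triples, which are then flattened into the tabular rows that a downstream ETL step expects.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class OntologySource(ABC):
    """Hypothetical abstraction: any ontology format exposed as triples."""

    @abstractmethod
    def triples(self) -> Iterator[Triple]:
        ...


class InMemorySource(OntologySource):
    """Stand-in for an RDF or OWL reader; holds already-parsed triples."""

    def __init__(self, data: Iterable[Triple]) -> None:
        self._data = list(data)

    def triples(self) -> Iterator[Triple]:
        return iter(self._data)


def rows_for_class(
    source: OntologySource, class_iri: str, columns: Dict[str, str]
) -> List[Dict[str, Optional[str]]]:
    """Flatten the instances of class_iri into one row per subject.

    columns maps an output column name to a predicate IRI. A missing
    value becomes None; if a predicate repeats, the first value is kept.
    """
    subjects: Dict[str, None] = {}           # insertion-ordered set of instances
    values: Dict[str, Dict[str, str]] = {}   # subject -> predicate -> object
    for s, p, o in source.triples():
        if p == RDF_TYPE and o == class_iri:
            subjects[s] = None
        values.setdefault(s, {}).setdefault(p, o)
    return [
        {"id": s, **{name: values[s].get(pred) for name, pred in columns.items()}}
        for s in subjects
    ]


# A simplistic example ontology (assumed data, not taken from the paper).
EX = "http://example.org/"
source = InMemorySource([
    (EX + "alice", RDF_TYPE, EX + "Person"),
    (EX + "alice", EX + "name", "Alice"),
    (EX + "alice", EX + "age", "30"),
    (EX + "bob", RDF_TYPE, EX + "Person"),
    (EX + "bob", EX + "name", "Bob"),
    (EX + "acme", RDF_TYPE, EX + "Company"),
])
rows = rows_for_class(source, EX + "Person", {"name": EX + "name", "age": EX + "age"})
for row in rows:
    print(row)
```

Because rows_for_class depends only on the triples() method, swapping the in-memory stand-in for a reader of another representation language would not change the flattening step, which is the kind of adaptability the abstract attributes to source abstraction.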