N. Hubig, Linnea Passing, Maximilian E. Schüle, Dimitri Vorona, A. Kemper, Thomas Neumann
{"title":"HyPerInsight: Data Exploration Deep Inside HyPer","authors":"N. Hubig, Linnea Passing, Maximilian E. Schüle, Dimitri Vorona, A. Kemper, Thomas Neumann","doi":"10.1145/3132847.3133167","DOIUrl":null,"url":null,"abstract":"Nowadays we are drowning in data of various varieties. For all these mixed types and categories of data there exist even more different analysis approaches, often done in single hand-written solutions. We propose to extend HyPer, a main memory database system to a uniform data agent platform following the one system fits all approach for solving a wide variety of data analysis problems. We achieve this by applying a flexible operator concept to a set of various important data exploration algorithms. With that, HyPer solves analytical questions using clustering, classification, association rule mining and graph mining besides standard HTAP (Hybrid Transaction and Analytical Processing) workloads on the same database state. It enables to approach the full variety and volume of HTAP extended for data exploration (HTAPx), and only needs knowledge of already introduced SQL extensions that are automatically optimized by the database's standard optimizer. In this demo we will focus on the benefits and flexibility we create by using the SQL extensions for several well-known mining workloads. In our interactive webinterface for this project named HyPerInsight we demonstrate how HyPer outperforms the best open source competitor Apache Spark in common use cases in social media, geo-data, recommender systems and several other.","PeriodicalId":20449,"journal":{"name":"Proceedings of the 2017 ACM on Conference on Information and Knowledge Management","volume":"1 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2017-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2017 ACM on Conference on Information and Knowledge Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3132847.3133167","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 13
Abstract
Nowadays we are drowning in data of various varieties. For all these mixed types and categories of data there exist even more different analysis approaches, often done in single hand-written solutions. We propose to extend HyPer, a main memory database system to a uniform data agent platform following the one system fits all approach for solving a wide variety of data analysis problems. We achieve this by applying a flexible operator concept to a set of various important data exploration algorithms. With that, HyPer solves analytical questions using clustering, classification, association rule mining and graph mining besides standard HTAP (Hybrid Transaction and Analytical Processing) workloads on the same database state. It enables to approach the full variety and volume of HTAP extended for data exploration (HTAPx), and only needs knowledge of already introduced SQL extensions that are automatically optimized by the database's standard optimizer. In this demo we will focus on the benefits and flexibility we create by using the SQL extensions for several well-known mining workloads. In our interactive webinterface for this project named HyPerInsight we demonstrate how HyPer outperforms the best open source competitor Apache Spark in common use cases in social media, geo-data, recommender systems and several other.