2019 IEEE 35th International Conference on Data Engineering Workshops (ICDEW): Latest Papers

Distribution-Driven, Embedded Synthetic Data Generation System and Tool for RDBMS
Pub Date : 2019-04-08 DOI: 10.1109/ICDEW.2019.00-25
Joseph W. Hu, Ivan T. Bowman, A. Nica, Anil K. Goel
Many self-managing relational database management systems (RDBMS) need to programmatically generate synthetic data to train machine learning models. This paper proposes the concept of a shadow database and a framework for deriving a shadow database from a production database so that it matches the distribution properties of the source data. Moreover, we have designed and implemented an embedded synthetic data generation tool that takes a data distribution profile as input and generates a shadow database according to histograms of the source data. The distribution profile is passed into the tool either through an export-import mechanism or as a JSON string. The shadow database can be scaled to be larger or smaller than the original database and can serve as a testbed for training learning models. Unlike most other data generation tools, our tool is implemented as SQL procedures that can be embedded in the underlying RDBMS.
Citations: 0
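The histogram-driven generation described in the abstract above can be illustrated with a minimal Python sketch. The paper implements its tool as SQL procedures; this is not that implementation. The JSON profile layout (`row_count`, `columns`, `buckets`, `low`, `high`, `count`), the function names, and the assumption that values are uniformly distributed inside each bucket are all hypothetical choices made for illustration.

```python
import json
import random

# Hypothetical distribution profile: one histogram per column.
PROFILE_JSON = """
{"table": "orders", "row_count": 1000,
 "columns": {"amount": {"buckets": [
     {"low": 0, "high": 10, "count": 700},
     {"low": 10, "high": 100, "count": 300}]}}}
"""

def generate_column(buckets, n_rows, rng):
    """Draw n_rows values whose bucket frequencies follow the histogram."""
    weights = [b["count"] for b in buckets]
    chosen = rng.choices(buckets, weights=weights, k=n_rows)
    # Assumption: values are uniform inside a bucket.
    return [rng.uniform(b["low"], b["high"]) for b in chosen]

def generate_shadow_table(profile, scale=1.0, seed=0):
    """Build a scaled synthetic table (column name -> list of values)."""
    rng = random.Random(seed)
    n_rows = round(profile["row_count"] * scale)
    return {name: generate_column(col["buckets"], n_rows, rng)
            for name, col in profile["columns"].items()}

profile = json.loads(PROFILE_JSON)
# scale=2.0 yields a shadow table twice the size of the profiled source.
shadow = generate_shadow_table(profile, scale=2.0)
```

Because only the profile is needed, the generator never reads production rows, which is the property that lets a shadow database be larger or smaller than its source.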
Blockchain Enabled Distributed Data Management - A Vision
Pub Date : 2019-04-08 DOI: 10.1109/ICDEW.2019.00-39
Furqan Baig, Fusheng Wang
Blockchain has gained much attention in recent academic and research work, not only in cryptocurrency but also in many other fields such as supply chain, health, and storage. The application of blockchain in the data management domain, however, is mostly geared towards security and immutability. In this paper we propose integrating blockchain with distributed data management and study some open challenges and assumptions in doing so. We claim that, from a data management perspective, blockchain's ability to handle unequal participants is more important than security and immutability. Finally, we propose possible ideas for integrating blockchain into distributed data transaction and management workflows to design a globally consistent data store that provides availability guarantees along with support for unifying heterogeneous data backends.
Citations: 7
Predicting Online User Purchase Behavior Based on Browsing History
Pub Date : 2019-04-08 DOI: 10.1109/ICDEW.2019.00-13
Yunghui Chu, Hui-Kuo Yang, Wen-Chih Peng
Recently, people tend to purchase through websites. This change allows e-commerce sites to collect user behavior data from web logs. E-commerce marketing teams usually use such data to design subsequent promotional campaigns that drive more traffic and convert visitors into paying customers. In this paper we consider a special kind of e-commerce company that sells products with similar properties, usually at a high price. For such companies, recommendation becomes less important than predicting which items (if any) will be bought. We want to discover potential buyers and deliver ads or even coupons to them, expecting them to become real buyers. We model buying behavior from click records, with patterns extracted using a feature engineering approach. Our solution models two kinds of browsing behavior, namely hesitant and impulsive. In the model, we define interaction features from click-streams that reveal users' purchase intention, such as how long a user stays on a product page, and then build a model that predicts users' preferences. Experimental results on a real dataset from an e-commerce company demonstrate the effectiveness of the proposed method. The approaches in our work can be used to model user purchasing intent and can be applied to e-commerce sites that sell high-end products.
Citations: 4
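The click-stream interaction features mentioned in the abstract above (how often a user views a product page and how long the user stays on it) can be sketched as follows. The log layout, the function name `interaction_features`, and the rule that dwell time equals the gap to the user's next click are assumptions for illustration; the paper's actual feature set is not given in the abstract.

```python
from collections import defaultdict

# Hypothetical click log: (user_id, timestamp_seconds, product_id).
clicks = [
    ("u1", 0, "p1"),
    ("u1", 40, "p2"),
    ("u1", 100, "p1"),
    ("u1", 130, "p3"),
    ("u2", 10, "p1"),
    ("u2", 15, "p2"),
]

def interaction_features(clicks):
    """Return {(user, product): {"views": int, "dwell": seconds}}."""
    by_user = defaultdict(list)
    for user, ts, product in clicks:
        by_user[user].append((ts, product))

    features = defaultdict(lambda: {"views": 0, "dwell": 0})
    for user, events in by_user.items():
        events.sort()  # order each user's clicks by timestamp
        for i, (ts, product) in enumerate(events):
            f = features[(user, product)]
            f["views"] += 1
            # Assumption: dwell time is the gap to the user's next click.
            # The last click has no successor, so it adds no dwell time.
            if i + 1 < len(events):
                f["dwell"] += events[i + 1][0] - ts
    return dict(features)

features = interaction_features(clicks)
```

Under this reading, repeated views with long dwell times might indicate hesitant browsing, while a single short visit before purchase might indicate impulsive browsing; the thresholds separating the two would have to come from data.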
AutoCache: Employing Machine Learning to Automate Caching in Distributed File Systems
Pub Date : 2019-04-08 DOI: 10.1109/ICDEW.2019.00-21
H. Herodotou
The use of computational platforms such as Hadoop and Spark is growing rapidly as a successful paradigm for processing large-scale data residing in distributed file systems like HDFS. Increasing memory sizes have recently led to the introduction of caching and in-memory file systems. However, these systems lack any automated caching mechanisms for storing data in memory. This paper presents AutoCache, a caching framework that automates the decisions for when and which files to store in, or remove from, the cache for increasing system performance. The decisions are based on machine learning models that track and predict file access patterns from evolving data processing workloads. Our evaluation using real-world workload traces from a Facebook production cluster compares our approach with several other policies and showcases significant benefits in terms of both workload performance and cluster efficiency.
Citations: 12
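The decision loop described in the AutoCache abstract above (score files by expected future access, then decide which to keep in memory) can be sketched in Python. The abstract does not specify the learning model, so this sketch swaps in a simple stand-in predictor, an exponentially decayed access count, where AutoCache uses trained machine learning models. The function names, the `half_life` parameter, and the greedy capacity rule are all hypothetical.

```python
import math

def access_score(access_times, now, half_life=3600.0):
    """Stand-in predictor: exponentially decayed access count (not a trained model)."""
    decay = math.log(2) / half_life
    return sum(math.exp(-decay * (now - t)) for t in access_times)

def plan_cache(files, capacity, now):
    """Pick files to keep in memory, highest score first, within capacity.

    files maps a path to (size_bytes, [access timestamps]).
    """
    ranked = sorted(files, key=lambda p: access_score(files[p][1], now), reverse=True)
    cached, used = [], 0
    for path in ranked:
        size = files[path][0]
        if used + size <= capacity:
            cached.append(path)
            used += size
    return cached

files = {
    "/logs/a": (60, [100, 3500, 3590]),  # accessed often and recently
    "/logs/b": (50, [10]),               # accessed once, long ago
    "/logs/c": (40, [3000, 3400]),       # accessed moderately recently
}
cached = plan_cache(files, capacity=100, now=3600)
```

Files that are not in the returned plan would be candidates for eviction; replacing `access_score` with a learned model's predicted access probability keeps the rest of the loop unchanged.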
Towards Auto-Scaling Existing Transactional Databases with Strong Consistency
Pub Date : 2019-04-08 DOI: 10.1109/ICDEW.2019.00-26
M. Georgiou, Aristodemos Paphitis, Michael Sirivianos, H. Herodotou
Existing relational database systems often suffer from rapid increases or significant variability of transactional workloads but lack support for scalability or elasticity. Database replication has been employed to scale workload performance but past approaches make various performance versus consistency tradeoffs and typically lack the mechanisms and policies for dynamically adding and removing replicas. This paper presents Hihooi, a replication-based middleware system that is able to achieve scalability, strong consistency, and elasticity for existing transactional databases. These features are enabled by (i) a novel replication algorithm for propagating database modifications asynchronously and consistently to all replicas at high speeds, and (ii) a new routing algorithm for directing incoming transactions to consistent replicas. Our experimental evaluation validates the high scalability and elasticity benefits offered by Hihooi, which form the key ingredients towards a truly auto-scaling system.
Citations: 4
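The idea in the abstract above of directing incoming transactions to consistent replicas while modifications propagate asynchronously can be illustrated with a toy router. This is not Hihooi's routing algorithm, which the abstract does not detail. The `Router` class, the per-table write sequence numbers, and the fallback to a node named `primary` are hypothetical simplifications.

```python
class Router:
    """Toy router: send a read to replicas that have applied every write to its tables."""

    def __init__(self, replicas):
        self.applied = {r: 0 for r in replicas}  # last write sequence applied per replica
        self.last_write = {}                     # table -> sequence of its latest write
        self.seq = 0

    def record_write(self, tables):
        """The primary commits a write touching the given tables."""
        self.seq += 1
        for t in tables:
            self.last_write[t] = self.seq
        return self.seq

    def replica_applied(self, replica, seq):
        """A replica reports that it has applied all writes up to seq."""
        self.applied[replica] = max(self.applied[replica], seq)

    def route_read(self, tables):
        """Return the consistent replicas for a read, or ['primary'] if none is caught up."""
        needed = max((self.last_write.get(t, 0) for t in tables), default=0)
        ok = [r for r, seq in self.applied.items() if seq >= needed]
        return ok or ["primary"]

router = Router(["r1", "r2"])
router.record_write(["orders"])   # sequence 1
router.record_write(["users"])    # sequence 2
router.replica_applied("r1", 1)   # r1 has caught up with the orders write only
```

Here a read on `orders` can go to `r1`, a read on `users` must fall back to the primary, and a read on an unmodified table can go to any replica, so stale replicas still serve the work they are consistent for.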
Towards Distributed Multi-model Learning on Apache Spark for Model-Based Recommender
Pub Date : 2019-04-08 DOI: 10.1109/ICDEW.2019.00-12
Anas Alzogbi, Polina Koleva, G. Lausen
Model-based approaches to Content-based Filtering (CBF) recommendation have the potential to generate representative user models owing to their ability to learn from users' actions. However, the need to train an individual model for each user leads to a scalability issue and a high computational cost, which contributes to the limited adoption of model-based approaches as efficient CBF recommenders. This is particularly relevant for production systems where the recommender is expected to serve a large number of users. In this work, we address the efficiency issue of model-based CBF recommender systems and present a new approach for distributed multi-model learning based on Apache Spark. We use Ranking SVM as the underlying recommendation algorithm and present a distributed implementation that allows efficient training of multiple models in parallel on a collection of machines. We demonstrate the efficiency of our approach on a real-world dataset from CiteULike and show that it can reduce the cost of multi-model learning without affecting prediction accuracy.
Citations: 2
Towards self-managing cloud-scale computing platforms: Experiences and challenges
Pub Date : 2019-04-08 DOI: 10.1109/ICDEW.2019.00-24
Jingren Zhou
Summary form only given, as follows. The complete presentation was not made available for publication as part of the conference proceedings. More and more companies heavily rely on massive data analysis of many kinds to understand data insights and drive business decisions. To support this ever-increasing need, big data computing platforms have grown to an unprecedented scale, way beyond human manageability. In this talk, I'll share our experiences at Alibaba to enable our big data platforms to configure, optimize, monitor, and protect themselves automatically, including automatic version testing and deployment control, system health monitoring and alert, automatic physical design/data placement/storage optimization, etc. I'll also outline some outstanding research and engineering challenges.
Citations: 0
Recommendation of Indian Cuisine Recipes Based on Ingredients
Pub Date : 2019-04-08 DOI: 10.1109/ICDEW.2019.00-28
Nilesh Nilesh, Madhu Kumari, Pritom Hazarika, Vishal Raman
Many varieties of Indian cuisine can be prepared from the same ingredients. In India, traditional cuisines come in wide varieties owing to locally available spices, herbs, vegetables, and fruits. In this paper, we propose a method that recommends Indian cuisine recipes on the basis of available ingredients and liked cuisine. For this work, we used web scraping to build a collection of recipe varieties and then applied a content-based machine learning approach to recommend recipes. The resulting system recommends Indian cuisines based on ingredients.
Citations: 11
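Ingredient-based, content-based recommendation of the kind described in the abstract above can be sketched with a set-overlap score. The abstract does not name its similarity measure, so Jaccard similarity is used here as a stand-in; the three-recipe catalogue and the function names are hypothetical and do not come from the paper's scraped collection.

```python
def jaccard(a, b):
    """Jaccard similarity between two ingredient sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical mini-catalogue; the paper's scraped recipes are not reproduced here.
recipes = {
    "aloo gobi": {"potato", "cauliflower", "turmeric", "cumin"},
    "jeera aloo": {"potato", "cumin", "coriander"},
    "palak paneer": {"spinach", "paneer", "garlic", "cumin"},
}

def recommend(available, recipes, top_k=2):
    """Rank recipes by ingredient overlap with what the user has on hand."""
    scored = [(jaccard(available, ing), name) for name, ing in recipes.items()]
    # Highest similarity first; ties broken alphabetically for stable output.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [name for score, name in scored[:top_k] if score > 0]

suggestions = recommend({"potato", "cumin", "turmeric"}, recipes)
```

With the sample pantry, aloo gobi scores 3/4 and jeera aloo 2/4, so they are returned in that order; a liked-cuisine signal could be added as a second term in the score.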
A Group Recommendation Approach Based on Neural Network Collaborative Filtering
Pub Date : 2019-04-08 DOI: 10.1109/ICDEW.2019.00-18
J. Du, Lin Li, Peng Gu, Qing Xie
At present, the most popular recommendation algorithms belong to the class of latent factor models (LFM). Compared with traditional user-based and item-based collaborative filtering methods, the latent factor model effectively improves recommendation accuracy. In recent years, deep neural networks have succeeded in many research fields, such as computer vision, speech recognition, and natural language processing. However, few studies combine recommendation systems and deep neural networks, especially for group recommendation. Some academic studies have adopted deep learning methods, but they mainly use them to process auxiliary information, such as acoustic features of sounds and semantic analysis of texts; the inner product is still used to model latent features of users and items. In this paper, we first obtain the nonlinear interaction of latent feature vectors between users and items through a multi-layer perceptron (MLP), and use the combination of LFM and MLP to achieve collaborative filtering recommendation between users and items. Second, based on individuals' recommendation scores, we design a fusion strategy based on Nash equilibrium to ensure the average satisfaction of the group's users. Our experiments are conducted on Track 1 of the KDD CUP 2012 public data set, taking the root mean square error (RMSE) as the evaluation index. The experiments compare the traditional LFM optimization model, the MLP model, and the LFM-MLP hybrid model for individual recommendation, and compare the strategy proposed in this paper with three traditional single-group strategies: most pleasure, average, and least misery. The experimental results show that the proposed method can effectively improve the accuracy of group recommendation.
Citations: 4
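The three baseline group strategies named in the abstract above (most pleasure, average, least misery) and the RMSE evaluation index have standard definitions, sketched below. The paper's Nash-equilibrium fusion strategy is not reproduced because the abstract gives no detail on it. The function names and the sample scores are illustrative assumptions.

```python
import math

def aggregate(scores, strategy):
    """Combine individual predicted scores for one item into a group score."""
    if strategy == "most_pleasure":
        return max(scores)               # follow the happiest member
    if strategy == "least_misery":
        return min(scores)               # protect the least satisfied member
    if strategy == "average":
        return sum(scores) / len(scores)
    raise ValueError(f"unknown strategy: {strategy}")

def rmse(predicted, actual):
    """Root mean square error between predicted and observed ratings."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical predicted scores of three group members for one item.
group_scores = [4.0, 2.0, 3.0]
```

For the sample scores the strategies give 4.0, 3.0, and 2.0 respectively, which shows why the choice matters: a single dissatisfied member dominates under least misery but is ignored under most pleasure.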
Technical Mechanics of a Trans-Border Waste Flow Tracking Solution Based on Blockchain Technology
Pub Date : 2019-04-08 DOI: 10.1109/ICDEW.2019.00-38
Dominik Schmelz, K. Pinter, S. Strobl, Lei Zhu, Phillip Niemeier, T. Grechenig
When it comes to waste management and waste tracking, the main objectives are to prevent and reuse as much waste as possible. In the last decade, there has been a strong shift from the disposal of waste to its recycling. Within the EU, member states are required to establish waste prevention programs through the "Waste Framework EU Directive", which attempts to turn the EU into a recycling society. Similar endeavors also exist in other countries and regions. With globalization, the disposal of waste became a lucrative business case, often without transparency as to whether the waste has been disposed of properly and according to the regulations or directives established by each country. Waste flow management is the basis of sustainable waste prevention and recycling. Short-term solutions and financial profit accelerated malpractice in waste disposal and recycling. The currently used systems cannot track waste across borders in a way that is transparent yet data-protected and tamper-proof. Therefore, solutions to address these concerns have to be found and implemented. Blockchain enables new approaches, especially in ecosystems where every participating stakeholder is distrusted, which is the case in waste flow tracking. We introduce and discuss a novel solution for waste tracking, on the basis of blockchain technology and smart contracts, that fulfills the aforementioned requirements.
Citations: 11
Journal
2019 IEEE 35th International Conference on Data Engineering Workshops (ICDEW)