Mosaics in Big Data: Stratosphere, Apache Flink, and Beyond. V. Markl. DOI: 10.1145/3210284.3214344

The global database research community has greatly impacted the functionality and performance of data storage and processing systems along the dimensions that define "big data", i.e., volume, velocity, variety, and veracity. Locally, over the past five years, we have also been working on several fronts. Among our contributions are: (1) establishing a vision for a database-inspired big data analytics system that unifies the best of database and distributed systems technologies and augments them with concepts drawn from compilers (e.g., iterations) and data stream processing, and (2) forming a community of researchers and institutions to create the Stratosphere platform that realizes this vision. One major result of these activities is Apache Flink, an open-source big data analytics platform, and its thriving global community of developers and production users. Although much progress has been made, when looking at the overall big data stack, a major challenge for the database research community remains: how to maintain ease of use despite the increasing heterogeneity and complexity of data analytics, which involves specialized engines for various aspects of an end-to-end analytics pipeline (among others, graph-based, linear-algebra-based, and relational algorithms) as well as increasingly heterogeneous hardware and computing infrastructure. At TU Berlin, DFKI, and the Berlin Big Data Center (BBDC), we aim to advance research in this field via the Mosaics project. Our goal is to remedy some of the heterogeneity challenges that hamper developer productivity and limit the use of data science technologies to the privileged few, who are coveted experts.
EVA. Jelena Pajic, J. Rivera, Kaiwen Zhang, Hans A. Jacobsen. DOI: 10.1145/3210284.3219776
The recent success of electric vehicles leads to unprecedentedly high peaks of demand on the electric grid at times when most people charge their cars. To avoid unreasonably rising costs due to inefficient utilization of the electricity infrastructure, we propose EVA: a scheduling system that addresses the valley-filling problem by efficiently distributing the electricity demand generated by electric vehicles in a geographically limited area over time spans in which the electric grid is underutilized. EVA is based on a smart contract running on the Ethereum blockchain in combination with off-chain computational nodes that perform the schedule calculation using the Alternating Direction Method of Multipliers (ADMM). This allows for a high degree of transparency and verifiability of the scheduling results while maintaining a reasonable level of efficiency. To interact with the scheduling system, we developed a decentralized app with a graphical frontend where the user can enter vehicle information and future energy requirements as well as review upcoming schedules. The schedule is computed daily, continuously providing participating users with schedules for the following day.
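To make the valley-filling objective concrete, here is a deliberately simplified Java sketch: it places each unit of a vehicle's charging demand into the time slot with the lowest current aggregate load, subject to a per-slot rate limit. This greedy water-filling heuristic stands in for the paper's actual approach, which solves the problem with distributed ADMM coordinated through an Ethereum smart contract; all numbers and names below are illustrative.

```java
import java.util.Arrays;

public class ValleyFillingSketch {

    // Schedules `energyUnits` units of charging for one vehicle by filling the
    // "valleys" of the aggregate load profile, at most `maxRatePerSlot` units per slot.
    static int[] schedule(double[] totalLoad, int energyUnits, double maxRatePerSlot) {
        int[] plan = new int[totalLoad.length];
        for (int u = 0; u < energyUnits; u++) {
            int best = -1;
            for (int t = 0; t < totalLoad.length; t++) {
                boolean slotHasCapacity = plan[t] + 1 <= maxRatePerSlot;
                if (slotHasCapacity && (best < 0 || totalLoad[t] < totalLoad[best])) {
                    best = t;
                }
            }
            if (best < 0) break;      // no feasible slot left
            plan[best]++;             // charge one unit in the emptiest slot
            totalLoad[best] += 1.0;   // update the aggregate profile for the next unit
        }
        return plan;
    }

    public static void main(String[] args) {
        // Base load over 8 night-time slots (arbitrary example values).
        double[] load = {3.0, 2.5, 1.0, 0.8, 0.7, 1.2, 2.0, 3.5};
        // One vehicle needs 6 units of energy, at most 2 units per slot.
        int[] plan = schedule(load, 6, 2.0);
        System.out.println("charging plan:  " + Arrays.toString(plan));
        System.out.println("resulting load: " + Arrays.toString(load));
    }
}
```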
{"title":"EVA","authors":"Jelena Pajic, J. Rivera, Kaiwen Zhang, Hans A. Jacobsen","doi":"10.1145/3210284.3219776","DOIUrl":"https://doi.org/10.1145/3210284.3219776","url":null,"abstract":"The recent success of electric vehicles leads to unprecedentedly high peaks of demand on the electric grid at the times when most people charge their cars. In order to avoid unreasonably rising costs due to inefficient utilization of the electricity infrastructure, we propose EVA: a scheduling system to solve the valley filling problem by distributing the electricity demand generated by electric vehicles in a geographically limited area efficiently over time spans in which the electric grid is underutilized. EVA is based on a smart contract running on the Ethereum blockchain in combination with off-chain computational nodes performing the schedule calculation using the Alternating Direction Method of Multipliers (ADMM). This allows for a high degree of transparency and verifiability in the scheduling computation results while maintaining a reasonable level of efficiency. In order to interact with the scheduling system, we developed a decentralized app with a graphical frontend, where the user can enter vehicle information and future energy requirements as well as review upcoming schedules. The calculation of the schedule is performed on a daily basis, continuously providing schedules for participating users for the following day.","PeriodicalId":412438,"journal":{"name":"Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems","volume":"395 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124320051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Low Latency, High Throughput Trade Surveillance System Using In-Memory Data Grid. Rishikesh Bansod, R. Virk, Mehul Raval. DOI: 10.1145/3210284.3219773

Trade surveillance is an important concern in modern trading engines, which must detect and prevent fraudulent trades as early as possible. In traditional trading platforms, the focus of developers has been on high-performance languages such as C and C++ and on FPGA-based systems to meet high-throughput and low-latency requirements. These systems have limited scalability and fault tolerance. With the arrival of in-memory technology, these requirements can be met with Java-based frameworks such as Apache Ignite, Flink, and Spark. In this paper, we propose a novel way of implementing a trade surveillance architecture using the Apache Ignite In-Memory Data Grid (IMDG). The paper discusses the engineering approach to tuning the system architecture on a single node for high throughput and low latency, and then scaling out to multiple nodes.
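As a rough illustration of the ingredients involved, the sketch below stores trade events in an Apache Ignite cache and runs an SQL query over them. The Trade class, its field names, and the simple volume-threshold rule are assumptions made for illustration, not the schema or surveillance rules used in the paper.

```java
import java.util.List;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.query.SqlFieldsQuery;
import org.apache.ignite.cache.query.annotations.QuerySqlField;
import org.apache.ignite.configuration.CacheConfiguration;

public class TradeSurveillanceSketch {

    // Minimal trade event; fields are annotated so Ignite can expose them to SQL.
    public static class Trade {
        @QuerySqlField(index = true) long accountId;
        @QuerySqlField long timestamp;
        @QuerySqlField double quantity;

        Trade(long accountId, long timestamp, double quantity) {
            this.accountId = accountId;
            this.timestamp = timestamp;
            this.quantity = quantity;
        }
    }

    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            // Cache configured so Trade fields can be queried with SQL.
            CacheConfiguration<Long, Trade> cfg = new CacheConfiguration<>("trades");
            cfg.setIndexedTypes(Long.class, Trade.class);
            IgniteCache<Long, Trade> trades = ignite.getOrCreateCache(cfg);

            // Ingest a few trade events (in the real system these arrive as a stream).
            trades.put(1L, new Trade(42, 1_000, 500.0));
            trades.put(2L, new Trade(42, 1_050, 700.0));
            trades.put(3L, new Trade(7, 1_100, 50.0));

            // Illustrative rule: flag accounts whose total traded quantity exceeds 1000.
            SqlFieldsQuery query = new SqlFieldsQuery(
                "SELECT accountId, SUM(quantity) FROM Trade GROUP BY accountId HAVING SUM(quantity) > 1000");
            for (List<?> row : trades.query(query).getAll()) {
                System.out.println("suspicious account " + row.get(0) + ", volume " + row.get(1));
            }
        }
    }
}
```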
{"title":"Low Latency, High Throughput Trade Surveillance System Using In-Memory Data Grid","authors":"Rishikesh Bansod, R. Virk, Mehul Raval","doi":"10.1145/3210284.3219773","DOIUrl":"https://doi.org/10.1145/3210284.3219773","url":null,"abstract":"Trade surveillance is an important concern in recent trading engines to detect and prevent fraudulent trades at earliest. In traditional trading platforms, to achieve high throughput and low latency requirements focus of developers has always been on high-performance languages such as C, C++ and FPGA based systems. These systems have limitations of scalability and fault-tolerance. With the arrival of in-memory technology these requirements can be met with Java-based frameworks like Ignite, Flink, Spark. In this paper, we propose a novel way of implementing trade surveillance architecture using Apache Ignite In-Memory Data Grid (IMDG). Paper discusses the engineering approach to tune system architecture on the single node in terms of achieving high throughput, low latency and then scaling out to multiple nodes.","PeriodicalId":412438,"journal":{"name":"Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126977050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Service Discovery for Hyperledger Fabric. Yacov Manevich, Artem Barger, Y. Tock. DOI: 10.1145/3210284.3219766

Hyperledger Fabric (HLF) is a modular and extensible permissioned blockchain platform released as open source and hosted by the Linux Foundation. The platform's design exhibits principles required by enterprise-grade business applications such as supply chains, financial transactions, asset management, food safety, and many more. To that end, HLF introduces several innovations, two of which are smart contracts in general-purpose languages (chaincode in HLF) and flexible endorsement policies, which govern whether a transaction is considered valid. Typical blockchain applications comprise two tiers: the first tier focuses on modelling the data schema and embedding business rules into the blockchain by means of smart contracts (chaincode) and endorsement policies; the second tier uses the SDK (Software Development Kit) provided by HLF to implement client-side application logic. However, there is a gap between the two tiers that hinders the rapid adoption of changes in the chaincode and endorsement policies within the client SDK. Currently, the chaincode location and endorsement policies are statically configured into the client SDK. This limits the reliability and availability of the client in the event of changes in the platform and makes the platform more difficult to use. In this work, we address and bridge this gap by describing the design and implementation of Service Discovery. Service Discovery provides APIs that allow dynamic discovery of the configuration required for the client SDK to interact with the platform, relieving the client of the burden of maintaining it. This enables the client to adapt rapidly to changes in the platform, significantly improving the reliability of the application layer. It also makes the HLF platform easier to consume, simplifying the creation of blockchain applications.
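The pattern the paper describes can be sketched as follows. The Java types below are hypothetical and are not the actual Hyperledger Fabric SDK interfaces; they only illustrate the shift from a statically configured endorser list to a client that asks the network at runtime which peers can satisfy the current endorsement policy. All endpoint and chaincode names are placeholders.

```java
import java.util.List;

public class DiscoverySketch {

    // A peer that can endorse transactions for some chaincode (hypothetical type).
    record Peer(String endpoint, String mspId) {}

    // What a discovery query returns: alternative layouts of peers, each of which
    // satisfies the current endorsement policy (hypothetical type).
    record EndorsementDescriptor(List<List<Peer>> layouts) {}

    // Hypothetical client-side view of a discovery API.
    interface DiscoveryService {
        EndorsementDescriptor endorsersFor(String channel, String chaincode);
    }

    static void submit(DiscoveryService discovery, byte[] proposal) {
        // Instead of a statically configured peer list, ask the network at runtime
        // and pick one valid layout of endorsing peers.
        EndorsementDescriptor descriptor = discovery.endorsersFor("mychannel", "asset-transfer");
        List<Peer> chosenLayout = descriptor.layouts().get(0);
        for (Peer peer : chosenLayout) {
            System.out.println("sending proposal to " + peer.endpoint());
            // ... send `proposal` for endorsement, then assemble and submit the transaction
        }
    }

    public static void main(String[] args) {
        // Toy in-memory "discovery service" standing in for the real network.
        DiscoveryService fake = (channel, chaincode) -> new EndorsementDescriptor(
            List.of(List.of(new Peer("peer0.org1.example.com:7051", "Org1MSP"),
                            new Peer("peer0.org2.example.com:9051", "Org2MSP"))));
        submit(fake, new byte[0]);
    }
}
```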