Orko Momin, Cengiz Karakoyunlu, Michael T. Runde, J. Chandy
The current file system and storage stack restricts the amount of information that flows between applications and storage, in either direction. This limits the ability of applications to tailor the storage system to their particular needs. In this paper, we investigate the programmability of the storage system stack and how to enable application-aware storage. Our focus is on object storage systems because of their amenability to these ideas. We introduce two main ideas: enabling active objects, which allow computation at the object storage system, and using higher-level object interfaces for intra-stack communication, which allows application-aware storage and storage-aware applications. We show preliminary results using a key-value interface to access object stores directly.
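The two ideas named in the abstract, a key-value interface to objects and computation that runs at the store, can be illustrated with a toy in-memory sketch. Everything here is an assumption made for illustration: the class `ActiveObjectStore`, its `register` and `execute` methods, and the `count_lines` function are hypothetical names, not the interface the paper implements.

```python
from typing import Callable, Dict


class ActiveObjectStore:
    """Toy in-memory object store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}
        self._methods: Dict[str, Callable[[bytes], bytes]] = {}

    # Plain key-value interface: objects are addressed by key.
    def put(self, key: str, value: bytes) -> None:
        self._objects[key] = value

    def get(self, key: str) -> bytes:
        return self._objects[key]

    # "Active object" part: the application ships a function to the store.
    def register(self, name: str, func: Callable[[bytes], bytes]) -> None:
        self._methods[name] = func

    def execute(self, key: str, method: str) -> bytes:
        # The function runs where the data lives; only the result is returned.
        return self._methods[method](self._objects[key])


def count_lines(data: bytes) -> bytes:
    """Example computation: count newline characters in the object."""
    return str(data.count(b"\n")).encode()


store = ActiveObjectStore()
store.put("log/2014-06-23", b"open\nread\nclose\n")
store.register("count_lines", count_lines)
result = store.execute("log/2014-06-23", "count_lines")
print(result)  # b'3'
```

In this sketch the caller receives a one-byte answer instead of the whole object, which is the kind of data-movement saving that motivates pushing computation to the storage side.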
Creating a programmable object storage stack. Orko Momin, Cengiz Karakoyunlu, Michael T. Runde, J. Chandy. PFSW '14, 2014-06-23. https://doi.org/10.1145/2603941.2603942
From social networking websites to bank transactions, we interact with data-intensive applications every day. Such applications are typically hosted on an application server that interacts with a database server to manipulate persistent data. To make such applications efficient, developers face the daunting task of mastering the intricacies of both the programming system and the database management system. For instance, while many application features can be implemented either in the application or in the database, it is difficult for a programmer to decide where to place a given computation, as the right decision typically depends on the workload. Unfortunately, making the wrong choice often results in a drastic performance hit. In this talk, I will show how examining the programming system and the database management system together allows us to significantly improve the performance of data-intensive applications. To illustrate such cross-system optimization opportunities, I have designed, built, and evaluated three systems: Query By Synthesis, a tool that converts functionality written as imperative code into relational queries; Sloth, a system that combines queries embedded in applications into batches; and Pyxis, a system that seamlessly moves computation between application and database servers. Using real-world examples, I will show that these systems allow orders-of-magnitude performance improvements and graceful adaptation to changing server environments while preserving the high-level programming interface for the developer. I will furthermore highlight research opportunities and challenges in applying similar techniques to other system problems.
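The placement question raised above can be made concrete with a small, hand-written example using Python's standard `sqlite3` module. It is not output from Query By Synthesis, Sloth, or Pyxis; the schema and the function names `totals_imperative` and `totals_relational` are assumptions chosen for illustration. The first function keeps the computation in the application and issues one query per user; the second expresses the same result as a single relational query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
""")


def totals_imperative(conn: sqlite3.Connection) -> dict:
    """Application-side version: one query per user, summation in Python."""
    totals = {}
    for user_id, name in conn.execute("SELECT id, name FROM users").fetchall():
        rows = conn.execute(
            "SELECT total FROM orders WHERE user_id = ?", (user_id,)
        ).fetchall()
        totals[name] = sum(total for (total,) in rows)
    return totals


def totals_relational(conn: sqlite3.Connection) -> dict:
    """Database-side version: join and aggregate in a single query."""
    rows = conn.execute(
        "SELECT u.name, COALESCE(SUM(o.total), 0) "
        "FROM users u LEFT JOIN orders o ON o.user_id = u.id "
        "GROUP BY u.id, u.name"
    ).fetchall()
    return dict(rows)


print(totals_imperative(conn))  # {'ann': 15.0, 'bob': 7.5}
print(totals_relational(conn) == totals_imperative(conn))  # True
```

With N users, the imperative version makes N + 1 round trips to the database, while the relational version makes one. Which is faster in practice depends on the workload and deployment, which is the decision the abstract describes as hard to make by hand.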
Rethinking the application-database interface. Alvin Cheung. PFSW '14, 2014-06-23. https://doi.org/10.1145/2603941.2603947
We present a reservation scheduler for object-based file systems. It supports storage virtualization for multi-tenant cloud environments with quality of service (QoS) guarantees. The reservation scheduler has been integrated into the XtreemFS cloud file system to maximize resource utilization under the given QoS demands. Simulation results obtained with a discrete-event simulator indicate that considerably fewer object stores need to be active while the requested service guarantees (capacity, throughput, IOPS, etc.) are still met.
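To make the scheduling goal concrete, the sketch below packs reservations onto as few object stores as possible while respecting two QoS dimensions. It uses a generic first-fit-decreasing heuristic, which is an assumption made for illustration and not necessarily the algorithm implemented in XtreemFS; the names `Reservation`, `ObjectStore`, and `schedule`, and all numbers, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Reservation:
    name: str
    capacity_gb: float
    throughput_mbs: float


@dataclass
class ObjectStore:
    capacity_gb: float
    throughput_mbs: float
    reservations: List[Reservation] = field(default_factory=list)

    def fits(self, r: Reservation) -> bool:
        """True if r can be added without exceeding either resource limit."""
        used_cap = sum(x.capacity_gb for x in self.reservations)
        used_tp = sum(x.throughput_mbs for x in self.reservations)
        return (used_cap + r.capacity_gb <= self.capacity_gb
                and used_tp + r.throughput_mbs <= self.throughput_mbs)


def schedule(reservations: List[Reservation],
             stores: List[ObjectStore]) -> List[ObjectStore]:
    """Place each reservation on the first store that fits (largest first).

    Returns the stores that hold at least one reservation, i.e. the ones
    that must stay active.
    """
    for r in sorted(reservations, key=lambda r: r.capacity_gb, reverse=True):
        target = next((s for s in stores if s.fits(r)), None)
        if target is None:
            raise RuntimeError(f"no object store can satisfy reservation {r.name}")
        target.reservations.append(r)
    return [s for s in stores if s.reservations]


stores = [ObjectStore(100.0, 100.0) for _ in range(4)]
requests = [
    Reservation("a", 60.0, 20.0),
    Reservation("b", 40.0, 30.0),
    Reservation("c", 50.0, 40.0),
    Reservation("d", 30.0, 10.0),
]
active = schedule(requests, stores)
print(len(active))  # 2
```

Here four reservations fit on two of the four stores, so the other two could remain idle. A production scheduler would also need to handle IOPS, reservation release, and failures, none of which this sketch attempts.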
QoS-aware storage virtualization for cloud file systems. Christoph Kleineweber, A. Reinefeld, T. Schütt. PFSW '14, 2014-06-23. https://doi.org/10.1145/2603941.2603944
Hao Xu, J. Ward, Mike C. Conway, A. Rajasekar, Reagan Moore
In the practice of data management, different user communities often use different forms of data and metadata, support different operations, and implement different policies. For example, forms of data include blocks, tuples, streams, and time series, while forms of metadata include different vocabularies, schemas, and namespaces. In addition, the forms of data and metadata often change over time. The diverse and emergent nature of these requirements poses a challenge to data management systems that traditional file systems with fixed functionality are inadequate to address. Extensible file systems can be built via policy-based data management. We describe the practical and theoretical aspects of policy-based data management and provide a wide range of examples of prototypical and production applications of the integrated Rule-Oriented Data System (iRODS) in data grids that apply this type of file system.
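The core mechanism of policy-based data management, rules that fire at well-defined points in an object's life cycle, can be sketched in a few lines. This is a generic event-condition-action engine written for illustration; it does not use the iRODS rule language or API, and the names `Rule`, `PolicyEngine`, the `post_put` event, and the example path are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Metadata = Dict[str, str]


@dataclass
class Rule:
    event: str                              # enforcement point, e.g. "post_put"
    condition: Callable[[Metadata], bool]   # when the policy applies
    action: Callable[[Metadata], None]      # what the policy does


class PolicyEngine:
    """Minimal event-condition-action engine (hypothetical)."""

    def __init__(self) -> None:
        self._rules: List[Rule] = []

    def add_rule(self, rule: Rule) -> None:
        self._rules.append(rule)

    def fire(self, event: str, obj: Metadata) -> int:
        """Run every matching rule on obj; return how many rules applied."""
        applied = 0
        for rule in self._rules:
            if rule.event == event and rule.condition(obj):
                rule.action(obj)
                applied += 1
        return applied


engine = PolicyEngine()
# Community-specific policy: tag CSV uploads with a metadata attribute.
engine.add_rule(Rule(
    event="post_put",
    condition=lambda obj: obj["path"].endswith(".csv"),
    action=lambda obj: obj.update(format="tabular"),
))
# Site-wide policy: every upload is marked for two replicas.
engine.add_rule(Rule(
    event="post_put",
    condition=lambda obj: True,
    action=lambda obj: obj.update(replicas="2"),
))

obj = {"path": "/zone/home/alice/data.csv"}
engine.fire("post_put", obj)
print(obj)  # {'path': '/zone/home/alice/data.csv', 'format': 'tabular', 'replicas': '2'}
```

Because policies are data rather than code compiled into the file system, a community can add or change rules as its data and metadata forms evolve, which is the extensibility argument made in the abstract.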
Building an extensible file system via policy-based data management. Hao Xu, J. Ward, Mike C. Conway, A. Rajasekar, Reagan Moore. PFSW '14, 2014-06-23. https://doi.org/10.1145/2603941.2603943