Modular data storage with Anvil

Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles. Pub Date: 2009-10-11. DOI: 10.1145/1629575.1629590

Mike Mammarella, Shant Hovsepian, E. Kohler

Pages: 147-160

Citations: 35

Abstract



Modular data storage with Anvil

Databases have achieved orders-of-magnitude performance improvements by changing the layout of stored data -- for instance, by arranging data in columns or compressing it before storage. These improvements have been implemented in monolithic new engines, however, making it difficult to experiment with feature combinations or extensions. We present Anvil, a modular and extensible toolkit for building database back ends. Anvil's storage modules, called dTables, have much finer granularity than prior work. For example, some dTables specialize in writing data, while others provide optimized read-only formats. This specialization makes both kinds of dTable simple to write and understand. Unifying dTables implement more comprehensive functionality by layering over other dTables -- for instance, building a read/write store from read-only tables and a writable journal, or building a general-purpose store from optimized special-purpose stores. The dTable design leads to a flexible system powerful enough to implement many database storage layouts. Our prototype implementation of Anvil performs up to 5.5 times faster than an existing B-tree-based database back end on conventional workloads, and can easily be customized for further gains on specific data and workloads.
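The layering idea in the abstract, a read/write store built from a read-only table plus a writable journal, can be illustrated with a small sketch. This is not Anvil's code or API: the class names (`ReadOnlyTable`, `JournalTable`, `LayeredTable`) and the `compact` method are hypothetical, the tables live in memory rather than on disk, and the merge policy is an assumption chosen for brevity.

```python
from bisect import bisect_left
from typing import Dict, List, Optional, Tuple

# Hypothetical sentinel marking a key as deleted in the journal.
_TOMBSTONE = object()


class ReadOnlyTable:
    """Immutable sorted table; lookups use binary search."""

    def __init__(self, items: Dict[str, bytes]) -> None:
        pairs = sorted(items.items())
        self._keys: List[str] = [k for k, _ in pairs]
        self._values: List[bytes] = [v for _, v in pairs]

    def get(self, key: str) -> Optional[bytes]:
        i = bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return None

    def items(self) -> List[Tuple[str, bytes]]:
        return list(zip(self._keys, self._values))


class JournalTable:
    """Write-specialized table: keeps only the latest change per key."""

    def __init__(self) -> None:
        self._changes: Dict[str, object] = {}

    def put(self, key: str, value: bytes) -> None:
        self._changes[key] = value

    def remove(self, key: str) -> None:
        self._changes[key] = _TOMBSTONE

    def lookup(self, key: str) -> Tuple[bool, Optional[bytes]]:
        """Return (found, value); a found value of None means 'deleted'."""
        if key not in self._changes:
            return (False, None)
        value = self._changes[key]
        return (True, None if value is _TOMBSTONE else value)  # type: ignore

    def changes(self) -> Dict[str, object]:
        return dict(self._changes)


class LayeredTable:
    """Unifying table: a read/write store layered over the two tables above."""

    def __init__(self, base: ReadOnlyTable) -> None:
        self._base = base
        self._journal = JournalTable()

    def put(self, key: str, value: bytes) -> None:
        self._journal.put(key, value)

    def remove(self, key: str) -> None:
        self._journal.remove(key)

    def get(self, key: str) -> Optional[bytes]:
        # The journal holds newer data, so it shadows the read-only base.
        found, value = self._journal.lookup(key)
        if found:
            return value
        return self._base.get(key)

    def compact(self) -> None:
        """Fold journal changes into a fresh read-only table."""
        merged = dict(self._base.items())
        for key, value in self._journal.changes().items():
            if value is _TOMBSTONE:
                merged.pop(key, None)
            else:
                merged[key] = value  # type: ignore
        self._base = ReadOnlyTable(merged)
        self._journal = JournalTable()


table = LayeredTable(ReadOnlyTable({"a": b"1", "b": b"2"}))
table.put("c", b"3")
table.remove("a")
before = (table.get("a"), table.get("b"), table.get("c"))
table.compact()
after = (table.get("a"), table.get("b"), table.get("c"))
```

In this sketch neither underlying table needs to handle both reads and writes well, which mirrors the abstract's point that specialization keeps each kind of table simple; the unifying layer supplies the combined behavior.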

