
Latest Articles from Proceedings of the VLDB Endowment

Breathing New Life into an Old Tree: Resolving Logging Dilemma of B+-tree on Modern Computational Storage Drives
IF 2.5 | CAS Tier 3 (Computer Science) | Q1 Computer Science | Pub Date: 2023-10-01 | DOI: 10.14778/3626292.3626297
Kecheng Huang, Zhaoyan Shen, Zili Shao, Tong Zhang, Feng Chen
Having dominated databases and various data management systems for decades, the B+-tree is infamously subject to a logging dilemma: one can improve B+-tree speed by equipping it with a larger log, which nevertheless degrades its crash recovery speed. This logging dilemma is particularly prominent under modern workloads that involve intensive small writes. In this paper, we propose a novel solution, called per-page logging based B+-tree, which leverages the emerging computational storage drive (CSD) with built-in transparent compression to fundamentally resolve the logging dilemma. Our key idea is to divide the large single log into many small (e.g., 4KB), highly compressible per-page logs, each statically bound to a B+-tree page. Together, all per-page logs form a very large over-provisioned log space that improves the B+-tree's operational speed. Meanwhile, during crash recovery, the B+-tree does not need to scan any per-page logs, leading to a recovery latency independent of the total log size. We have developed and open-sourced a fully functional prototype. Our evaluation results show that, under small-write-intensive workloads, our design can improve B+-tree operational throughput by up to 625.6% and maintain a crash recovery time as low as 19.2 ms, while incurring a minimal storage overhead of only 0.5-1.6%.
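The per-page idea can be sketched in a few lines. The following is an illustrative in-memory model only, with a made-up PerPageLoggedStore class and log capacity, not the authors' open-sourced prototype: each update appends to a small log bound to one page, and log entries are merged into the page only when that log fills.

```python
# Illustrative sketch of per-page logging (assumed semantics, not the paper's
# actual on-CSD implementation).

PAGE_LOG_CAPACITY = 4  # stands in for one small (e.g., 4KB) log block per page


class PerPageLoggedStore:
    def __init__(self):
        self.pages = {}      # page_id -> dict of key -> value (the "page")
        self.page_logs = {}  # page_id -> list of pending (key, value) entries

    def put(self, page_id, key, value):
        log = self.page_logs.setdefault(page_id, [])
        log.append((key, value))           # a small log append, not a page rewrite
        if len(log) >= PAGE_LOG_CAPACITY:  # log full: merge entries into the page
            page = self.pages.setdefault(page_id, {})
            for k, v in log:
                page[k] = v
            log.clear()

    def get(self, page_id, key):
        # The most recent value may still live in the per-page log.
        for k, v in reversed(self.page_logs.get(page_id, [])):
            if k == key:
                return v
        return self.pages.get(page_id, {}).get(key)
```

Because a crashed instance can rebuild any page from that page's own bounded log alone, recovery work does not grow with the total log space, which is the property the paper exploits.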
{"title":"Breathing New Life into an Old Tree: Resolving Logging Dilemma of B + -tree on Modern Computational Storage Drives","authors":"Kecheng Huang, Zhaoyan Shen, Zili Shao, Tong Zhang, Feng Chen","doi":"10.14778/3626292.3626297","DOIUrl":"https://doi.org/10.14778/3626292.3626297","url":null,"abstract":"Having dominated databases and various data management systems for decades, B + -tree is infamously subject to a logging dilemma: One could improve B + -tree speed performance by equipping it with a larger log, which nevertheless will degrade its crash recovery speed. Such a logging dilemma is particularly prominent in the presence of modern workloads that involve intensive small writes. In this paper, we propose a novel solution, called per-page logging based B + -tree, which leverages the emerging computational storage drive (CSD) with built-in transparent compression to fundamentally resolve the logging dilemma. Our key idea is to divide the large single log into many small (e.g., 4KB), highly compressible per-page logs , each being statically bounded with a B + -tree page. All per-page logs together form a very large over-provisioned log space for B + -tree to improve its operational speed performance. Meanwhile, during crash recovery, B + -tree does not need to scan any per-page logs, leading to a recovery latency independent from the total log size. We have developed and open-sourced a fully functional prototype. 
Our evaluation results show that, under small-write intensive workloads, our design solution can improve B + -tree operational throughput by up to 625.6% and maintain a crash recovery time of as low as 19.2 ms, while incurring a minimal storage overhead of only 0.5-1.6%.","PeriodicalId":54220,"journal":{"name":"Proceedings of the Vldb Endowment","volume":null,"pages":null},"PeriodicalIF":2.5,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139330821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
On the Cusp: Computing Thrills and Perils and Professional Awakening
CAS Tier 3 (Computer Science) | Q1 Computer Science | Pub Date: 2023-08-01 | DOI: 10.14778/3611540.3611640
Natasa Milic-Frayling
Over the past eight decades, computer science has advanced as a field, and the computing profession has matured by establishing professional codes of conduct, fostering best practices, and establishing industry standards to support the proliferation of technologies and services. Research and applications of digital computation continue to change all aspects of human endeavor through new waves of innovation. While it is clear that different research advances fuel innovation, the ways they come together to make an impact vary. In contrast to highly regulated sectors such as pharma, medicine, and law, the process of transforming research into widely deployed technologies is not regulated. We reflect on collective practices, from discovery by scientists and engineers to market delivery by entrepreneurs, industry leaders, and practitioners. We consider ecosystem changes that are required to sustain the transformational effects of new technologies and enable new practices to take root. Every such transformation ruptures the existing socio-technical fabric and requires a concerted effort to remedy this through effective policies and regulations. Computing experts are involved in all phases and must match the transformational power of their innovation with the highest standard of professional conduct. We highlight the principles of responsible innovation and discuss three waves of digital innovation. We use widespread and uncontrolled generative AI deployments to illustrate risks of an implosion of digital media due to contamination of digital records, removal of human agency, and threats to an individual's personhood.
{"title":"On the Cusp: Computing Thrills and Perils and Professional Awakening","authors":"Natasa Milic-Frayling","doi":"10.14778/3611540.3611640","DOIUrl":"https://doi.org/10.14778/3611540.3611640","url":null,"abstract":"Over the past eight decades, computer science has advanced as a field, and the computing profession has matured by establishing professional codes of conduct, fostering best practices, and establishing industry standards to support the proliferation of technologies and services. Research and applications of digital computation continue to change all aspects of human endeavor through new waves of innovation. While it is clear that different research advances fuel innovation, the ways they come together to make an impact vary. In contrast to highly regulated sectors such as pharma, medicine and law, the process of transforming research into widely deployed technologies is not regulated. We reflect on collective practices, from discovery by scientists and engineers to market delivery by entrepreneurs, industry leaders, and practitioners. We consider ecosystem changes that are required to sustain the transformational effects of new technologies and enable new practices to take root. Every such transformation ruptures in the existing socio-technical fabric and requires a concerted effort to remedy this through effective policies and regulations. Computing experts are involved in all phases and must match the transformational power of their innovation with the highest standard of professional conduct. We highlight the principles of responsible innovation and discuss three waves of digital innovation. 
We use wide and uncontrolled generative AI deployments to illustrate risks from the implosion of digital media due to contamination of digital records, removal of human agency, and risk to an individual's personhood.","PeriodicalId":54220,"journal":{"name":"Proceedings of the Vldb Endowment","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134996880","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Lingua Manga: A Generic Large Language Model Centric System for Data Curation
CAS Tier 3 (Computer Science) | Q1 Computer Science | Pub Date: 2023-08-01 | DOI: 10.14778/3611540.3611624
Zui Chen, Lei Cao, Sam Madden
Data curation is a wide-ranging area which contains many critical but time-consuming data processing tasks. However, the diversity of such tasks makes it challenging to develop a general-purpose data curation system. To address this issue, we present Lingua Manga, a user-friendly and versatile system that utilizes pre-trained large language models. Lingua Manga offers automatic optimization for achieving high performance and label efficiency while facilitating flexible and rapid development. Through three example applications with distinct objectives and users of varying levels of technical proficiency, we demonstrate that Lingua Manga can effectively assist both skilled programmers and low-code or even no-code users in addressing data curation challenges.
{"title":"Lingua Manga : A Generic Large Language Model Centric System for Data Curation","authors":"Zui Chen, Lei Cao, Sam Madden","doi":"10.14778/3611540.3611624","DOIUrl":"https://doi.org/10.14778/3611540.3611624","url":null,"abstract":"Data curation is a wide-ranging area which contains many critical but time-consuming data processing tasks. However, the diversity of such tasks makes it challenging to develop a general-purpose data curation system. To address this issue, we present Lingua Manga, a user-friendly and versatile system that utilizes pre-trained large language models. Lingua Manga offers automatic optimization for achieving high performance and label efficiency while facilitating flexible and rapid development. Through three example applications with distinct objectives and users of varying levels of technical proficiency, we demonstrate that Lingua Manga can effectively assist both skilled programmers and low-code or even no-code users in addressing data curation challenges.","PeriodicalId":54220,"journal":{"name":"Proceedings of the Vldb Endowment","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134997925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
The Story of GraphLab - From Scaling Machine Learning to Shaping Graph Systems Research (VLDB 2023 Test-of-Time Award Talk)
CAS Tier 3 (Computer Science) | Q1 Computer Science | Pub Date: 2023-08-01 | DOI: 10.14778/3611540.3611637
Joseph E. Gonzalez, Yucheng Low
The GraphLab project spanned almost a decade and had profound academic and industrial impact on large-scale machine learning and graph processing systems. Numerous papers described the innovations in GraphLab, including the original vertex-centric [8] and edge-centric [3] programming abstractions, high-performance asynchronous execution engines [9], out-of-core graph computation [6], tabular graph systems [4], and even new statistical inference algorithms [2] enabled by the GraphLab project. This work became the basis of multiple PhD theses [1, 5, 7]. The GraphLab open-source project had broad academic and industrial adoption and ultimately led to the launch of Turi. In this talk, we tell the story of GraphLab, how it began and the key ideas behind it. We will focus on the approach to achieving scalable asynchronous systems in machine learning. During our talk, we will explore the impact that GraphLab has had on the development of graph processing systems, graph databases, and AI/ML; additionally, we will share our insights and opinions into where we see the future of these fields heading. In the process, we highlight some of the lessons we learned and provide guidance for future students.
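The vertex-centric ("think like a vertex") abstraction mentioned above can be illustrated with a tiny synchronous PageRank. This is a generic gather-apply sketch, not GraphLab's actual API; the function name and graph representation are assumptions for illustration.

```python
def vertex_centric_pagerank(out_edges, num_iters=20, d=0.85):
    """out_edges: dict vertex -> list of out-neighbors; every vertex must
    appear as a key, even if it has no out-edges."""
    n = len(out_edges)
    # Build the reverse adjacency so each vertex can "gather" from in-neighbors.
    in_edges = {v: [] for v in out_edges}
    for v, nbrs in out_edges.items():
        for u in nbrs:
            in_edges[u].append(v)
    rank = {v: 1.0 / n for v in out_edges}
    for _ in range(num_iters):
        # Synchronous schedule for simplicity; GraphLab's engines also support
        # asynchronous, dynamically scheduled execution of vertex programs.
        rank = {v: (1 - d) / n
                   + d * sum(rank[u] / len(out_edges[u]) for u in in_edges[v])
                for v in out_edges}
    return rank
```

The point of the abstraction is that the per-vertex update reads only a vertex's neighborhood, which is what lets the engine distribute and schedule the computation.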
{"title":"The Story of GraphLab - From Scaling Machine Learning to Shaping Graph Systems Research (VLDB 2023 Test-of-Time Award Talk)","authors":"Joseph E. Gonzalez, Yucheng Low","doi":"10.14778/3611540.3611637","DOIUrl":"https://doi.org/10.14778/3611540.3611637","url":null,"abstract":"The GraphLab project spanned almost a decade and had profound academic and industrial impact on large-scale machine learning and graph processing systems. There were numerous papers written describing the innovations in GraphLab including the original vertex-centric [8] and edge-centric [3] programming abstractions, high-performance asynchronous execution engines [9], out-of-core graph computation [6], tabular graph-systems [4], and even new statistical inference algorithms [2] enabled by the GraphLab project. This work became the basis of multiple PhD theses [1, 5, 7]. The GraphLab open-source project had broad academic and industrial adoption and ultimately lead to the launch of Turi. In this talk, we tell the story of GraphLab, how it began and the key ideas behind it. We will focus on the approach to achieving scalable asynchronous systems in machine learning. During our talk, we will explore the impact that GraphLab has had on the development of graph processing systems, graph databases, and AI/ML; Additionally, we will share our insights and opinions into where we see the future of these fields heading. 
In the process, we highlight some of the lessons we learned and provide guidance for future students.","PeriodicalId":54220,"journal":{"name":"Proceedings of the Vldb Endowment","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134998137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
CM-Explorer: Dissecting Data Ingestion Problems
CAS Tier 3 (Computer Science) | Q1 Computer Science | Pub Date: 2023-08-01 | DOI: 10.14778/3611540.3611595
Niels Bylois, Frank Neven, Stijn Vansummeren
Data ingestion validation, the task of certifying the quality of continuously collected data, is crucial to ensure trustworthiness of analytics insights. A widely used approach for validating data quality is to specify, either manually or automatically, so-called data unit tests that check whether data quality metrics lie within expected bounds. We employ conditional unit tests based on conditional metrics (CMs) that compute data quality signals over specific parts of the ingestion data and therefore allow for a fine-grained detection of errors. A violated conditional unit test specifies a set of erroneous tuples in a natural way: the subrelation that its CM refers to. Unfortunately, the downside of their fine-grained nature is that violating unit tests are often correlated: a single error in an ingestion batch may cause multiple tests (each referring to different parts of the batch) to fail. The key challenge is therefore to untangle this correlation and filter out the most relevant violated conditional unit tests, i.e., tests that identify a core set of erroneous tuples and act as an explanation for the errors. We present CM-Explorer, a system that supports data stewards in quickly finding the most relevant violated conditional unit tests. The system consists of three components: (1) a graph explorer for visualizing the correlation structure of the violated unit tests; (2) a relation explorer for browsing the tuples selected by conditional unit tests; and (3) a history explorer to get insight into why conditional unit tests are violated. In this paper, we discuss these components and present the different scenarios that we make available for the demonstration.
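A conditional unit test as described, a metric computed over the subrelation matching a condition and checked against expected bounds, might look like the following sketch. The function names, the example batch, and the bounds are assumptions for illustration, not CM-Explorer's API.

```python
def conditional_unit_test(rows, condition, metric, low, high):
    """Evaluate `metric` over the subrelation matching `condition`; the test
    passes iff the value lies within [low, high]. The subrelation is returned
    too, since on failure it pinpoints the suspect tuples."""
    subrelation = [r for r in rows if condition(r)]
    value = metric(subrelation)
    return low <= value <= high, subrelation


batch = [
    {"store": "A", "price": 10.0},
    {"store": "A", "price": 12.0},
    {"store": "B", "price": -3.0},  # erroneous tuple in this ingestion batch
]


def avg_price(part):
    return sum(r["price"] for r in part) / max(len(part), 1)


# Conditional unit test: the average price in store B must lie in [0, 100].
ok, suspects = conditional_unit_test(
    batch, lambda r: r["store"] == "B", avg_price, 0.0, 100.0)
```

Because the metric is scoped to one part of the batch, a violation immediately narrows the error down to that subrelation, which is the fine-grained behavior the abstract describes.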
{"title":"CM-Explorer: Dissecting Data Ingestion Problems","authors":"Niels Bylois, Frank Neven, Stijn Vansummeren","doi":"10.14778/3611540.3611595","DOIUrl":"https://doi.org/10.14778/3611540.3611595","url":null,"abstract":"Data ingestion validation, the task of certifying the quality of continuously collected data, is crucial to ensure trustworthiness of analytics insights. A widely used approach for validating data quality is to specify, either manually or automatically, so-called data unit tests that check whether data quality metrics lie within expected bounds. We employ conditional unit tests based on conditional metrics (CMs) that compute data quality signals over specific parts of the ingestion data and therefore allow for a fine-grained detection of errors. A violated conditional unit test specifies a set of erroneous tuples in a natural way: the subrelation that its CM refers to. Unfortunately, the downside of their fine-grained nature is that violating unit tests are often correlated: a single error in an ingestion batch may cause multiple tests (each referring to different parts of the batch) to fail. The key challenge is therefore to untangle this correlation and filter out the most relevant violated conditional unit tests, i.e., tests that identify a core set of erroneous tuples and act as an explanation for the errors. We present CM-Explorer, a system that supports data stewards in quickly finding the most relevant violated conditional unit tests. The system consists of three components: (1) a graph explorer for visualizing the correlation structure of the violated unit tests; (2) a relation explorer for browsing the tuples selected by conditional unit tests; and, (3) a history explorer to get insight why conditional unit tests are violated. 
In this paper, we discuss these components and present the different scenarios that we make available for the demonstration.","PeriodicalId":54220,"journal":{"name":"Proceedings of the Vldb Endowment","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134998138","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Databases on Modern Networks: A Decade of Research That Now Comes into Practice
CAS Tier 3 (Computer Science) | Q1 Computer Science | Pub Date: 2023-08-01 | DOI: 10.14778/3611540.3611579
Alberto Lerner, Carsten Binnig, Philippe Cudré-Mauroux, Rana Hussein, Matthias Jasny, Theo Jepsen, Dan R. K. Ports, Lasse Thostrup, Tobias Ziegler
Modern cloud networks are a fundamental pillar of data-intensive applications. They provide high-speed transaction (packet) rates and low overhead, enabling, for instance, truly scalable database designs. These networks, however, are fundamentally different from conventional ones. Arguably, the two key discerning technologies are RDMA and programmable network devices. Today, these technologies are not niche technologies anymore and are widely deployed across all major cloud vendors. The question is thus not if but how a new breed of data-intensive applications can benefit from modern networks, given the perceived difficulty in using and programming them. This tutorial addresses these challenges by exposing how the underlying principles changed as the network evolved and by presenting the new system design opportunities they opened. In the process, we also discuss several hard-earned lessons accumulated by making the transition first-hand.
{"title":"Databases on Modern Networks: A Decade of Research That Now Comes into Practice","authors":"Alberto Lerner, Carsten Binnig, Philippe Cudré-Mauroux, Rana Hussein, Matthias Jasny, Theo Jepsen, Dan R. K. Ports, Lasse Thostrup, Tobias Ziegler","doi":"10.14778/3611540.3611579","DOIUrl":"https://doi.org/10.14778/3611540.3611579","url":null,"abstract":"Modern cloud networks are a fundamental pillar of data-intensive applications. They provide high-speed transaction (packet) rates and low overhead, enabling, for instance, truly scalable database designs. These networks, however, are fundamentally different from conventional ones. Arguably, the two key discerning technologies are RDMA and programmable network devices. Today, these technologies are not niche technologies anymore and are widely deployed across all major cloud vendors. The question is thus not if but how a new breed of data-intensive applications can benefit from modern networks, given the perceived difficulty in using and programming them. This tutorial addresses these challenges by exposing how the underlying principles changed as the network evolved and by presenting the new system design opportunities they opened. In the process, we also discuss several hard-earned lessons accumulated by making the transition first-hand.","PeriodicalId":54220,"journal":{"name":"Proceedings of the Vldb Endowment","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134998305","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Techniques and Efficiencies from Building a Real-Time DBMS
CAS Tier 3 (Computer Science) | Q1 Computer Science | Pub Date: 2023-08-01 | DOI: 10.14778/3611540.3611556
V. Srinivasan, Andrew Gooding, Sunil Sayyaparaju, Thomas Lopatic, Kevin Porter, Ashish Shinde, B. Narendran
This paper describes a variety of techniques from over a decade of developing Aerospike (formerly Citrusleaf), a real-time DBMS that is being used in some of the world's largest mission-critical systems that require the highest levels of performance and availability. Such mission-critical systems have many requirements, including the ability to make decisions within a strict real-time SLA (milliseconds) with no downtime, predictable performance so that the first and billionth customer gets the same experience, the ability to scale up 10X (or even 100X) with no downtime, strong consistency for applications that need it, synchronous and asynchronous replication with global transactional capabilities, and the ability to deploy in any public and private cloud environment. We describe how using efficient algorithms to optimize every area of the DBMS helps the system achieve these stringent requirements. Specifically, we describe effective ways to shard, place, and locate data across a set of nodes; efficient identification of cluster membership and cluster changes; efficiencies generated by using a 'smart' client; how to effectively use two-copy rather than three-copy replication; how to reduce the cost of the real-time data footprint by combining the use of memory with flash storage; self-managing clusters for ease of operation, including elastic scaling; and networking and CPU optimizations, including NUMA pinning with multi-threading. The techniques and efficiencies described here have enabled hundreds of deployments to grow by many orders of magnitude with near complete uptime.
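The sharding-and-routing idea, a fixed set of partitions, a client-side partition map, and a "smart" client that contacts the owning node directly, can be sketched as follows. The digest function, partition count, and round-robin placement here are illustrative choices, not Aerospike's exact algorithm.

```python
import hashlib

# Fixed partition count in the spirit of Aerospike-style designs; the hash and
# placement policy below are simplifications for illustration.
NUM_PARTITIONS = 4096


def partition_of(key):
    """Map a record key to a partition via a stable hash of its digest."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:2], "little") % NUM_PARTITIONS


def build_partition_map(nodes):
    """Trivial round-robin placement for the sketch; a real system balances
    replicas and revises the map on cluster membership changes."""
    return {p: nodes[p % len(nodes)] for p in range(NUM_PARTITIONS)}


def route(key, partition_map):
    """A 'smart' client resolves the owning node itself, avoiding a proxy hop."""
    return partition_map[partition_of(key)]
```

Because every client holds the same partition map, any key resolves to the same node in one local computation, which is what makes single-record latency predictable.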
{"title":"Techniques and Efficiencies from Building a Real-Time DBMS","authors":"V. Srinivasan, Andrew Gooding, Sunil Sayyaparaju, Thomas Lopatic, Kevin Porter, Ashish Shinde, B. Narendran","doi":"10.14778/3611540.3611556","DOIUrl":"https://doi.org/10.14778/3611540.3611556","url":null,"abstract":"This paper describes a variety of techniques from over a decade of developing Aerospike (formerly Citrusleaf), a real-time DBMS that is being used in some of the world's largest mission-critical systems that require the highest levels of performance and availability. Such mission-critical systems have many requirements including the ability to make decisions within a strict real-time SLA (milliseconds) with no downtime, predictable performance so that the first and billionth customer gets the same experience, ability to scale up 10X (or even 100X) with no downtime, support strong consistency for applications that need it, synchronous and asynchronous replication with global transactional capabilities, and the ability to deploy in any public and private cloud environments. We describe how using efficient algorithms to optimize every area of the DBMS helps the system achieve these stringent requirements. Specifically, we describe, effective ways to shard, place and locate data across a set of nodes, efficient identification of cluster membership and cluster changes, efficiencies generated by using a 'smart' client, how to effectively use replications with two copies replication instead of three-copy, how to reduce the cost of the realtime data footprint by combining the use of memory with flash storage, self-managing clusters for ease of operation including elastic scaling, networking and CPU optimizations including NUMA pinning with multi-threading. 
The techniques and efficiencies described here have enabled hundreds of deployments to grow by many orders of magnitude with near complete uptime.","PeriodicalId":54220,"journal":{"name":"Proceedings of the Vldb Endowment","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135002981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
On-the-Fly Data Transformation in Action
CAS Tier 3 (Computer Science) | Q1 Computer Science | Pub Date: 2023-08-01 | DOI: 10.14778/3611540.3611593
Ju Hyoung Mun, Konstantinos Karatsenidis, Tarikul Islam Papon, Shahin Roozkhosh, Denis Hoornaert, Ulrich Drepper, Ahmed Sanaullah, Renato Mancuso, Manos Athanassoulis
Transactional and analytical database management systems (DBMS) typically employ different data layouts: row-stores for the former and column-stores for the latter. In order to bridge the requirements of the two without maintaining two systems and two (or more) copies of the data, our proposed system Relational Memory employs specialized hardware that transforms the base row table into arbitrary column groups at query execution time. This approach maximizes cache locality and is easy to use via a simple abstraction that allows transparent on-the-fly data transformation. Here, we demonstrate how to deploy and use Relational Memory via four representative scenarios. The demonstration uses the full-stack implementation of Relational Memory on the Xilinx Zynq UltraScale+ MPSoC platform. Conference participants will interact with Relational Memory deployed in the actual platform.
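The on-the-fly materialization of column groups can be mimicked in software to show the interface. Relational Memory performs this transformation in specialized hardware; `project_column_group` below is a hypothetical helper for illustration only.

```python
def project_column_group(rows, columns):
    """Materialize an arbitrary column group from a row-store on demand.
    rows: list of dicts (row-major base table).
    Returns dict column -> list of values (columnar layout)."""
    return {c: [r[c] for r in rows] for c in columns}


# Row-major base table, as a transactional workload would store it.
base_table = [
    {"id": 1, "qty": 3, "price": 9.5},
    {"id": 2, "qty": 1, "price": 4.0},
]

# An analytical query touching only (qty, price) gets a cache-friendly
# columnar layout at query time, without keeping a second copy of the table.
cols = project_column_group(base_table, ["qty", "price"])
```

The key point is that the columnar view is derived per query rather than maintained as a redundant copy, which is what lets one base layout serve both workloads.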
{"title":"On-the-Fly Data Transformation in Action","authors":"Ju Hyoung Mun, Konstantinos Karatsenidis, Tarikul Islam Papon, Shahin Roozkhosh, Denis Hoornaert, Ulrich Drepper, Ahmed Sanaullah, Renato Mancuso, Manos Athanassoulis","doi":"10.14778/3611540.3611593","DOIUrl":"https://doi.org/10.14778/3611540.3611593","url":null,"abstract":"Transactional and analytical database management systems (DBMS) typically employ different data layouts: row-stores for the first and column-stores for the latter. In order to bridge the requirements of the two without maintaining two systems and two (or more) copies of the data, our proposed system Relational Memory employs specialized hardware that transforms the base row table into arbitrary column groups at query execution time. This approach maximizes the cache locality and is easy to use via a simple abstraction that allows transparent on-the-fly data transformation. Here, we demonstrate how to deploy and use Relational Memory via four representative scenarios. The demonstration uses the full-stack implementation of Relational Memory on the Xilinx Zynq UltraScale+ MPSoC platform. Conference participants will interact with Relational Memory deployed in the actual platform.","PeriodicalId":54220,"journal":{"name":"Proceedings of the Vldb Endowment","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135003649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Progressive Partitioning for Parallelized Query Execution in Google's Napa
Tier 3 Computer Science Q1 Computer Science Pub Date : 2023-08-01 DOI: 10.14778/3611540.3611541
Junichi Tatemura, Tao Zou, Jagan Sankaranarayanan, Yanlai Huang, Jim Chen, Yupu Zhang, Kevin Lai, Hao Zhang, Gokul Nath Babu Manoharan, Goetz Graefe, Divyakant Agrawal, Brad Adelberg, Shilpa Kolhar, Indrajit Roy
Napa holds Google's critical data warehouses in log-structured merge trees for real-time data ingestion and sub-second response for billions of queries per day. These queries are often multi-key look-ups in highly skewed tables and indexes. In our production experience, only progressive query-specific partitioning can achieve Napa's strict query latency SLOs. Here we advocate good-enough partitioning that keeps the per-query partitioning time low without risking uneven work distribution. Our design combines pragmatic system choices and algorithmic innovations. For instance, B-trees are augmented with statistics of key distributions, thus serving the dual purpose of aiding lookups and partitioning. Furthermore, progressive partitioning is designed to be "good enough" thereby balancing partitioning time with performance. The resulting system is robust and successfully serves day-in-day-out billions of queries with very high quality of service forming a core infrastructure at Google.
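The abstract's key mechanism — an index augmented with key-distribution statistics that serve the dual purpose of aiding lookups and partitioning — can be sketched as follows (a simplified illustration of statistics-driven, query-specific partitioning, not Napa's actual algorithm or code):

```python
import bisect

# Simplified sketch of query-specific partitioning driven by key-distribution
# statistics kept alongside the index (illustrative only).
# range_bounds are the upper bounds of key ranges; range_counts[i] is the row
# count of the i-th range, so len(range_counts) == len(range_bounds) + 1.

def partition_lookups(keys, range_bounds, range_counts, n_workers):
    """Cost each lookup key from the range statistics, then greedily cut the
    sorted key list into n_workers chunks of roughly equal total cost."""
    keys = sorted(keys)
    costs = [range_counts[bisect.bisect_right(range_bounds, k)] for k in keys]
    target = sum(costs) / n_workers
    partitions, current, acc = [], [], 0
    for k, c in zip(keys, costs):
        current.append(k)
        acc += c
        if acc >= target and len(partitions) < n_workers - 1:
            partitions.append(current)
            current, acc = [], 0
    if current:
        partitions.append(current)
    return partitions

# A skewed table: the key ranges below 100 and above 200 are hot.
parts = partition_lookups([50, 150, 160, 170, 250],
                          range_bounds=[100, 200],
                          range_counts=[1000, 10, 1000],
                          n_workers=2)
print(parts)  # [[50, 150, 160], [170, 250]] -- near-equal work despite skew
```

The point of the sketch is the "good enough" trade-off the abstract describes: a single pass over cheap, pre-maintained statistics yields balanced partitions without an expensive exact split.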
Citations: 0
PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel
Tier 3 Computer Science Q1 Computer Science Pub Date : 2023-08-01 DOI: 10.14778/3611540.3611569
Yanli Zhao, Andrew Gu, Rohan Varma, Liang Luo, Chien-Chin Huang, Min Xu, Less Wright, Hamid Shojanazeri, Myle Ott, Sam Shleifer, Alban Desmaison, Can Balioglu, Pritam Damania, Bernard Nguyen, Geeta Chauhan, Yuchen Hao, Ajit Mathews, Shen Li
It is widely acknowledged that large models have the potential to deliver superior performance across a broad range of domains. Despite the remarkable progress made in the field of machine learning systems research, which has enabled the development and exploration of large models, such abilities remain confined to a small group of advanced users and industry leaders, resulting in an implicit technical barrier for the wider community to access and leverage these technologies. In this paper, we introduce PyTorch Fully Sharded Data Parallel (FSDP) as an industry-grade solution for large model training. FSDP has been closely co-designed with several key PyTorch core components including Tensor implementation, dispatcher system, and CUDA memory caching allocator, to provide non-intrusive user experiences and high training efficiency. Additionally, FSDP natively incorporates a range of techniques and settings to optimize resource utilization across a variety of hardware configurations. The experimental results demonstrate that FSDP is capable of achieving comparable performance to Distributed Data Parallel while providing support for significantly larger models with near-linear scalability in terms of TFLOPS.
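The core idea behind fully sharded data parallelism — each of N ranks stores only a 1/N shard of the flattened parameters and reassembles the full set just in time for compute — can be sketched in plain Python (a conceptual illustration only; the real PyTorch FSDP API wraps `nn.Module` and uses collective communication primitives):

```python
# Conceptual sketch of FSDP-style parameter sharding (illustrative only).

def shard(flat_params, n_ranks):
    """Evenly split a flattened parameter list across ranks, zero-padding
    the tail so every rank holds the same number of elements."""
    per_rank = -(-len(flat_params) // n_ranks)  # ceiling division
    padded = flat_params + [0.0] * (per_rank * n_ranks - len(flat_params))
    return [padded[r * per_rank:(r + 1) * per_rank] for r in range(n_ranks)]

def all_gather(shards, orig_len):
    """Reassemble the full parameter list from the per-rank shards, as done
    just-in-time before a layer's forward/backward compute."""
    full = [x for s in shards for x in s]
    return full[:orig_len]

params = [0.1, 0.2, 0.3, 0.4, 0.5]
shards = shard(params, 2)  # rank 0: [0.1, 0.2, 0.3]; rank 1: [0.4, 0.5, 0.0]
assert all_gather(shards, len(params)) == params
```

Because each rank persistently holds only its shard and frees the gathered copy after use, peak memory per device scales down roughly with the number of ranks, which is what lets FSDP fit models far larger than Distributed Data Parallel can.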
Citations: 43