Brecht Vandevoort, Bas Ketsman, Christoph E. Koch, F. Neven
The popular isolation level multiversion Read Committed (RC) exchanges some of the strong guarantees of serializability for increased transaction throughput. Nevertheless, transaction workloads can sometimes be executed under RC while still guaranteeing serializability at a reduced cost. Such workloads are said to be robust against RC. This paper provides a high-level overview of deciding robustness against RC. In particular, we discuss how a sound and complete test can be obtained through the formalization of transaction templates. We then increase the modeling power of transaction templates by extending them with functional constraints, which are useful for capturing data dependencies like foreign keys. We show that the incorporation of functional constraints can identify more workloads as robust than would otherwise be the case. Even though the robustness problem becomes undecidable in its most general form, we establish that various restrictions on functional constraints lead to decidable and even tractable results that can be used to model and test for robustness against RC for practical scenarios.
{"title":"When is it safe to run a transactional workload under Read Committed?","authors":"Brecht Vandevoort, Bas Ketsman, Christoph E. Koch, F. Neven","doi":"10.1145/3604437.3604446","DOIUrl":"https://doi.org/10.1145/3604437.3604446","url":null,"abstract":"The popular isolation level multiversion Read Committed (RC) exchanges some of the strong guarantees of serializability for increased transaction throughput. Nevertheless, transaction workloads can sometimes be executed under RC while still guaranteeing serializability at a reduced cost. Such workloads are said to be robust against RC. This paper provides a high level overview of deciding robustness against RC. In particular, we discuss how a sound and complete test can be obtained through the formalization of transaction templates. We then increase the modeling power of transaction templates by extending them with functional constraints which are useful for capturing data dependencies like foreign keys. We show that the incorporation of functional constraints can identify more workloads as robust than would otherwise be the case. Even though the robustness problem becomes undecidable in its most general form, we establish that various restrictions on functional constraints lead to decidable and even tractable results that can be used to model and test for robustness against RC for practical scenarios.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121084770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Most database research papers are prescriptive. They identify a technical problem and show us how to solve it. They present new algorithms, theorems, and evaluations of prototypes. Other papers follow a different path: descriptive rather than prescriptive. They tell us how data systems behave in practice, and how they are actually used. They employ a different set of tools, such as surveys, software analyses or user studies. These papers are much rarer at database research conferences, and they're all the more valuable for that.
{"title":"TECHNICAL PERSPECTIVE: Ad Hoc Transactions: What They Are and Why We Should Care","authors":"K. Salem","doi":"10.1145/3604437.3604439","DOIUrl":"https://doi.org/10.1145/3604437.3604439","url":null,"abstract":"Most database research papers are prescriptive. They identify a technical problem and show us how to solve it. They present new algorithms, theorems, and evaluations of prototypes. Other papers follow a different path: descriptive rather than prescriptive. They tell us how data systems behave in practice, and how they are actually used. They employ a different set of tools, such as surveys, software analyses or user studies. These papers are much rarer at database research conferences, and they're all the more valuable for that.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"52 Suppl 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127339657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Memory disaggregation architecture physically separates CPU and memory into independent components, which are connected via high-speed RDMA networks, greatly improving resource utilization of database systems. However, such an architecture poses unique challenges to data indexing due to limited RDMA semantics and near-zero computation power on the memory side. Existing indexes supporting disaggregated memory either suffer from low write performance or require hardware modification.
{"title":"Building Write-Optimized Tree Indexes on Disaggregated Memory","authors":"Qing Wang, Youyou Lu, J. Shu","doi":"10.1145/3604437.3604448","DOIUrl":"https://doi.org/10.1145/3604437.3604448","url":null,"abstract":"Memory disaggregation architecture physically separates CPU and memory into independent components, which are connected via high-speed RDMA networks, greatly improving resource utilization of database systems. However, such an architecture poses unique challenges to data indexing due to limited RDMA semantics and near-zero computation power at memory side. Existing indexes supporting disaggregated memory either suffer from low write performance, or require hardware modification.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128512354","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The paper proposes a solution to the problem of inadequate support for transactions in multi-engine database systems. Multi-engine database systems are databases that integrate new (fast) memory-optimized storage engines with (slow) traditional engines, allowing the application to use tables in both engines. Multi-engine database systems are particularly interesting for traditional database systems that are extended over time. Storing tables in both slow and fast storage engines and executing transactions across engines reduces overall cost, since less performance-critical tables can be placed in slow (and thus cheaper) storage. As
{"title":"Technical Perspective for Skeena: Efficient and Consistent Cross-Engine Transactions","authors":"Carsten Binnig","doi":"10.1145/3604437.3604443","DOIUrl":"https://doi.org/10.1145/3604437.3604443","url":null,"abstract":"The paper proposes a solution to the problem of inadequate support for transactions in multi-engine database systems. Multi-engine database systems are databases that integrate new (fast) memory-optimized storage engines with (slow) traditional engines, allowing the application to use tables in both engines. Multi-engine database systems are in particular interesting for traditional database systems that are extended over time. By being able to store tables in slow and fast storage engines and executing transactions cross engines allows to reduce overall cost since less performance critical tables can be placed in slow (and thus cheaper) storage. As","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"189 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116656448","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Knowledge graphs (KGs) such as DBpedia, Freebase, YAGO, Wikidata, and NELL were constructed to store large-scale, real-world facts as (subject, predicate, object) triples. These triples can also be modeled as a graph, where a node (a subject or an object) represents an entity with attributes, and a directed edge (a predicate) is a relationship between two entities. Querying KGs is critical in web search, question answering (QA), semantic search, personal assistants, fact checking, and recommendation. While significant progress has been made on KG construction and curation, we have recently seen a surge of research on KG querying and QA, thanks to deep learning. The objectives of our survey are two-fold. First, research on KG querying has been conducted by several communities, such as databases, data mining, semantic web, machine learning, information retrieval, and natural language processing (NLP), with different focuses and terminologies, and on diverse topics ranging from graph databases, query languages, join algorithms, and graph pattern matching to more sophisticated KG embedding and natural language questions (NLQs). We aim at uniting different interdisciplinary topics and concepts that have been developed for KG querying. Second, many recent advances on KG and query embedding, multimodal KG, and KG-QA come from deep learning, IR, NLP, and computer vision domains. We identify important challenges of KG querying that have received less attention from graph databases, and from the DB community in general, e.g., incomplete KG, semantic matching, multimodal data, and NLQs. We conclude by discussing interesting opportunities for the data management community, for instance, KG as a unified data model and vector-based query processing.
{"title":"Knowledge Graphs Querying","authors":"Arijit Khan","doi":"10.1145/3615952.3615956","DOIUrl":"https://doi.org/10.1145/3615952.3615956","url":null,"abstract":"Knowledge graphs (KGs) such as DBpedia, Freebase, YAGO, Wikidata, and NELL were constructed to store large-scale, real-world facts as (subject, predicate, object) triples - that can also be modeled as a graph, where a node (a subject or an object) represents an entity with attributes, and a directed edge (a predicate) is a relationship between two entities. Querying KGs is critical in web search, question answering (QA), semantic search, personal assistants, fact checking, and recommendation. While significant progress has been made on KG construction and curation, thanks to deep learning recently we have seen a surge of research on KG querying and QA. The objectives of our survey are two-fold. First, research on KG querying has been conducted by several communities, such as databases, data mining, semantic web, machine learning, information retrieval, and natural language processing (NLP), with different focus and terminologies; and also in diverse topics ranging from graph databases, query languages, join algorithms, graph patterns matching, to more sophisticated KG embedding and natural language questions (NLQs). We aim at uniting different interdisciplinary topics and concepts that have been developed for KG querying. Second, many recent advances on KG and query embedding, multimodal KG, and KG-QA come from deep learning, IR, NLP, and computer vision domains. We identify important challenges of KG querying that received less attention by graph databases, and by the DB community in general, e.g., incomplete KG, semantic matching, multimodal data, and NLQs. We conclude by discussing interesting opportunities for the data management community, for instance, KG as a unified data model and vector-based query processing.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121311909","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A dense co-authorship network formed by the review board members of a conference may adversely impact the quality and integrity of the review process. In this report, we shed light on the topological characteristics of such networks for three major data management conference venues. Our results show that all these venues give rise to dense networks with a large giant component. We advocate rethinking the traditional way review boards are formed to mitigate the emergence of dense networks.
{"title":"How Connected Are Our Conference Review Boards?","authors":"S. Bhowmick","doi":"10.1145/3582302.3582324","DOIUrl":"https://doi.org/10.1145/3582302.3582324","url":null,"abstract":"Dense co-authorship network formed by the review board members of a conference may adversely impact the quality and integrity of the review process. In this report, we shed light on the topological characteristics of such networks for three major data management conference venues. Our results show all these venues give rise to dense networks with a large giant component. We advocate to rethink the traditional way review boards are formed to mitigate the emergence of dense networks.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"115 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116036384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Efthimia Aivaloglou, G. Fletcher, Michael Liut, Daphne Miedema
This report summarizes the outcomes of the first international workshop on Data Systems Education: Bridging Education Practice with Education Research (DataEd '22). The workshop was held in conjunction with the SIGMOD '22 conference in Philadelphia, USA on June 17, 2022. The aim of the workshop was to provide a dedicated venue for presenting and discussing data management systems education experiences and research by bringing together the database and the computing education research communities to share findings, to cross-pollinate perspectives and methods, and to shed light on opportunities for mutual progress in data systems education. The program featured two keynote talks, ten research paper presentations, a discussion session, and an industry panel discussion. In this report, we present the workshop's main results, observations, and emerging research directions.
{"title":"Report on the First International Workshop on Data Systems Education (DataEd '22)","authors":"Efthimia Aivaloglou, G. Fletcher, Michael Liut, Daphne Miedema","doi":"10.1145/3582302.3582314","DOIUrl":"https://doi.org/10.1145/3582302.3582314","url":null,"abstract":"This report summarizes the outcomes of the first international workshop on Data Systems Education: Bridging Education Practice with Education Research (DataEd '22). The workshop was held in conjunction with the SIGMOD '22 conference in Philadelphia, USA on June 17, 2022. The aim of the workshop was to provide a dedicated venue for presenting and and discussing data management systems education experiences and research by bringing together the database and the computing education research communities to share findings, to crosspollinate perspectives and methods, and to shed light on opportunities for mutual progress in data systems education. The program featured two keynote talks, ten research paper presentations, a discussion session, and an industry panel discussion. In this report, we present the workshop's main results, observations, and emerging research directions.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"198 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124432357","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bas Ketsman, Christoph E. Koch, F. Neven, Brecht Vandevoort
The aim of this paper is to serve as a lightweight introduction to concurrency control for database theorists through a uniform presentation of the work on robustness against Multiversion Read Committed and Snapshot Isolation.
{"title":"Concurrency control for database theorists","authors":"Bas Ketsman, Christoph E. Koch, F. Neven, Brecht Vandevoort","doi":"10.1145/3582302.3582304","DOIUrl":"https://doi.org/10.1145/3582302.3582304","url":null,"abstract":"The aim of this paper is to serve as a lightweight introduction to concurrency control for database theorists through a uniform presentation of the work on robustness against Multiversion Read Committed and Snapshot Isolation.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134142697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
You just got promoted to Associate Professor. Like most things in life, whether joys or sorrows, the joy of this accomplishment will not last forever. However, that doesn't mean that you should not look back and reflect on the years of hard work and tenacity that have earned you this promotion, so first of all, congratulations! Take a moment to savor this accomplishment. On the other hand, it would be a mistake not to ask the question: what just changed about me? Let's see. You now have tenure and you have been promoted to a senior rank. In one sense, that translates to less stress, but in another, you do have to wonder whether it necessarily does mean less stress. On the flip side, you should also take advantage of the opportunity to ask: what are some new freedoms I have just earned? The stress component is driven by partly knowing, but also partly being unsure of, the expectations of a newly minted Associate Professor. The freedom component stems from knowing that you are now tenured, which hopefully means that you can embark on more daring, high-risk projects, even if you don't feel like you know quite how to negotiate the trade-off between risk and impact.
{"title":"Mid-Career Researcher, huh?","authors":"L. Lakshmanan","doi":"10.1145/3582302.3582312","DOIUrl":"https://doi.org/10.1145/3582302.3582312","url":null,"abstract":"You just got promoted to Associate Professor. Like most things in life, whether joys or sorrows, the joy of this accomplishment will not last forever. However, that doesn't mean that you should not look back and reflect on years of hard work and tenacity that you have put in which have earned you this promotion, so first of all, congratulations! Take a moment to savor this accomplishment. On the other hand, it would be a mistake to not ask the question, what just changed about me. Let's see. You now have tenure and you have been promoted to a senior rank. In one sense, that translates to less stress, but in another, you do have to wonder whether it necessarily does mean less stress. On the flip side, you should also take advantage of the opportunity to ask, what are some new freedoms I have just earned. The stress component is driven by partly knowing, but also partly being unsure of, the expectations from a newly minted Associate Professor. The freedom component stems from knowing that you are now tenured, which hopefully means that you can embark on more daring, high risk projects, even if you don't feel like you know quite how to negotiate the trade-off between risk and impact.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128548341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
When I started my PhD, I wanted to do something related to systems but I wasn't sure exactly what. I didn't consider data management systems initially, because I was unaware of the richness of the systems work that data management systems were built on. I thought the field was mainly about SQL. Luckily, that view changed quickly.
{"title":"Reminiscences on Influential Papers","authors":"T. Rabl","doi":"10.1145/3582302.3582310","DOIUrl":"https://doi.org/10.1145/3582302.3582310","url":null,"abstract":"When I started my PhD, I wanted to do something related to systems but I wasn't sure exactly what. I didn't consider data management systems initially, because I was unaware of the richness of the systems work that data management systems were build on. I thought the field was mainly about SQL. Luckily, that view changed quickly.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"05 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128983929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}