{"title":"Efficient Execution of User-Defined Functions in SQL Queries","authors":"Yannis Foufoulas, Alkis Simitsis","doi":"10.14778/3611540.3611574","DOIUrl":null,"url":null,"abstract":"User-defined functions (UDFs) have been widely used to overcome the expressivity limitations of SQL and complement its declarative nature with functional capabilities. UDFs are particularly useful in today's applications that involve complex data analytics and machine learning algorithms and logic. However, UDFs pose significant performance challenges in query processing and optimization, largely due to the mismatch of the UDF execution and SQL processing environments. In this tutorial, we present state-of-the-art methods and systems towards efficient execution of UDFs in SQL queries. We focus on low-level techniques for physical optimization and compilation of UDF queries, describe and compare the core, recent approaches in the area, discuss their advantages and limitations, identify critical gaps in theory and practice, and propose promising future research directions.","PeriodicalId":54220,"journal":{"name":"Proceedings of the Vldb Endowment","volume":"43 1","pages":"0"},"PeriodicalIF":2.6000,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Vldb Endowment","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.14778/3611540.3611574","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
User-defined functions (UDFs) have been widely used to overcome the expressivity limitations of SQL and complement its declarative nature with functional capabilities. UDFs are particularly useful in today's applications that involve complex data analytics and machine learning algorithms and logic. However, UDFs pose significant performance challenges in query processing and optimization, largely due to the mismatch of the UDF execution and SQL processing environments. In this tutorial, we present state-of-the-art methods and systems towards efficient execution of UDFs in SQL queries. We focus on low-level techniques for physical optimization and compilation of UDF queries, describe and compare the core, recent approaches in the area, discuss their advantages and limitations, identify critical gaps in theory and practice, and propose promising future research directions.
期刊介绍:
The Proceedings of the VLDB (PVLDB) welcomes original research papers on a broad range of research topics related to all aspects of data management, where systems issues play a significant role, such as data management system technology and information management infrastructures, including their very large scale of experimentation, novel architectures, and demanding applications as well as their underpinning theory. The scope of a submission for PVLDB is also described by the subject areas given below. Moreover, the scope of PVLDB is restricted to scientific areas that are covered by the combined expertise on the submission’s topic of the journal’s editorial board. Finally, the submission’s contributions should build on work already published in data management outlets, e.g., PVLDB, VLDBJ, ACM SIGMOD, IEEE ICDE, EDBT, ACM TODS, IEEE TKDE, and go beyond a syntactic citation.