George Katsogiannis-Meimarakis, Mike Xydas, Georgia Koutrika
Natural Language Interfaces for Databases with Deep Learning
Proceedings of the VLDB Endowment, published 2023-08-01
DOI: 10.14778/3611540.3611575 (https://doi.org/10.14778/3611540.3611575)
Abstract
In the age of the Digital Revolution, almost all human activities, from industrial and business operations to medical and academic research, rely on the constant integration and utilisation of ever-increasing volumes of data. However, the explosive volume and complexity of data make querying and exploration challenging even for experts, and make the need to democratise access to data, including for non-technical users, all the more evident. It is time to lift all technical barriers by empowering users to access relational databases through conversation. We consider three main research areas on which a natural language data interface is based: Text-to-SQL, SQL-to-Text, and Data-to-Text. The purpose of this tutorial is a deep dive into these areas, covering state-of-the-art techniques and models, and explaining how progress in deep learning has led to impressive advancements. We will present benchmarks that sparked research and competition, and discuss open problems and research opportunities, one of the most important challenges being the integration of these three research areas into one conversational system.