Christina Khnaisser, Hind Hamrouni, David B. Blumenthal, Anton Dignös, Johann Gamper
{"title":"Efficiently Labeling and Retrieving Temporal Anomalies in Relational Databases","authors":"Christina Khnaisser, Hind Hamrouni, David B. Blumenthal, Anton Dignös, Johann Gamper","doi":"10.1007/s10796-024-10495-w","DOIUrl":null,"url":null,"abstract":"<p>Time and temporal constraints are implicit in most databases. To facilitate data analysis and quality assessment, a database should provide explicit operations to identify the violation of temporal constraints. Against this background, the purpose of this paper is threefold: (1) we identify and provide a formal definition of five common anomalies in temporal databases, (2) we propose two new relational operations that allow, respectively, to label anomalous tuples in and to retrieve the anomalous tuples from a dataset, and (3) we provide three different SQL implementations of these operations for current relational database management systems. The healthcare domain is used to illustrate the usage and utility of the temporal anomalies. Finally, an experimental evaluation on real-world and synthetic data analyses the performance of the different implementations of the anomaly operators.</p>","PeriodicalId":13610,"journal":{"name":"Information Systems Frontiers","volume":"318 1","pages":""},"PeriodicalIF":6.9000,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Systems Frontiers","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s10796-024-10495-w","RegionNum":3,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
Time and temporal constraints are implicit in most databases. To facilitate data analysis and quality assessment, a database should provide explicit operations to identify the violation of temporal constraints. Against this background, the purpose of this paper is threefold: (1) we identify and provide a formal definition of five common anomalies in temporal databases, (2) we propose two new relational operations that allow, respectively, to label anomalous tuples in and to retrieve the anomalous tuples from a dataset, and (3) we provide three different SQL implementations of these operations for current relational database management systems. The healthcare domain is used to illustrate the usage and utility of the temporal anomalies. Finally, an experimental evaluation on real-world and synthetic data analyses the performance of the different implementations of the anomaly operators.
期刊介绍:
The interdisciplinary interfaces of Information Systems (IS) are fast emerging as defining areas of research and development in IS. These developments are largely due to the transformation of Information Technology (IT) towards networked worlds and its effects on global communications and economies. While these developments are shaping the way information is used in all forms of human enterprise, they are also setting the tone and pace of information systems of the future. The major advances in IT such as client/server systems, the Internet and the desktop/multimedia computing revolution, for example, have led to numerous important vistas of research and development with considerable practical impact and academic significance. While the industry seeks to develop high performance IS/IT solutions to a variety of contemporary information support needs, academia looks to extend the reach of IS technology into new application domains. Information Systems Frontiers (ISF) aims to provide a common forum of dissemination of frontline industrial developments of substantial academic value and pioneering academic research of significant practical impact.