Towards reliable interactive data cleaning: a user survey and recommendations
S. Krishnan, D. Haas, M. Franklin, Eugene Wu
HILDA '16 (June 26, 2016). DOI: https://doi.org/10.1145/2939502.2939511

Data cleaning is frequently an iterative process tailored to the requirements of a specific analysis task. The design and implementation of iterative data cleaning tools present novel challenges, both technical and organizational, to the community. In this paper, we present results from a user survey (N = 29) of data analysts and infrastructure engineers from industry and academia. We highlight three important themes: (1) the iterative nature of data cleaning, (2) the lack of rigor in evaluating the correctness of data cleaning, and (3) the disconnect between the analysts who query the data and the infrastructure engineers who design the cleaning pipelines. We conclude with a number of recommendations for future work, envisioning an interactive data cleaning system that accounts for the observed challenges.
Towards a general-purpose query language for visualization recommendation
Kanit Wongsuphasawat, Dominik Moritz, Anushka Anand, J. Mackinlay, Bill Howe, Jeffrey Heer
HILDA '16 (June 26, 2016). DOI: https://doi.org/10.1145/2939502.2939506

Creating effective visualizations requires domain familiarity as well as design and analysis expertise, and may impose a tedious specification process. To address these difficulties, many visualization tools complement manual specification with recommendations. However, designing interfaces, ranking metrics, and scalable recommender systems remain important research challenges. In this paper, we propose a common framework for facilitating the development of visualization recommender systems in the form of a specification language for querying over the space of visualizations. We present the preliminary design of CompassQL, which defines (1) a partial specification that describes enumeration constraints, and (2) methods for choosing, ranking, and grouping recommended visualizations. To demonstrate the expressivity of the language, we describe existing recommender systems in terms of CompassQL queries. Finally, we discuss the prospective benefits of a common language for future visualization recommender systems.
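The partial-specification idea above can be illustrated with a toy enumerate-and-rank loop. This is a schematic Python sketch, not actual CompassQL syntax; the mark types, field names, and scoring rule are invented for illustration.

```python
from itertools import product

# Hypothetical partial specification: "?" marks properties left open for the
# recommender to enumerate (schematic only -- not real CompassQL syntax).
partial_spec = {
    "mark": "?",
    "x": {"field": "horsepower", "type": "quantitative"},
    "y": {"field": "?", "type": "quantitative"},
}

MARKS = ["point", "bar", "line"]            # enumeration space for marks
FIELDS = ["mpg", "weight", "acceleration"]  # enumeration space for fields

def enumerate_candidates(spec):
    """Fill every wildcard with all legal values (the enumeration step)."""
    marks = MARKS if spec["mark"] == "?" else [spec["mark"]]
    fields = FIELDS if spec["y"]["field"] == "?" else [spec["y"]["field"]]
    for mark, field in product(marks, fields):
        yield {"mark": mark, "x": spec["x"], "y": dict(spec["y"], field=field)}

def score(cand):
    """Toy ranking metric: prefer point marks for two quantitative encodings."""
    return 1.0 if cand["mark"] == "point" else 0.5

# Choosing/ranking step: order the enumerated visualizations by score.
ranked = sorted(enumerate_candidates(partial_spec), key=score, reverse=True)
```

A recommender interface would then show the top few entries of `ranked`, grouped by mark or by wildcard field.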
VisTrees: fast indexes for interactive data exploration
Muhammad El-Hindi, Zheguang Zhao, Carsten Binnig, Tim Kraska
HILDA '16 (June 26, 2016). DOI: https://doi.org/10.1145/2939502.2939507

Visualizations are arguably the most important tool to explore, understand, and convey facts about data. As part of interactive data exploration, visualizations may be used to quickly skim through the data and look for patterns. Unfortunately, database systems are not designed to efficiently support these workloads; as a result, visualizations often take a long time to produce, creating a significant barrier to interactive data analysis. In this paper, we focus on the interactive computation of histograms for data exploration and present a novel multi-dimensional index structure called VisTree. As a key contribution, this paper presents several techniques to better align the design of multi-dimensional indexes with the needs of visualization tools for data exploration. Our experiments show that VisTree achieves speedups of up to three orders of magnitude over traditional multi-dimensional indexes and enables interactive response times below 500 ms even on large data sets.
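The VisTree itself is more sophisticated, but the basic payoff of indexing for histogram queries can be sketched: precompute a fixed grid of bin counts once, then answer any axis-aligned range histogram from the grid instead of rescanning the raw rows. A minimal NumPy sketch, with invented columns and grid resolution:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.uniform(0, 100, size=(100_000, 2))  # two numeric columns

# Build the "index" once: a 64x64 grid of bin counts over both columns.
BINS = 64
counts, xedges, yedges = np.histogram2d(
    data[:, 0], data[:, 1], bins=BINS, range=[[0, 100], [0, 100]])

def range_histogram(x_lo, x_hi):
    """Histogram of column 1, restricted to column 0 in [x_lo, x_hi).

    Answered purely from the precomputed grid: find which x-bins the
    filter covers, then collapse them."""
    lo = np.searchsorted(xedges, x_lo, side="left")
    hi = np.searchsorted(xedges, x_hi, side="left")
    return counts[lo:hi].sum(axis=0)

hist = range_histogram(25, 75)
```

The trade-off is the usual one for pre-aggregation: filter boundaries snap to bin edges, so finer interaction needs a finer (or adaptive, tree-shaped) grid.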
Have a chat with clustine, conversational engine to query large tables
Thibault Sellam, M. Kersten
HILDA '16 (June 26, 2016). DOI: https://doi.org/10.1145/2939502.2939504

Thanks to recent advances in AI and the stellar popularity of messaging apps (e.g., WhatsApp), chatbots are no longer confined to customer support services and computer museums. Indeed, they offer a powerful, lightweight, and accessible way to provide services over the Internet. In this paper, we introduce Clustine, a chatbot that helps users query large tables through short messages. The main idea is to combine cluster analysis and text generation to compress query results, describe them in natural language, and make recommendations. We present the architecture of our system, demonstrate it with two use cases, and report early validation experiments on 12 real datasets showing that its promises are within reach.
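The compress-then-describe pipeline can be sketched in pure Python: cluster the result rows, then emit one generated sentence per cluster. Everything below (the data, the tiny k-means, the sentence template) is an invented stand-in for Clustine's actual components.

```python
import random

random.seed(0)
# Pretend query result: (age, income) rows drawn from two populations.
rows = [(random.gauss(25, 2), random.gauss(30_000, 2_000)) for _ in range(50)] + \
       [(random.gauss(55, 2), random.gauss(90_000, 5_000)) for _ in range(50)]

def kmeans(points, k=2, iters=20):
    """Tiny k-means: assign each point to its nearest center, recompute means."""
    centers = random.sample(points, k)
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            groups[i].append(p)
        # Keep the old center if a group goes empty.
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers, groups

centers, groups = kmeans(rows)

def describe(center, group):
    """Text generation, reduced to a fill-in template."""
    age, income = center
    return f"{len(group)} rows: typical age about {age:.0f}, income about {income:,.0f}."

summaries = [describe(c, g) for c, g in zip(centers, groups)]
```

The chatbot would send `summaries` instead of the raw rows, turning a hundred-row result into two short messages.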
ModelDB: a system for machine learning model management
Manasi Vartak, H. Subramanyam, Wei-En Lee, S. Viswanathan, S. Husnoo, S. Madden, M. Zaharia
HILDA '16 (June 26, 2016). DOI: https://doi.org/10.1145/2939502.2939516

Building a machine learning model is an iterative process. A data scientist will build many tens to hundreds of models before arriving at one that meets some acceptance criteria (e.g., AUC cutoff, accuracy threshold). However, the current style of model building is ad hoc, and there is no practical way for a data scientist to manage models that are built over time. As a result, the data scientist must attempt to "remember" previously constructed models and insights obtained from them. This task is challenging for more than a handful of models and can hamper the process of sensemaking. Without a means to manage models, there is no easy way for a data scientist to answer questions such as "Which models were built using an incorrect feature?", "Which model performed best on American customers?", or "How did the two top models compare?" In this paper, we describe our ongoing work on ModelDB, a novel end-to-end system for the management of machine learning models. ModelDB clients automatically track machine learning models in their native environments (e.g., scikit-learn, spark.ml), the ModelDB backend introduces a common layer of abstractions to represent models and pipelines, and the ModelDB frontend allows visual exploration and analyses of models via a web-based interface.
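The kind of bookkeeping such a system automates can be sketched as a small SQLite-backed model log; the schema, function names, and example models below are our own toy, not ModelDB's actual interface.

```python
import json
import sqlite3

# A minimal model log: one row per trained model.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE models (
    id INTEGER PRIMARY KEY, name TEXT, features TEXT, auc REAL)""")

def track(name, features, auc):
    """Record a model's name, feature list, and evaluation metric."""
    db.execute("INSERT INTO models (name, features, auc) VALUES (?, ?, ?)",
               (name, json.dumps(features), auc))

track("lr_v1", ["age", "income"], 0.71)
track("lr_v2", ["age", "income", "zipcode"], 0.74)
track("gbt_v1", ["age", "income", "zipcode"], 0.79)

# "Which models were built using feature X?" becomes a query over the log.
with_zip = [r[0] for r in db.execute(
    "SELECT name FROM models WHERE features LIKE '%zipcode%' ORDER BY auc DESC")]
```

Once model building is logged rather than remembered, the questions in the abstract reduce to ordinary queries like the one above.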
Data programming with DDLite: putting humans in a different part of the loop
Henry R. Ehrenberg, Jaeho Shin, Alexander J. Ratner, Jason Alan Fries, C. Ré
HILDA '16 (June 26, 2016). DOI: https://doi.org/10.1145/2939502.2939515

Populating large-scale structured databases from unstructured sources is a critical and challenging task in data analytics. As automated feature engineering methods grow increasingly prevalent, constructing sufficiently large labeled training sets has become the primary hurdle in building machine learning information extraction systems. In light of this, we have taken a new approach called data programming [7]. Rather than hand-labeling data, in the data programming paradigm users generate large amounts of noisy training labels by programmatically encoding domain heuristics as simple rules. Compared with more traditional distant supervision methods and fully supervised approaches using labeled data, this approach has allowed us to construct knowledge base systems more rapidly and with higher quality. Since the ability to quickly prototype, evaluate, and debug these rules is a key component of this paradigm, we introduce DDLite, an interactive development framework for data programming. This paper reports feedback collected from DDLite users across a diverse set of entity extraction tasks. We share observations from several DDLite hackathons in which 10 biomedical researchers prototyped information extraction pipelines for chemicals, diseases, and anatomical named entities. Initial results were promising, with the disease tagging team obtaining an F1 score within 10 points of the state of the art in only a single day-long hackathon. Our key insights concern the challenges of writing diverse rule sets for generating labels and of exploring training data. These findings motivate several areas of active data programming research.
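The rule-writing workflow described above can be illustrated with a few toy labeling functions whose noisy votes are combined by simple majority. (Data programming proper learns and weights rule accuracies rather than counting votes; the rules and sentences below are invented for illustration.)

```python
# Candidate sentences to label: does each mention a medical condition?
candidates = [
    "patient presents with severe headache",
    "no evidence of fracture",
    "chronic lower back pain reported",
    "routine follow-up visit",
]

# Labeling functions: each encodes one domain heuristic and returns
# +1 (condition present), -1 (absent), or 0 (abstain).
def lf_keyword(text):
    return 1 if any(w in text for w in ("headache", "pain", "fracture")) else 0

def lf_negation(text):
    return -1 if "no evidence" in text or "denies" in text else 0

def lf_routine(text):
    return -1 if "routine" in text else 0

LFS = [lf_keyword, lf_negation, lf_routine]

def majority_label(text):
    """Combine noisy votes; 0 means the rules abstained or conflicted."""
    s = sum(lf(text) for lf in LFS)
    return 1 if s > 0 else -1 if s < 0 else 0

labels = [majority_label(c) for c in candidates]
```

Note how "no evidence of fracture" gets conflicting votes from `lf_keyword` and `lf_negation`; surfacing and debugging such conflicts quickly is exactly what an interactive framework like DDLite is for.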
Visual exploration of machine learning results using data cube analysis
Minsuk Kahng, Dezhi Fang, Duen Horng Chau
HILDA '16 (June 26, 2016). DOI: https://doi.org/10.1145/2939502.2939503

As complex machine learning systems become more widely adopted, it becomes increasingly challenging for users to understand models or interpret the results generated from the models. We present our ongoing work on developing interactive and visual approaches for exploring and understanding machine learning results using data cube analysis. We propose MLCube, a data-cube-inspired framework that enables users to define instance subsets using feature conditions and computes aggregate statistics and evaluation metrics over the subsets. We also design MLCube Explorer, an interactive visualization tool for comparing models' performances over the subsets. Users can interactively specify operations, such as drilling down to specific instance subsets, to perform more in-depth exploration. Through a usage scenario, we demonstrate how MLCube Explorer works with a public advertisement click log data set, to help a user build new advertisement click prediction models that advance over an existing model.
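The cube idea, computing an evaluation metric per feature-defined subset and per model, reduces to a grouped aggregation. A minimal pandas sketch, with synthetic data and invented model columns:

```python
import pandas as pd

# Synthetic click log: ground truth plus predictions from two models.
df = pd.DataFrame({
    "country": ["US", "US", "US", "DE", "DE", "DE"],
    "clicked": [1, 0, 1, 0, 0, 1],
    "model_a": [1, 0, 0, 0, 1, 1],
    "model_b": [1, 0, 1, 0, 0, 0],
})

def accuracy(sub, model):
    """Evaluation metric over one instance subset (one cube cell)."""
    return (sub[model] == sub["clicked"]).mean()

# One cell per (feature condition, model); drilling down adds group keys.
cells = {(country, m): accuracy(g, m)
         for country, g in df.groupby("country")
         for m in ("model_a", "model_b")}
```

A tool like MLCube Explorer then renders these cells side by side so a user can spot, say, a subset where one model clearly wins.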
Interactive online learning for clinical entity recognition
L. Tari, Varish Mulwad, Anna von Reden
HILDA '16 (June 26, 2016). DOI: https://doi.org/10.1145/2939502.2939510

Named entity recognition and entity linking are core natural language processing components that are predominantly solved by supervised machine learning approaches. Such approaches require manual annotation of training data that can be expensive to compile, and the limited availability of training data can hinder the applicability of supervised entity recognition and linking components in real-world applications. In this paper, we propose a novel approach that uses ontologies as a basis for entity recognition and linking, and captures the context of tokens neighboring the entities of interest with vectors based on syntactic and semantic features. Our approach takes user feedback so that the vector-based model can be continuously updated in an online setting. We demonstrate the approach in a healthcare context, using it to recognize body part and imaging modality entities within clinical documents and map them to the right concepts in the RadLex and NCIT medical ontologies. Our current evaluation shows promising results on a small set of clinical documents, with a precision of 0.841 and a recall of 0.966, and demonstrates that the approach is capable of continuous performance improvement as the number of examples increases. We believe that our human-in-the-loop, online learning approach to entity recognition and linking shows promise for real-world applications.
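One simple way to realize a continuously updated vector model of this kind is a per-class prototype (running mean) into which user feedback folds each confirmed example. The 2-D vectors and class names below are toy stand-ins for the paper's syntactic/semantic feature vectors and RadLex/NCIT concepts.

```python
import math

# One prototype vector per entity class, plus a count for the running mean.
prototypes = {
    "body_part": [1.0, 0.0],
    "modality":  [0.0, 1.0],
}
counts = {k: 1 for k in prototypes}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def link(vec):
    """Link a mention's context vector to the most similar class."""
    return max(prototypes, key=lambda k: cosine(vec, prototypes[k]))

def feedback(vec, true_class):
    """User confirms or corrects a link; update that class's running mean."""
    n = counts[true_class]
    prototypes[true_class] = [(p * n + v) / (n + 1)
                              for p, v in zip(prototypes[true_class], vec)]
    counts[true_class] = n + 1

vec = [0.9, 0.2]            # context vector for some mention
pred = link(vec)            # initial linking decision
feedback(vec, "body_part")  # online update shifts the prototype
```

Because `feedback` only touches one running mean, the model can absorb each correction immediately, which is what makes the online, human-in-the-loop setting practical.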
The case for interactive data exploration accelerators (IDEAs)
Andrew Crotty, Alex Galakatos, Emanuel Zgraggen, Carsten Binnig, Tim Kraska
HILDA '16 (June 26, 2016). DOI: https://doi.org/10.1145/2939502.2939513

Enabling interactive visualization over new datasets at "human speed" is key to democratizing data science and maximizing human productivity. In this work, we first argue why existing analytics infrastructures do not support interactive data exploration and then outline the challenges and opportunities of building a system specifically designed for interactive data exploration. Finally, we present an Interactive Data Exploration Accelerator (IDEA), a new type of system for interactive data exploration that is specifically designed to integrate with existing data management landscapes and allow users to explore their data instantly without expensive data preparation costs.
PFunk-H: approximate query processing using perceptual models
Daniel Alabi, Eugene Wu
HILDA '16 (June 26, 2016). DOI: https://doi.org/10.1145/2939502.2939512

Interactive visualization tools (e.g., crossfilter) are critical to many data analysts, making the discovery and verification of hypotheses quick and seamless. Increasing data sizes have made the scalability of these tools a necessity. To bridge the gap between data sizes and interactivity, many visualization systems have turned to sampling-based approximate query processing frameworks. However, these systems are currently oblivious to human perceptual accuracy: this can lead either to overly aggressive sampling when the approximation accuracy is higher than needed, or to an incorrect visual rendering when the accuracy is too lax. Thus, for both correctness and efficiency, we propose to use empirical knowledge of human perceptual limitations to automatically bound the error of approximate answers meant for visualization. This paper explores a preliminary model of sampling-based approximate query processing that uses perceptual models (encoded as functions) to construct approximate answers intended for visualization. We present initial results showing that the approximate and non-approximate answers for a given query differ by a perceptually indiscernible amount, as defined by perceptual functions.
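The proposal can be sketched as a stopping rule: keep enlarging the sample until the aggregate's confidence interval is narrower than what a viewer could perceive on screen. The threshold function below is an invented stand-in for the paper's empirically derived perceptual functions.

```python
import math
import random

random.seed(42)
population = [random.uniform(0, 10) for _ in range(100_000)]  # toy column

def perceptual_threshold(chart_height_px=200, value_range=10.0):
    """Largest error that stays under ~2 pixels on screen (illustrative rule)."""
    return 2.0 * value_range / chart_height_px  # 0.1 in data units

def sample_mean_until(threshold, batch=500, z=1.96):
    """Grow the sample until the 95% CI half-width drops below the threshold."""
    sample = []
    while True:
        sample.extend(random.sample(population, batch))
        n = len(sample)
        mean = sum(sample) / n
        var = sum((x - mean) ** 2 for x in sample) / (n - 1)
        half_width = z * math.sqrt(var / n)
        if half_width <= threshold or n >= len(population):
            return mean, half_width, n

mean, err, n = sample_mean_until(perceptual_threshold())
```

Under this rule the system samples only a few thousand of the 100,000 rows, yet the rendered bar is, by construction, within a perceptually indiscernible distance of the exact answer.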