Collecting and Characterizing Natural Language Utterances for Specifying Data Visualizations

Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. Pub Date: 2021-05-06. DOI: 10.1145/3411764.3445400

Arjun Srinivasan, Nikhila Nyapathy, Bongshin Lee, S. Drucker, J. Stasko

Citations: 37

Abstract

Natural language interfaces (NLIs) for data visualization are becoming increasingly popular both in academic research and in commercial software. Yet, there is a lack of empirical understanding of how people specify visualizations through natural language. We conducted an online study (N = 102), showing participants a series of visualizations and asking them to provide utterances they would pose to generate the displayed charts. From the responses, we curated a dataset of 893 utterances and characterized the utterances according to (1) their phrasing (e.g., commands, queries, questions) and (2) the information they contained (e.g., chart types, data aggregations). To help guide future research and development, we contribute this utterance dataset and discuss its applications toward the creation and benchmarking of NLIs for visualization.
