Yu Fu, Shunan Guo, Jane Hoffswell, Victor S. Bursztyn, Ryan Rossi, John Stasko
arXiv - CS - Human-Computer Interaction, published 2024-09-16, doi: arxiv-2409.10713
"The Data Says Otherwise"-Towards Automated Fact-checking and Communication of Data Claims
Fact-checking data claims requires data evidence retrieval and analysis,
which can become tedious and intractable when done manually. This work presents
Aletheia, an automated fact-checking prototype designed to facilitate data
claims verification and enhance data evidence communication. For verification,
we use a pre-trained LLM to parse the semantics of a claim for evidence retrieval. To
effectively communicate the data evidence, we design representations in two
forms: data tables and visualizations, tailored to various data fact types.
Additionally, we design interactions that showcase a real-world application of
these techniques. We evaluate the performance of two core NLP tasks with a
curated dataset comprising 400 data claims and compare the two representation
forms regarding viewers' assessment time, confidence, and preference via a user
study with 20 participants. The evaluation offers insights into the feasibility
and bottlenecks of using LLMs for data fact-checking tasks, potential
advantages and disadvantages of using visualizations over data tables, and
design recommendations for presenting data evidence.