Parsing and Summarizing Infographics with Synthetically Trained Icon Detection

2021 IEEE 14th Pacific Visualization Symposium (PacificVis) | Pub Date: 2021-04-01 | DOI: 10.1109/PacificVis52677.2021.00012

Spandan Madan, Z. Bylinskii, C. Nobre, Matthew Tancik, Adrià Recasens, Kimberli Zhong, Sami Alsheikh, A. Oliva, F. Durand, H. Pfister


Citations: 2



Parsing and Summarizing Infographics with Synthetically Trained Icon Detection

Widely used in news, business, and educational media, infographics are handcrafted to effectively communicate messages about complex and often abstract topics including ‘ways to conserve the environment’ and ‘coronavirus prevention’. The computational understanding of infographics required for future applications such as automatic captioning, summarization, search, and question answering will depend on being able to parse the visual and textual elements they contain. However, being composed of stylistically and semantically diverse visual and textual elements, infographics pose challenges for current AI systems. While automatic text extraction works reasonably well on infographics, standard object detection algorithms fail to identify the stand-alone visual elements in infographics that we refer to as ‘icons’. In this paper, we propose a novel approach to train an object detector using synthetically generated data, and show that it generalizes to detecting icons within in-the-wild infographics. We further pair our icon detection approach with an icon classifier and a state-of-the-art text detector to demonstrate three applications: topic prediction, multi-modal summarization, and multi-modal search. Parsing the visual and textual elements within infographics is a first step towards automatic infographic understanding.
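The abstract does not say how the synthetic training data is generated. As an illustration only, one common way to build synthetic data for object detection is to place object images at random positions on a background canvas and record each position as a ground-truth bounding box. The sketch below covers only the placement step, under that assumption; the `Box` class, the `place_icons` function, the canvas size, and the icon sizes are hypothetical names and values, not taken from the paper.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixels (hypothetical annotation format)."""
    x: int
    y: int
    w: int
    h: int

    def overlaps(self, other: "Box") -> bool:
        # Two boxes overlap unless one lies entirely to one side of the other.
        return not (
            self.x + self.w <= other.x
            or other.x + other.w <= self.x
            or self.y + self.h <= other.y
            or other.y + other.h <= self.y
        )


def place_icons(canvas_w, canvas_h, icon_sizes, rng, max_tries=100):
    """Choose a non-overlapping position for each icon and return the boxes.

    Icons that do not fit, or that cannot be placed within max_tries
    random attempts, are skipped.
    """
    placed = []
    for w, h in icon_sizes:
        if w > canvas_w or h > canvas_h:
            continue  # icon is larger than the canvas
        for _ in range(max_tries):
            candidate = Box(
                rng.randint(0, canvas_w - w),
                rng.randint(0, canvas_h - h),
                w,
                h,
            )
            if not any(candidate.overlaps(b) for b in placed):
                placed.append(candidate)
                break
    return placed


# Fixed seed so the same synthetic layout is produced on every run.
rng = random.Random(0)
boxes = place_icons(800, 1200, [(64, 64), (96, 96), (128, 80)], rng)
for box in boxes:
    print(box)
```

In a full pipeline, each returned box would be paired with the pasted icon image to form one labeled training example for the detector; rendering the images and training the detector are outside the scope of this sketch.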
