Title: Chart-to-text generation using a hybrid deep network
Authors: Nontaporn Wonglek, Siriwalai Maneesinthu, Sivakorn Srichaiyaperk, Teerapon Saengmuang, Thitirat Siriborvornratanakul
Journal: Advances in computational intelligence, vol. 3, no. 5
Published: 2023-11-02 (Journal Article)
DOI: 10.1007/s43674-023-00066-y
URL: https://link.springer.com/article/10.1007/s43674-023-00066-y
Citations: 0
Abstract
Text generation from charts is the task of automatically producing natural language descriptions of data presented in chart form. This capability is useful for summarizing data for presentation or for providing alternative representations of data for accessibility. In this work, we propose a hybrid deep network approach for text generation from table images in an academic format. The input to the model is a table image, which is first processed with Tesseract OCR (optical character recognition) to extract the underlying data. The extracted data are then passed to a Transformer-based model (T5 or K2T) to generate the final text output. We evaluate the performance of our model on a dataset of academic papers. The results show that our network is able to generate high-quality text descriptions of charts: the average BLEU scores are 0.072355 for T5 and 0.037907 for K2T. These results demonstrate the effectiveness of the hybrid deep network approach for text generation from table images in an academic format.
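The two-stage pipeline described in the abstract (OCR on the table image, then seq2seq text generation) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the function names, the `row: ... | ...` linearization format, and the use of `t5-small` are all assumptions. The OCR and generation steps require the third-party `pytesseract`, `Pillow`, and Hugging Face `transformers` packages, so their imports are deferred inside the functions that need them.

```python
def ocr_table(image_path):
    """Stage 1 (sketch): extract raw text from a table image with Tesseract.

    Assumes pytesseract and Pillow are installed; the paper uses Tesseract OCR
    but does not specify the exact invocation.
    """
    import pytesseract
    from PIL import Image
    return pytesseract.image_to_string(Image.open(image_path))

def linearize_rows(rows):
    """Flatten parsed table rows into one prompt string for a seq2seq model.

    The "row: cell | cell" format is a hypothetical linearization, chosen only
    to illustrate bridging OCR output to a text-to-text Transformer.
    """
    return " ".join("row: " + " | ".join(cells) for cells in rows)

def generate_description(prompt):
    """Stage 2 (sketch): generate a description with a T5 checkpoint.

    Assumes Hugging Face transformers is installed; the checkpoint name and
    "summarize:" prefix are illustrative assumptions.
    """
    from transformers import T5ForConditionalGeneration, T5Tokenizer
    tok = T5Tokenizer.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained("t5-small")
    ids = tok("summarize: " + prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_length=64)
    return tok.decode(out[0], skip_special_tokens=True)

# Linearizing a small table of the paper's reported BLEU scores:
rows = [["model", "BLEU"], ["T5", "0.072355"], ["K2T", "0.037907"]]
print(linearize_rows(rows))
# → row: model | BLEU row: T5 | 0.072355 row: K2T | 0.037907
```

In this sketch the end-to-end call would be `generate_description(linearize_rows(parsed_rows))`; how the raw OCR string is parsed into rows is left out, since the paper does not describe that step here.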