Title: Chart-to-text generation using a hybrid deep network
Authors: Nontaporn Wonglek, Siriwalai Maneesinthu, Sivakorn Srichaiyaperk, Teerapon Saengmuang, Thitirat Siriborvornratanakul
Journal: Advances in computational intelligence, vol. 3, no. 5
Published: 2023-11-02 (Journal Article)
DOI: 10.1007/s43674-023-00066-y
URL: https://link.springer.com/article/10.1007/s43674-023-00066-y
Citations: 0
Abstract
Text generation from charts is the task of automatically producing natural language descriptions of data presented in chart form. This capability is useful for summarizing data for presentation or for providing alternative representations of data for accessibility. In this work, we propose a hybrid deep network approach for text generation from table images in an academic format. The input to the model is a table image, which is first processed with Tesseract OCR (optical character recognition) to extract the underlying data. The extracted data are then passed to a Transformer-based model (T5 or K2T) to generate the final text output. We evaluate the performance of our model on a dataset of academic papers. The results show that our network is able to generate high-quality text descriptions of charts: the average BLEU scores are 0.072355 for T5 and 0.037907 for K2T. These results demonstrate the effectiveness of the hybrid deep network approach for text generation from table images in an academic format.
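The two-stage pipeline described in the abstract (OCR on the table image, then seq2seq text generation) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the function names, the `row: ... | ...` linearization format, and the use of `t5-small` are all assumptions. The OCR and generation steps require the third-party `pytesseract`, `Pillow`, and Hugging Face `transformers` packages, so their imports are deferred inside the functions that need them.

```python
def ocr_table(image_path):
    """Stage 1 (sketch): extract raw text from a table image with Tesseract.

    Assumes pytesseract and Pillow are installed; the paper uses Tesseract OCR
    but does not specify the exact invocation.
    """
    import pytesseract
    from PIL import Image
    return pytesseract.image_to_string(Image.open(image_path))

def linearize_rows(rows):
    """Flatten parsed table rows into one prompt string for a seq2seq model.

    The "row: cell | cell" format is a hypothetical linearization, chosen only
    to illustrate bridging OCR output to a text-to-text Transformer.
    """
    return " ".join("row: " + " | ".join(cells) for cells in rows)

def generate_description(prompt):
    """Stage 2 (sketch): generate a description with a T5 checkpoint.

    Assumes Hugging Face transformers is installed; the checkpoint name and
    "summarize:" prefix are illustrative assumptions.
    """
    from transformers import T5ForConditionalGeneration, T5Tokenizer
    tok = T5Tokenizer.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained("t5-small")
    ids = tok("summarize: " + prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_length=64)
    return tok.decode(out[0], skip_special_tokens=True)

# Linearizing a small table of the paper's reported BLEU scores:
rows = [["model", "BLEU"], ["T5", "0.072355"], ["K2T", "0.037907"]]
print(linearize_rows(rows))
# → row: model | BLEU row: T5 | 0.072355 row: K2T | 0.037907
```

In this sketch the end-to-end call would be `generate_description(linearize_rows(parsed_rows))`; how the raw OCR string is parsed into rows is left out, since the paper does not describe that step here.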