Chart-to-text generation using a hybrid deep network

Nontaporn Wonglek, Siriwalai Maneesinthu, Sivakorn Srichaiyaperk, Teerapon Saengmuang, Thitirat Siriborvornratanakul
Advances in computational intelligence, volume 3, issue 5. Published 2023-11-02. DOI: 10.1007/s43674-023-00066-y. https://link.springer.com/article/10.1007/s43674-023-00066-y

Abstract



Text generation from charts is a task that involves automatically generating natural language text descriptions of data presented in chart form. This is a useful capability for tasks such as summarizing data for presentation or providing alternative representations of data for accessibility. In this work, we propose a hybrid deep network approach for text generation from table images in an academic format. The input to the model is a table image, which is first processed using Tesseract OCR (optical character recognition) to extract the data. The data are then passed through a Transformer (i.e., T5, K2T) model to generate the final text output. We evaluate the performance of our model on a dataset of academic papers. Results show that our network is able to generate high-quality text descriptions of charts. Specifically, the average BLEU scores are 0.072355 for T5 and 0.037907 for K2T. Our results demonstrate the effectiveness of the hybrid deep network approach for text generation from table images in an academic format.
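For readers unfamiliar with the metric behind the reported scores, the following is a minimal sketch of sentence-level BLEU: the geometric mean of clipped n-gram precisions multiplied by a brevity penalty. The abstract does not state the n-gram order, tokenization, or smoothing the authors used, so the whitespace tokenization, the 4-gram limit, and the lack of smoothing here are assumptions, and the function names are illustrative only. The example sentences are made up and are not taken from the paper's dataset.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, candidate, max_n=4):
    """Unsmoothed BLEU of one candidate against one reference, in [0, 1].

    Assumes whitespace tokenization and uniform weights over 1..max_n grams.
    """
    ref, cand = reference.split(), candidate.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            # Without smoothing, one empty precision zeroes the whole score.
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: candidates shorter than the reference are penalized.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    # Hypothetical reference and generated descriptions of a results table.
    reference = "the model achieves the best accuracy on the test set"
    candidate = "the model achieves high accuracy on the test set"
    print(sentence_bleu(reference, candidate))
```

Averaging such per-sentence scores over a test set gives a single figure comparable in kind to the averages quoted above, although library implementations (with their own tokenization and smoothing choices) may produce different values for the same text.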
