{"title":"Swin-chart: An efficient approach for chart classification","authors":"Anurag Dhote , Mohammed Javed , David S. Doermann","doi":"10.1016/j.patrec.2024.08.012","DOIUrl":null,"url":null,"abstract":"<div><p>Charts are a visualization tool used in scientific documents to facilitate easy comprehension of complex relationships underlying data and experiments. Researchers use various chart types to convey scientific information, so the problem of data extraction and subsequent chart understanding becomes very challenging. Many studies have been taken up in the literature to address the problem of chart mining, whose motivation is to facilitate the editing of existing charts, carry out extrapolative studies, and provide a deeper understanding of the underlying data. The first step towards chart understanding is chart classification, for which traditional ML and CNN-based deep learning models have been used in the literature. In this paper, we propose Swin-Chart, a Swin transformer-based deep learning approach for chart classification, which generalizes well across multiple datasets with a wide range of chart categories. Swin-Chart comprises a pre-trained Swin Transformer, a finetuning component, and a weight averaging component. The proposed approach is tested on a five-chart image benchmark dataset. We observed that the Swin-Chart model outperformers existing state-of-the-art models on all the datasets. Furthermore, we also provide an ablation study of the Swin-Chart model with all five datasets to understand the importance of various sub-parts such as the back-bone Swin transformer model, the value of several best weights selected for the weight averaging component, and the presence of the weight averaging component itself.</p><p>The Swin-Chart model also received first position in the chart classification task on the latest dataset in the CHART Infographics competition at ICDAR 2023 - chartinfo.github.io.</p></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"185 ","pages":"Pages 203-209"},"PeriodicalIF":3.9000,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Pattern Recognition Letters","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167865524002447","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Charts are a visualization tool used in scientific documents to facilitate easy comprehension of complex relationships underlying data and experiments. Researchers use various chart types to convey scientific information, so the problem of data extraction and subsequent chart understanding becomes very challenging. Many studies have been taken up in the literature to address the problem of chart mining, whose motivation is to facilitate the editing of existing charts, carry out extrapolative studies, and provide a deeper understanding of the underlying data. The first step towards chart understanding is chart classification, for which traditional ML and CNN-based deep learning models have been used in the literature. In this paper, we propose Swin-Chart, a Swin transformer-based deep learning approach for chart classification, which generalizes well across multiple datasets with a wide range of chart categories. Swin-Chart comprises a pre-trained Swin Transformer, a finetuning component, and a weight averaging component. The proposed approach is tested on a five-chart image benchmark dataset. We observed that the Swin-Chart model outperformers existing state-of-the-art models on all the datasets. Furthermore, we also provide an ablation study of the Swin-Chart model with all five datasets to understand the importance of various sub-parts such as the back-bone Swin transformer model, the value of several best weights selected for the weight averaging component, and the presence of the weight averaging component itself.
The Swin-Chart model also received first position in the chart classification task on the latest dataset in the CHART Infographics competition at ICDAR 2023 - chartinfo.github.io.
期刊介绍:
Pattern Recognition Letters aims at rapid publication of concise articles of a broad interest in pattern recognition.
Subject areas include all the current fields of interest represented by the Technical Committees of the International Association of Pattern Recognition, and other developing themes involving learning and recognition.