Hadia Showkat Kawoosa, Muhammad Suhaib Kanroo, P. Goyal
Proceedings of the ACM Symposium on Document Engineering 2023. Published 2023-08-22. DOI: 10.1145/3573128.3609355 (https://doi.org/10.1145/3573128.3609355)
LYLAA: A Lightweight YOLO based Legend and Axis Analysis method for CHART-Infographics
Chart Data Extraction (CDE) is a complex document-analysis task: it recovers the underlying data from charts so that the data can be used in applications such as document mining, medical diagnosis, and accessibility for the visually impaired. CDE is challenging because charts have an intricate structure and specific semantics, combining elements such as the title, axes, legend, and plot elements. Existing CDE solutions have not yet satisfactorily addressed these challenges. In this paper, we focus on two critical CDE subtasks, Legend Analysis and Axis Analysis, and present a lightweight YOLO-based method for detection together with domain-specific heuristic algorithms (Axis Matching and Legend Matching) for matching. We evaluate our proposed method, LYLAA, on a real-world dataset, the ICPR2022 UB PMC dataset, and observe promising results compared with the teams that competed in the ICPR2022 CHART-Infographics competition. Our findings showcase the potential of the proposed method in the CDE process.
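The abstract does not spell out how the Axis Matching and Legend Matching heuristics work, so the following is only a hypothetical illustration of what a matching step after detection might look like, not the authors' algorithm. It assumes the detector has already produced bounding boxes for legend text labels and legend markers, and pairs them greedily by centre distance. The names `Box`, `center_distance`, and `match_nearest` are invented for this sketch.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def center(self):
        return ((self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2)


def center_distance(a, b):
    """Euclidean distance between the centres of two boxes."""
    (ax, ay), (bx, by) = a.center, b.center
    return hypot(ax - bx, ay - by)


def match_nearest(labels, targets):
    """Greedy one-to-one matching (an assumed heuristic, not the paper's).

    Repeatedly pairs the closest still-unmatched (label, target) boxes.
    Returns a dict mapping label index -> target index.
    """
    candidates = sorted(
        (center_distance(label, target), i, j)
        for i, label in enumerate(labels)
        for j, target in enumerate(targets)
    )
    used_labels, used_targets, result = set(), set(), {}
    for _, i, j in candidates:
        if i not in used_labels and j not in used_targets:
            result[i] = j
            used_labels.add(i)
            used_targets.add(j)
    return result


# Two legend text boxes and two marker boxes, listed in opposite vertical order.
legend_labels = [Box(120, 10, 180, 24), Box(120, 30, 180, 44)]
legend_markers = [Box(100, 32, 112, 42), Box(100, 12, 112, 22)]
print(match_nearest(legend_labels, legend_markers))
```

The same function could in principle be reused for pairing tick labels with tick marks on an axis, although a practical heuristic would likely also take the axis orientation and the reading order into account.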