{"title":"fAST: Flattening Abstract Syntax Trees for Efficiency","authors":"Yijun Yu","doi":"10.1109/ICSE-Companion.2019.00113","DOIUrl":null,"url":null,"abstract":"Frequently source code analysis tools need to exchange internal representations of abstract syntax trees (AST) with each other. Conveniently, and intuitively, the externalised representations are in the form of hierarchical trees. We argue, counter-intuitively, that hierarchical representation is not the most efficient way for source analysis tools to exchange parsed AST. In this work, we propose to speed up AST parsing whilst preserving the equivalence of hierarchies in binary forms: (1) AST could be saved as a flat one-dimensional array where pointers to tree nodes are converted into integer offsets, and (2) such flattened AST are more efficient to access by programming tools through the generated application programming interfaces (API). In programming language-agnostic evaluations, we show that parsing flattened AST becomes 100x faster than in textual form AST on a benchmark of open-source projects of 6 different programming languages.","PeriodicalId":273100,"journal":{"name":"2019 IEEE/ACM 41st International Conference on Software Engineering: Companion Proceedings (ICSE-Companion)","volume":"2016 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE/ACM 41st International Conference on Software Engineering: Companion Proceedings (ICSE-Companion)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSE-Companion.2019.00113","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 9
Abstract
Frequently source code analysis tools need to exchange internal representations of abstract syntax trees (AST) with each other. Conveniently, and intuitively, the externalised representations are in the form of hierarchical trees. We argue, counter-intuitively, that hierarchical representation is not the most efficient way for source analysis tools to exchange parsed AST. In this work, we propose to speed up AST parsing whilst preserving the equivalence of hierarchies in binary forms: (1) AST could be saved as a flat one-dimensional array where pointers to tree nodes are converted into integer offsets, and (2) such flattened AST are more efficient to access by programming tools through the generated application programming interfaces (API). In programming language-agnostic evaluations, we show that parsing flattened AST becomes 100x faster than in textual form AST on a benchmark of open-source projects of 6 different programming languages.