Generating Adversarial Source Programs Using Important Tokens-based Structural Transformations
Pub Date : 2022-03-01  DOI: 10.1109/ICECCS54210.2022.00029
Penglong Chen, Zhuguo Li, Yu Wen, Lili Liu
Deep learning models have been widely used in source code processing tasks, such as code captioning, code summarization, code completion, and code classification. Recent studies have shown that deep learning-based source code processing models are vulnerable: attackers can generate adversarial examples by adding perturbations to source programs. Existing attack methods perturb a source program by renaming one or more variables in the program; they do not consider perturbations based on equivalent structural transformations of the source code. We propose a set of program transformations involving identifier renaming and structural transformations, which ensure that the perturbed program retains the original semantics while fooling the source code processing model into changing its original prediction. We propose a novel method of applying semantics-preserving structural transformations to attack source code processing models in the white-box setting. This is the first time that semantics-preserving structural transformations are applied to generate adversarial examples for source code processing models. We first find the important tokens in the program by calculating the contribution value of each part of the program, then select the best transformation for each important token to generate semantic adversarial examples. The experimental results show that the attack success rate of our method improves by 8.29% on average over the state-of-the-art attack method, and adversarial training with the adversarial examples generated by our method reduces the attack success rates against source code processing models by 21.79% on average.
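As a concrete illustration of the two perturbation classes described above, the sketch below shows a toy Java method alongside an identifier-renaming variant and a structurally transformed variant. The method, the variable names, and the for-to-while rewrite are illustrative assumptions, not examples taken from the paper.

```java
// Sketch of the two semantics-preserving perturbation classes, applied to a toy method.
// The concrete names (sumOriginal, tmpVar0, ...) are illustrative, not from the paper.
public class SemanticsPreservingPerturbations {

    // Original program: sums the elements of an array.
    static int sumOriginal(int[] xs) {
        int total = 0;
        for (int i = 0; i < xs.length; i++) {
            total += xs[i];
        }
        return total;
    }

    // Perturbation 1: identifier renaming. The semantics are unchanged,
    // but the token sequence seen by a source code processing model differs.
    static int sumRenamed(int[] xs) {
        int tmpVar0 = 0;
        for (int i = 0; i < xs.length; i++) {
            tmpVar0 += xs[i];
        }
        return tmpVar0;
    }

    // Perturbation 2: equivalent structural transformation (the for-loop rewritten
    // as a while-loop). Behaviour is identical, yet the syntactic structure,
    // and hence the AST the model sees, changes.
    static int sumRestructured(int[] xs) {
        int total = 0;
        int i = 0;
        while (i < xs.length) {
            total += xs[i];
            i++;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] xs = {1, 2, 3, 4};
        // All three variants compute the same result.
        System.out.println(sumOriginal(xs) + " " + sumRenamed(xs) + " " + sumRestructured(xs));
    }
}
```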
{"title":"Generating Adversarial Source Programs Using Important Tokens-based Structural Transformations","authors":"Penglong Chen, Zhuguo Li, Yu Wen, Lili Liu","doi":"10.1109/ICECCS54210.2022.00029","DOIUrl":"https://doi.org/10.1109/ICECCS54210.2022.00029","url":null,"abstract":"Deep learning models have been widely used in source code processing tasks, such as code captioning, code summarization, code completion, and code classification. Recent studies have shown that deep learning-based source code processing models are vulnerable. Attackers can generate adversarial examples by adding perturbations to source programs. Existing attack methods perturb a source program by renaming one or multiple variables in the program. These attack methods do not take into account the perturbation of the equivalent structural transformations of the source code. We propose a set of program transformations involving identifier renaming and structural transformations, which can ensure that the perturbed program retains the original semantics but can fool the source code processing model to change the original prediction result. We propose a novel method of applying semantics-preserving structural transformations to attack the source program pro-cessing model in the white-box setting. This is the first time that semantics-preserving structural transformations are applied to generate adversarial examples of source code processing models. We first find the important tokens in the program by calculating the contribution values of each part of the program, then select the best transformation for each important token to generate semantic adversarial examples. The experimental results show that the attack success rate of our attack method can improve 8.29 % on average compared with the state-of-the-art attack method; adversarial training using the adversarial examples generated by our attack method can reduce the attack success rates of source code processing models by 21.79% on average.","PeriodicalId":344493,"journal":{"name":"2022 26th International Conference on Engineering of Complex Computer Systems (ICECCS)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126503108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Parameterized Design and Formal Verification of Multi-ported Memory
Pub Date : 2022-03-01  DOI: 10.1109/ICECCS54210.2022.00013
Mufan Xiang, Yongjian Li, Sijun Tan, Yongxin Zhao, Yiwei Chi
Multi-ported memories are essential modules that provide parallel access for high-performance parallel computation systems such as VLIW and vector processors. However, the design of multi-ported memories is rather complex and error-prone, which usually leads to high implementation cost; their design and verification are therefore challenging. In this paper, we first present a modular and parameterized approach based on Chisel to design and implement multi-ported memory concisely. Furthermore, to verify the correctness of the design, we formalize the properties of the memories' multi-write-read operations as generalized symbolic trajectory evaluation (GSTE) assertion graphs and verify them with two approaches: one based on SystemVerilog Assertions (SVA) and one based on GSTE. Our verification through SVA and STE/GSTE successfully finds an error caused by misusing one parameter in our high-level design.
{"title":"Parameterized Design and Formal Verification of Multi-ported Memory","authors":"Mufan Xiang, Yongjian Li, Sijun Tan, Yongxin Zhao, Yiwei Chi","doi":"10.1109/ICECCS54210.2022.00013","DOIUrl":"https://doi.org/10.1109/ICECCS54210.2022.00013","url":null,"abstract":"Multi-ported memories are essential modules to provide parallel access for high-performance parallel computation systems such as VLIW and vector processors, etc. However, the design of multi-ported memories are rather complex and error-prone, which usually causes the high implementation cost. Therefore, the designs and verification of multi-ported memories become challenging. In this paper, we firstly present a modular and parameterized approach based on Chisel to design and implement multi-ported memory concisely. Furthermore, to verify the correctness of the design, we formalize properties of multi-write-read operations of the memories by generalized symbolic trajectory assertion (GSTE) graphs and verified them by two kinds of approaches: SystemVerilog Assertions-based, and GSTE-based approaches. Our verification through SVA and STE/GSTE successfully finds an error caused by misusing one parameter in our high-level design.","PeriodicalId":344493,"journal":{"name":"2022 26th International Conference on Engineering of Complex Computer Systems (ICECCS)","volume":"225 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115492710","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Combining Global and Local Representations of Source Code for Method Naming
Pub Date : 2022-03-01  DOI: 10.1109/ICECCS54210.2022.00026
Cong Zhou, Li Kuang
Code is a complex kind of data. Recent models learn code representations using either global or local aggregation. Global encoding allows all tokens of the code to be connected directly but neglects the graph structure; local encoding focuses on neighboring nodes when capturing the graph structure but fails to capture long-range dependencies. In this work, we combine both encoding strategies and investigate different models that integrate global and local representations of code in order to learn code representations better. Specifically, we modify the layer structure of the sequence-to-sequence model to incorporate a structured model in the encoder and decoder, respectively. To explore different integration strategies, we propose four models for method naming. In an extensive evaluation, we demonstrate that our models achieve a significant improvement on a well-studied method-naming dataset, reaching a ROUGE-1 score of 54.1, a ROUGE-2 score of 26.7, and a ROUGE-L score of 54.3, outperforming state-of-the-art models by 2.7, 1.7, and 4.3 points, respectively. Our data and code are available at https://github.com/zc-work/CGLNaming.
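The ROUGE scores reported above measure n-gram overlap between predicted and reference method names. The sketch below shows one common formulation of ROUGE-1 as a unigram-overlap F1 over lower-cased name subtokens; the subtoken splitting and the F1 variant are assumptions for illustration and may differ from the paper's exact evaluation script.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hedged sketch: ROUGE-1 as clipped unigram-overlap F1 between a predicted and a
// reference method name, both split into lower-cased subtokens beforehand.
public class Rouge1Example {

    static double rouge1F1(List<String> predicted, List<String> reference) {
        // Count overlapping unigrams, clipped by how often each token occurs in the reference.
        Map<String, Integer> refCounts = new HashMap<>();
        for (String t : reference) refCounts.merge(t, 1, Integer::sum);
        int overlap = 0;
        for (String t : predicted) {
            Integer c = refCounts.get(t);
            if (c != null && c > 0) {
                overlap++;
                refCounts.put(t, c - 1);
            }
        }
        if (predicted.isEmpty() || reference.isEmpty() || overlap == 0) return 0.0;
        double precision = (double) overlap / predicted.size();
        double recall = (double) overlap / reference.size();
        return 2 * precision * recall / (precision + recall);
    }

    public static void main(String[] args) {
        // Reference name "getFileName" vs. the prediction "get name":
        // two of three reference subtokens are matched.
        List<String> reference = Arrays.asList("get", "file", "name");
        List<String> predicted = Arrays.asList("get", "name");
        System.out.printf("ROUGE-1 F1 = %.3f%n", rouge1F1(predicted, reference)); // 0.800
    }
}
```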
{"title":"Combining Global and Local Representations of Source Code for Method Naming","authors":"Cong Zhou, Li Kuang","doi":"10.1109/ICECCS54210.2022.00026","DOIUrl":"https://doi.org/10.1109/ICECCS54210.2022.00026","url":null,"abstract":"Code is a kind of complex data. Recent models learn code representation using global or local aggregation. Global encoding allows all tokens of code to be connected directly and neglects the graph structure. Local encoding focuses on the neighbor nodes when capturing the graph structure but fails to capture long dependencies. In this work, we gather both encoding strategies and investigate different models that combine both global and local representations of code in order to learn code representation better. Specifically, we modify the layer structure based on the sequence-to-sequence model to incorporate a structured model in the encoder and decoder parts, respectively. To further consider different integration ways, we propose four models for method naming. In an extensive evaluation, we demonstrate that our models have a significant improvement on a well-studied dataset of method naming, achieving ROUGE-1 score of 54.1, ROUGE-2 score of 26.7, and ROUGE-L score of 54.3, outperforming state-of-the-art models by 2.7, 1.7, and 4.3 points, respectively. Our data and code are available at https://github.com/zc-work/CGLNaming.","PeriodicalId":344493,"journal":{"name":"2022 26th International Conference on Engineering of Complex Computer Systems (ICECCS)","volume":"24 54","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120842089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Extending Tensor Virtual Machine to Support Deep-Learning Accelerators with Convolution Cores
Pub Date : 2022-03-01  DOI: 10.1109/ICECCS54210.2022.00031
Yanzhao Wang, Fei Xie
Deep-learning accelerators are increasingly popular. There are two prevalent accelerator architectures: one based on general matrix multiplication units and the other on convolution cores. However, Tensor Virtual Machine (TVM), a widely used deep-learning compiler stack, does not support the latter. This paper proposes a general framework for extending TVM to support deep-learning accelerators with convolution cores. We have successfully applied it to two well-known accelerators: Nvidia's NVDLA and Bitmain's BM1880. Deep-learning workloads can now be readily deployed to these accelerators through TVM and executed efficiently. This framework can extend TVM to other accelerators with minimal effort.
{"title":"Extending Tensor Virtual Machine to Support Deep-Learning Accelerators with Convolution Cores","authors":"Yanzhao Wang, Fei Xie","doi":"10.1109/ICECCS54210.2022.00031","DOIUrl":"https://doi.org/10.1109/ICECCS54210.2022.00031","url":null,"abstract":"Deep-learning accelerators are increasingly popular. There are two prevalent accelerator architectures: one based on general matrix multiplication units and the other on convolution cores. However, Tensor Virtual Machine (TVM), a widely used deep-learning compiler stack, does not support the latter. This paper proposes a general framework for extending TVM to support deep-learning accelerators with convolution cores. We have applied it to two well-known accelerators: Nvidia's NVDLA and Bitmain's BM1880 successfully. Deep-learning workloads can now be readily deployed to these accelerators through TVM and executed efficiently. This framework can extend TVM to other accelerators with minimum effort.","PeriodicalId":344493,"journal":{"name":"2022 26th International Conference on Engineering of Complex Computer Systems (ICECCS)","volume":"110 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126003750","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optimizing Parallel Java Streams
Pub Date : 2022-03-01  DOI: 10.1109/ICECCS54210.2022.00012
Matteo Basso, F. Schiavio, Andrea Rosà, Walter Binder
The Java Stream API increases developer productivity and greatly simplifies exploiting parallel computation by providing a high-level abstraction on top of complex data processing, parallelization, and synchronization algorithms. However, the usage of the Java Stream API often incurs significant runtime overhead. Method inlining and the automated translation of code using the Java Stream API into imperative code using loops can reduce such overhead; however, existing approaches and tools are applicable only to sequential stream pipelines, leaving the optimization of parallel streams an open issue. We bridge this gap by presenting a novel method to exploit high-level static analysis to characterize stream pipelines, detect parallel streams, and apply transformations removing the abstraction overhead. We evaluate our method on a set of benchmarks, showing that our approach significantly reduces execution time and memory allocation.
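To make the abstraction overhead concrete, the sketch below contrasts a parallel stream pipeline with a hand-written imperative loop computing the same result. The pipeline is a made-up example, not one of the paper's benchmarks, and the loop shown is the sequential translation; a faithful translation of the parallel pipeline would also need range partitioning and synchronization of partial results, which is what makes optimizing parallel streams harder than sequential ones.

```java
import java.util.stream.IntStream;

// Toy example of the kind of transformation discussed above: a parallel stream
// pipeline and an imperative loop that compute the same result.
public class StreamVsLoop {

    // High-level parallel stream pipeline: sum of squares of the even numbers in [0, n).
    static long withParallelStream(int n) {
        return IntStream.range(0, n)
                .parallel()
                .filter(x -> x % 2 == 0)
                .mapToLong(x -> (long) x * x)
                .sum();
    }

    // Equivalent imperative (sequential) loop, free of the stream abstraction overhead.
    static long withLoop(int n) {
        long sum = 0;
        for (int x = 0; x < n; x++) {
            if (x % 2 == 0) {
                sum += (long) x * x;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println(withParallelStream(n) == withLoop(n)); // true
    }
}
```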
{"title":"Optimizing Parallel Java Streams","authors":"Matteo Basso, F. Schiavio, Andrea Rosà, Walter Binder","doi":"10.1109/ICECCS54210.2022.00012","DOIUrl":"https://doi.org/10.1109/ICECCS54210.2022.00012","url":null,"abstract":"The Java Stream API increases developer produc-tivity and greatly simplifies exploiting parallel computation by providing a high-level abstraction on top of complex data pro-cessing, parallelization, and synchronization algorithms. However, the usage of the Java Stream API often incurs significant runtime overhead. Method inlining and the automated translation of code using the Java Stream API into imperative code using loops can reduce such overhead; however, existing approaches and tools are applicable only to sequential stream pipelines, leaving the optimization of parallel streams an open issue. We bridge this gap by presenting a novel method to exploit high-level static analysis to characterize stream pipelines, detect parallel streams, and apply transformations removing the abstraction overhead. We evaluate our method on a set of benchmarks, showing that our approach significantly reduces execution time and memory allocation.","PeriodicalId":344493,"journal":{"name":"2022 26th International Conference on Engineering of Complex Computer Systems (ICECCS)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114907929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Minimal Schedule with Minimal Number of Agents in Attack-Defence Trees
Pub Date : 2021-01-18  DOI: 10.1109/ICECCS54210.2022.00009
Jaime Arias, W. Penczek, L. Petrucci, Teofil Sidoruk
Expressing attack-defence trees in a multi-agent setting allows for studying a new aspect of security scenarios, namely how the number of agents and their task assignment impact the performance (e.g., attack time) of strategies executed by opposing coalitions. Optimal scheduling of agents' actions is a non-trivial problem and thus vital. We discuss the associated caveats and propose an algorithm that synthesises such an assignment, targeting minimal attack time while using a minimal number of agents for a given attack-defence tree.
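To give a flavour of the scheduling trade-off, the heavily simplified sketch below evaluates a plain attack tree: defences, agent reuse across phases, and the paper's actual synthesis algorithm are all omitted. AND-nodes run their children in parallel, OR-nodes take their fastest child, and the agent count is the number of concurrently acting agents that schedule employs. All node names and durations are hypothetical.

```java
// Simplified attack-tree evaluation sketch, NOT the paper's algorithm:
// leaves are timed attacker actions carried out by one agent each,
// AND-nodes finish when their slowest child finishes (children run in parallel),
// OR-nodes pick their fastest child.
public class AttackTreeSketch {

    static class Result {
        final int time, agents;
        Result(int time, int agents) { this.time = time; this.agents = agents; }
    }

    interface Node { Result evaluate(); }

    static Node leaf(String action, int duration) {
        return () -> new Result(duration, 1); // a single agent performs the action
    }

    static Node and(Node... children) {
        return () -> {
            int time = 0, agents = 0;
            for (Node c : children) {
                Result r = c.evaluate();
                time = Math.max(time, r.time); // branches run in parallel
                agents += r.agents;            // each branch needs its own agents
            }
            return new Result(time, agents);
        };
    }

    static Node or(Node... children) {
        return () -> {
            Result best = null;
            for (Node c : children) {
                Result r = c.evaluate();
                if (best == null || r.time < best.time) best = r; // fastest branch wins
            }
            return best;
        };
    }

    public static void main(String[] args) {
        // (phish OR bribe) AND break-in, with hypothetical durations in minutes.
        Node tree = and(or(leaf("phish", 30), leaf("bribe", 10)), leaf("break-in", 25));
        Result r = tree.evaluate();
        System.out.println("attack time = " + r.time + ", agents = " + r.agents); // 25, 2
    }
}
```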
{"title":"Minimal Schedule with Minimal Number of Agents in Attack-Defence Trees","authors":"Jaime Arias, W. Penczek, L. Petrucci, Teofil Sidoruk","doi":"10.1109/ICECCS54210.2022.00009","DOIUrl":"https://doi.org/10.1109/ICECCS54210.2022.00009","url":null,"abstract":"Expressing attack-defence trees in a multi-agent setting allows for studying a new aspect of security scenarios, namely how the number of agents and their task assignment impact the performance, e.g. attack time, of strategies executed by opposing coalitions. Optimal scheduling of agents' actions, a non-trivial problem, is thus vital. We discuss associated caveats and propose an algorithm that synthesises such an assignment, targeting minimal attack time and using minimal number of agents for a given attack-defence tree.","PeriodicalId":344493,"journal":{"name":"2022 26th International Conference on Engineering of Complex Computer Systems (ICECCS)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123454881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Formalism-Driven Development of Decentralized Systems
Pub Date : 2020-12-08  DOI: 10.1109/ICECCS54210.2022.00018
Yepeng Ding, Hiroyuki Sato
Decentralized systems have been widely developed and applied to address security and privacy issues in centralized systems, especially since the advancement of distributed ledger technology. However, it is challenging to ensure that they function correctly with respect to their designs and to minimize technical risk before delivery. Although formal methods have made significant progress over the past decades, a feasible formal-methods-based solution viewed from a development-process perspective has not been well developed. In this paper, we formulate an iterative and incremental development process, named formalism-driven development (FDD), for developing provably correct decentralized systems under the guidance of formal methods. We also present a framework named Seniz that puts FDD into practice with a new modeling language and scaffolds. Furthermore, we conduct case studies to demonstrate the effectiveness of FDD in practice with the support of Seniz.
{"title":"Formalism- Driven Development of Decentralized Systems","authors":"Yepeng Ding, Hiroyuki Sato","doi":"10.1109/ICECCS54210.2022.00018","DOIUrl":"https://doi.org/10.1109/ICECCS54210.2022.00018","url":null,"abstract":"Decentralized systems have been widely developed and applied to address security and privacy issues in centralized systems, especially since the advancement of distributed ledger technology. However, it is challenging to ensure their correct functioning with respect to their designs and minimize the technical risk before the delivery. Although formal methods have made significant progress over the past decades, a feasible solution based on formal methods from a development process perspective has not been well developed. In this paper, we formulate an iterative and incremental development process, named formalism-driven development (FDD), for developing provably correct decentralized systems under the guidance of formal methods. We also present a framework named Seniz, to practicalize FDD with a new modeling language and scaffolds. Furthermore, we conduct case studies to demonstrate the effectiveness of FDD in practice with the support of Seniz.","PeriodicalId":344493,"journal":{"name":"2022 26th International Conference on Engineering of Complex Computer Systems (ICECCS)","volume":"152 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116338440","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}