Eth2Vec: Learning Contract-Wide Code Representations for Vulnerability Detection on Ethereum Smart Contracts

Proceedings of the 3rd ACM International Symposium on Blockchain and Secure Critical Infrastructure
Pub Date: 2021-01-07
DOI: 10.1145/3457337.3457841

Nami Ashizawa, Naoto Yanai, Jason Paul Cruz, Shingo Okamura


Citations: 49

Abstract

Ethereum smart contracts are programs that run on the Ethereum blockchain, and many smart contract vulnerabilities have been discovered in the past decade. Many security analysis tools have been created to detect such vulnerabilities, but their performance decreases drastically when the code to be analyzed has been rewritten. In this paper, we propose Eth2Vec, a machine-learning-based static analysis tool for vulnerability detection in smart contracts. Eth2Vec is robust against code rewrites, i.e., it can detect vulnerabilities even in rewritten code. Existing machine-learning-based static analysis tools for vulnerability detection need features, which analysts create manually, as inputs. In contrast, Eth2Vec automatically learns features of vulnerable Ethereum Virtual Machine (EVM) bytecode, including tacit knowledge, through a neural network for natural language processing. Eth2Vec can therefore detect vulnerabilities in smart contracts by comparing the code similarity between target EVM bytecode and the EVM bytecode it has already learned. We conducted experiments with existing open databases, such as Etherscan, and our results show that Eth2Vec outperforms a recent model based on a support vector machine in terms of well-known metrics, i.e., precision, recall, and F1-score.
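
The similarity-based detection and the evaluation metrics mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the Eth2Vec implementation: the embedding vectors, the vulnerability labels, the 0.8 threshold, and the function names are all hypothetical, and cosine similarity is assumed here because the abstract does not name a specific similarity measure. In practice the vectors would come from a trained neural network rather than being written by hand.

```python
import math
from typing import List, Optional, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def detect(target: Vector,
           known: List[Tuple[str, Vector]],
           threshold: float = 0.8) -> Optional[str]:
    """Return the label of the most similar known vulnerable code,
    or None if no known vector is at least `threshold` similar.
    The threshold value is an assumption chosen for illustration."""
    label, score = max(
        ((lbl, cosine_similarity(target, vec)) for lbl, vec in known),
        key=lambda pair: pair[1],
    )
    return label if score >= threshold else None


def precision_recall_f1(y_true: Sequence[int],
                        y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Binary precision, recall, and F1-score (1 = vulnerable, 0 = not)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical embeddings of already-learned vulnerable bytecode.
KNOWN = [
    ("reentrancy", [1.0, 0.0, 0.0]),
    ("integer_overflow", [0.0, 1.0, 0.0]),
]

if __name__ == "__main__":
    print(detect([0.9, 0.1, 0.0], KNOWN))   # close to the first known vector
    print(detect([0.5, 0.5, 0.7], KNOWN))   # not close to any known vector
    print(precision_recall_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

The design point the sketch tries to convey is that detection reduces to a nearest-neighbor lookup in embedding space, so a rewritten contract can still be flagged as long as its learned representation stays close to a known vulnerable one.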

