Eth2Vec: Learning Contract-Wide Code Representations for Vulnerability Detection on Ethereum Smart Contracts

Nami Ashizawa, Naoto Yanai, Jason Paul Cruz, Shingo Okamura
Published in: Proceedings of the 3rd ACM International Symposium on Blockchain and Secure Critical Infrastructure
Publication date: 2021-01-07
DOI: 10.1145/3457337.3457841 (https://doi.org/10.1145/3457337.3457841)
Citations: 49

Abstract

Ethereum smart contracts are programs that run on the Ethereum blockchain, and many smart contract vulnerabilities have been discovered in the past decade. Many security analysis tools have been created to detect such vulnerabilities, but their performance drops drastically when the code under analysis has been rewritten. In this paper, we propose Eth2Vec, a machine-learning-based static analysis tool for vulnerability detection in smart contracts. Eth2Vec is robust against code rewrites, i.e., it can detect vulnerabilities even in rewritten code. Existing machine-learning-based static analysis tools for vulnerability detection require features that analysts create manually as inputs. In contrast, Eth2Vec automatically learns features of vulnerable Ethereum Virtual Machine (EVM) bytecode with tacit knowledge through a neural network for natural language processing. Eth2Vec can therefore detect vulnerabilities in smart contracts by comparing the code similarity between the target EVM bytecode and the EVM bytecode it has already learned. We conducted experiments with existing open databases, such as Etherscan, and our results show that Eth2Vec outperforms a recent model based on a support vector machine in terms of well-known metrics, i.e., precision, recall, and F1-score.
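
The similarity-based detection idea described in the abstract can be illustrated with a minimal sketch. This is not the Eth2Vec model: the abstract says features are learned by a neural network, whereas the sketch below swaps in a hand-rolled stand-in that counts opcodes and adjacent opcode pairs. The opcode sequences, the labels, and the helper names (`embed`, `cosine`, `most_similar`) are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical labeled opcode sequences. A real pipeline would disassemble
# EVM bytecode; these short lists are invented for illustration.
KNOWN = {
    "call_before_store": ["SLOAD", "CALL", "SSTORE"],
    "store_before_call": ["SLOAD", "SSTORE", "CALL"],
}


def embed(opcodes):
    """Toy stand-in for a learned embedding: counts of single opcodes
    and of adjacent opcode pairs (so that ordering matters)."""
    feats = Counter(opcodes)
    feats.update(zip(opcodes, opcodes[1:]))
    return feats


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(target, known):
    """Return (label, score) of the known sequence closest to the target."""
    t = embed(target)
    scores = {label: cosine(t, embed(ops)) for label, ops in known.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


if __name__ == "__main__":
    # A rewritten variant: one extra opcode in front of a known pattern.
    target = ["CALLER", "SLOAD", "CALL", "SSTORE"]
    label, score = most_similar(target, KNOWN)
    print(label, round(score, 3))
```

In this toy setting, the target still scores closest to the pattern whose opcode order it shares, even though it is not an exact match. That is the intuition behind comparing code by similarity rather than by exact signatures; how well it holds in practice depends on the learned representation, which this sketch does not reproduce.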