2DynEthNet: A Two-Dimensional Streaming Framework for Ethereum Phishing Scam Detection

Impact Factor: 8 | Zone 1, Computer Science | JCR Q1, Computer Science, Theory & Methods | IEEE Transactions on Information Forensics and Security | Pub Date: 2024-10-21 | DOI: 10.1109/TIFS.2024.3484296

Jingjing Yang;Wenjia Yu;Jiajing Wu;Dan Lin;Zhiying Wu;Zibin Zheng

IEEE Transactions on Information Forensics and Security, vol. 19, pp. 9924-9937. Journal Article, published 2024-10-21. https://ieeexplore.ieee.org/document/10723803/

Citations: 0

Abstract

In recent years, phishing scams have emerged as one of the most serious crimes on Ethereum. Existing phishing scam detection methods typically model public transaction records on the blockchain as a graph and then identify phishing addresses through manual feature extraction or graph learning frameworks. However, these methods model the transactions within a period as a static network. As a result, they cannot capture fine-grained temporal dynamics, nor can they handle the large-scale, continuously growing transaction data on the Ethereum blockchain, which limits their scalability and efficiency. In this paper, we propose 2DynEthNet, a two-dimensional streaming framework for Ethereum phishing scam detection. First, we split the transaction series into 6 slices according to block numbers and treat each slice as a separate task. In the first dimension, within one task, we treat transaction features as edge features rather than node features, so that each transaction can be streamed into 2DynEthNet; the aim is to capture the evolution of the Ethereum transaction network at a fine-grained level in continuous time. In the second dimension, we adopt incremental training between tasks, using meta-learning to quickly update the model parameters on new slices, which improves the scalability of the model. Finally, experimental results on large-scale real Ethereum phishing scam datasets show that 2DynEthNet outperforms the state-of-the-art methods with 28.44% average Recall and achieves the most efficient training speed, supporting the effectiveness of both temporal edge representation and meta-learning. In addition, we provide a large-scale dynamic graph transaction dataset for Ethereum, ETGraph, which matches the data distribution of real transaction scenarios because unlabeled accounts are neither sampled nor filtered.
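The slicing and edge-streaming steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Tx` record, the `slice_by_block` and `stream_edges` helpers, and the choice of equal-width block ranges are all assumptions made for illustration, since the abstract does not state how the slice boundaries are chosen or which transaction features are used.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class Tx:
    """Hypothetical transaction record; field names are illustrative."""
    src: str      # sender address
    dst: str      # receiver address
    block: int    # block number containing the transaction
    value: float  # transferred amount, used here as the only edge feature


def slice_by_block(txs: List[Tx], n_slices: int = 6) -> List[List[Tx]]:
    """Partition transactions into n_slices contiguous block-number ranges.

    Assumption: ranges have equal width in block numbers. The paper may
    use different boundaries.
    """
    slices: List[List[Tx]] = [[] for _ in range(n_slices)]
    if not txs:
        return slices
    ordered = sorted(txs, key=lambda t: t.block)
    lo, hi = ordered[0].block, ordered[-1].block
    width = (hi - lo + 1) / n_slices
    for t in ordered:
        # Clamp to the last slice to guard against float rounding at the edge.
        idx = min(int((t.block - lo) / width), n_slices - 1)
        slices[idx].append(t)
    return slices


Edge = Tuple[str, str, int, Tuple[float, ...]]


def stream_edges(task: List[Tx]) -> Iterator[Edge]:
    """Yield one temporal edge per transaction, in block order.

    Transaction attributes stay on the edge (src, dst, time, features)
    instead of being aggregated into node features.
    """
    for t in task:
        yield (t.src, t.dst, t.block, (t.value,))


if __name__ == "__main__":
    demo = [Tx("0xa", "0xb", b, 1.0) for b in range(12)]
    for i, task in enumerate(slice_by_block(demo)):
        print(i, list(stream_edges(task)))
```

Each slice would then be handed to the model as one task, with the parameters learned on earlier slices serving as the starting point for the next; that meta-learning update is not sketched here because the abstract gives no detail on it.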




Source journal

IEEE Transactions on Information Forensics and Security (Engineering & Technology - Engineering: Electrical & Electronic)

CiteScore: 14.40
Self-citation rate: 7.40%
Articles published: 234
Review time: 6.5 months

Journal description: The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance, and systems applications that incorporate these features.