Fingerprinting Bitcoin entities using money flow representation learning
Natkamon Tovanich, Rémy Cazabet
Applied Network Science, published 2023-09-15
DOI: https://doi.org/10.1007/s41109-023-00591-2
Impact factor: 1.3 (JCR Q3, Computer Science, Theory & Methods)
Citation count: 0
Abstract
Deanonymization is one of the major research challenges in the Bitcoin blockchain, as entities are pseudonymous and cannot be identified from the on-chain data. Various approaches exist to identify multiple addresses of the same entity, i.e., address clustering. However, these approaches are known to produce several clusters for the same actor. In this work, we propose to assign a fingerprint to entities based on the dynamic graph of the taint flow of money originating from them, with the idea that multiple clusters of addresses belonging to the same entity can be identified by their similar fingerprints. We experiment with different configurations to generate substructure patterns from taint flows before embedding them using representation learning models. To evaluate our method, we train classification models to identify entities from their fingerprints. Experiments show that our approach can accurately classify entities on three datasets. We compare different fingerprint strategies and show that including the temporality of transactions improves classification accuracy and that following the flow for too long impairs performance. Our work demonstrates that out-flow fingerprinting is a valid approach for recognizing multiple clusters of the same entity.
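To make the notion of a bounded, time-aware taint flow concrete, the following is a minimal sketch on a toy transfer list. It is not the authors' implementation: the abstract does not specify a taint policy, so the sketch assumes proportional splitting of tainted money across a node's later outgoing transfers. The function name `taint_flow`, the tuple layout, and the `max_hops` parameter are all hypothetical.

```python
from collections import defaultdict


def taint_flow(transfers, source, max_hops=2):
    """Follow money leaving `source` for at most `max_hops` steps.

    transfers: iterable of (sender, receiver, amount, time) tuples.
    Returns a list of (hop, sender, receiver, tainted_amount, time) records.
    Assumption: taint is split proportionally over a node's outgoing
    transfers that happen at or after the time the taint arrived.
    """
    outgoing = defaultdict(list)
    for sender, receiver, amount, time in transfers:
        outgoing[sender].append((receiver, amount, time))

    # frontier maps node -> (tainted amount held, earliest arrival time)
    source_total = sum(amount for _, amount, _ in outgoing[source])
    frontier = {source: (source_total, float("-inf"))}
    flow = []
    for hop in range(1, max_hops + 1):
        next_frontier = defaultdict(lambda: [0.0, float("inf")])
        for node, (tainted, arrived) in frontier.items():
            # Temporality: only transfers after the taint arrived can carry it.
            later = [(r, a, t) for r, a, t in outgoing[node] if t >= arrived]
            total = sum(a for _, a, _ in later)
            if total == 0:
                continue
            share = min(1.0, tainted / total)
            for receiver, amount, time in later:
                carried = amount * share
                flow.append((hop, node, receiver, carried, time))
                entry = next_frontier[receiver]
                entry[0] += carried
                entry[1] = min(entry[1], time)
        frontier = {n: (amt, t) for n, (amt, t) in next_frontier.items()}
    return flow


# Toy data (hypothetical): A is the entity being fingerprinted.
transfers = [
    ("A", "B", 10.0, 1),
    ("B", "E", 5.0, 0),   # happens before the taint reaches B, so it is skipped
    ("B", "C", 8.0, 2),
    ("B", "D", 12.0, 3),
]
flow = taint_flow(transfers, "A", max_hops=2)
```

The resulting records (hop, sender, receiver, tainted amount, time) are the kind of raw material from which substructure patterns could be extracted before embedding. Raising `max_hops` follows the money further from the entity, which corresponds to the flow-length setting the abstract reports as harmful when too large.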