Preserving meaning of evidence from evolving systems
Pub Date: 2025-03-01 | DOI: 10.1016/j.fsidi.2025.301867
Hannes Spichiger, Frank Adelstein
Preservation is generally considered the step in the forensic process that stops evidence from decaying. In this paper, we argue that the traditional scope of preservation in digital forensic science, focused on the trace, is not sufficient to halt decay in the context of evolving systems. Instead, insufficiently preserved reference material may lead to a loss of meaning, resulting in an overall increase in the uncertainty of the presented evidence. We propose an expanded definition of Preservation and a definition of Reference Data, and we suggest future avenues of research into ways of preserving reference data so that the meaning of the trace data is not lost.
{"title":"Preserving meaning of evidence from evolving systems","authors":"Hannes Spichiger , Frank Adelstein","doi":"10.1016/j.fsidi.2025.301867","DOIUrl":"10.1016/j.fsidi.2025.301867","url":null,"abstract":"<div><div>Preservation is generally considered as the step in the forensic process that stops evidence from decaying. In this paper, we argue that the traditional scope of preservation in digital forensic science, focused on the trace, is not sufficient to ensure the stop of decay in the context of evolving systems. Instead, insufficiently preserved reference material may lead to the loss of meaning, resulting in an overall increase of uncertainty in the presented evidence. An expanded definition of Preservation and a definition of Reference Data are proposed. We present suggestions for future avenues of research of ways to preserve reference data in order to avoid a loss of meaning of the trace data.</div></div>","PeriodicalId":48481,"journal":{"name":"Forensic Science International-Digital Investigation","volume":"52 ","pages":"Article 301867"},"PeriodicalIF":2.0,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143679789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
PaSSw0rdVib3s!: AI-assisted password recognition for digital forensic investigations
Pub Date: 2025-03-01 | DOI: 10.1016/j.fsidi.2025.301870
Romke van Dijk, Judith van de Wetering, Ranieri Argentini, Leonie Gorka, Anne Fleur van Luenen, Sieds Minnema, Edwin Rijgersberg, Mattijs Ugen, Zoltán Ádám Mann, Zeno Geradts
In digital forensic investigations, the ability to identify passwords in cleartext within digital evidence is often essential for the acquisition of data from encrypted devices. Passwords may be stored in cleartext, knowingly or accidentally, in various locations within a device, e.g., in text messages, notes, or system log files. Finding those passwords is a challenging task, as devices typically contain a substantial amount and a wide variety of textual data. This paper explores the performance of several different types of machine learning models trained to distinguish passwords from non-passwords and to rank candidate strings according to their likelihood of being a human-generated password. Three deep learning models (PassGPT, CodeBERT and DistilBERT) were fine-tuned, and two traditional machine learning models (a feature-based XGBoost and a TF/IDF-based XGBoost) were trained. These were compared to the existing state of the art, a password recognition model based on probabilistic context-free grammars. Our research shows that the fine-tuned PassGPT model outperforms the other models. We show that a combination of multiple types of training datasets, carefully chosen based on the context, is needed to achieve good results. In particular, it is important to train not only on dictionary words and leaked credentials, but also on data scraped from chats and websites. Our approach was evaluated on realistic hardware that could fit inside an investigator's workstation. The evaluation was conducted not only on the publicly available RockYou and MyHeritage leaks but also on a dataset derived from real casework, showing that these innovations can indeed be used in a real forensic context.
{"title":"PaSSw0rdVib3s!: AI-assisted password recognition for digital forensic investigations","authors":"Romke van Dijk , Judith van de Wetering , Ranieri Argentini , Leonie Gorka , Anne Fleur van Luenen , Sieds Minnema , Edwin Rijgersberg , Mattijs Ugen , Zoltán Ádám Mann , Zeno Geradts","doi":"10.1016/j.fsidi.2025.301870","DOIUrl":"10.1016/j.fsidi.2025.301870","url":null,"abstract":"<div><div>In digital forensic investigations, the ability to identify passwords in cleartext within digital evidence is often essential for the acquisition of data from encrypted devices. Passwords may be stored in cleartext, knowingly or accidentally, in various locations within a device, e.g., in text messages, notes, or system log files. Finding those passwords is a challenging task, as devices typically contain a substantial amount and a wide variety of textual data. This paper explores the performance of several different types of machine learning models trained to distinguish passwords from non-passwords, and ranks them according to their likelihood of being a human-generated password. Three deep learning models (PassGPT, CodeBERT and DistilBERT) were fine-tuned, and two traditional machine learning models (a feature-based XGBoost and a TF/IDF-based XGBoost) were trained. These were compared to the existing state-of-the-art technology, a password recognition model based on probabilistic context-free grammars. Our research shows that the fine-tuned PassGPT model outperforms the other models. We show that the combination of multiple different types of training datasets, carefully chosen based on the context, is needed to achieve good results. In particular, it is important to train not only on dictionary words and leaked credentials, but also on data scraped from chats and websites. Our approach was evaluated with realistic hardware that could fit inside an investigator's workstation. The evaluation was conducted on the publicly available RockYou and MyHeritage leaks, but also on a dataset derived from real casework, showing that these innovations can indeed be used in a real forensic context.</div></div>","PeriodicalId":48481,"journal":{"name":"Forensic Science International-Digital Investigation","volume":"52 ","pages":"Article 301870"},"PeriodicalIF":2.0,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143679792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A metrics-based look at disk images: Insights and applications
Pub Date: 2025-03-01 | DOI: 10.1016/j.fsidi.2025.301874
Lena L. Voigt, Felix Freiling, Christopher Hargreaves
There is currently no systematic method for evaluating digital forensic datasets. This makes it difficult to judge their suitability for specific use cases in digital forensic education and training. Additionally, there is limited comparability of the quality of synthetic datasets or of the strengths and weaknesses of different data synthesis approaches. In this paper, we propose the concept of a quantitative, metrics-based assessment of forensic datasets as a first step toward a systematic evaluation approach. As a concrete implementation of this approach, we introduce Mass Disk Processor, a tool that automates the collection of metrics from large sets of disk images. It enables privacy-preserving retrieval of high-level disk image characteristics, facilitating the assessment not only of synthetic but also of real-world disk images. We demonstrate two applications of our tool. First, we create a comprehensive datasheet for publicly available, scenario-based synthetic disk images. Second, we propose a formal definition of synthetic data realism that compares properties of synthetic data to properties of real-world data, and we present results from an examination of the realism of current scenario-based disk images.
{"title":"A metrics-based look at disk images: Insights and applications","authors":"Lena L. Voigt , Felix Freiling , Christopher Hargreaves","doi":"10.1016/j.fsidi.2025.301874","DOIUrl":"10.1016/j.fsidi.2025.301874","url":null,"abstract":"<div><div>There is currently no systematic method for evaluating digital forensic datasets. This makes it difficult to judge their suitability for specific use cases in digital forensic education and training. Additionally, there is limited comparability in the quality of synthetic datasets or the strengths and weaknesses of different data synthesis approaches. In this paper, we propose the concept of a quantitative, metrics-based assessment of forensic datasets as a first step toward a systematic evaluation approach. As a concrete implementation of this approach, we introduce <em>Mass Disk Processor</em>, a tool that automates the collection of metrics from large sets of disk images. It enables a privacy-preserving retrieval of high-level disk image characteristics, facilitating the assessment of not only synthetic but also real-world disk images. We demonstrate two applications of our tool. First, we create a comprehensive datasheet for publicly available, scenario-based synthetic disk images. Second, we propose a formal definition of synthetic data realism that compares properties of synthetic data to properties of real-world data and present results from an examination of the realism of current scenario-based disk images.</div></div>","PeriodicalId":48481,"journal":{"name":"Forensic Science International-Digital Investigation","volume":"52 ","pages":"Article 301874"},"PeriodicalIF":2.0,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143679884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SOLVE-IT: A proposed digital forensic knowledge base inspired by MITRE ATT&CK
Pub Date: 2025-03-01 | DOI: 10.1016/j.fsidi.2025.301864
Christopher Hargreaves, Harm van Beek, Eoghan Casey
This work presents SOLVE-IT (Systematic Objective-based Listing of Various Established (Digital) Investigation Techniques), a digital forensics knowledge base inspired by the MITRE ATT&CK cybersecurity resource. Several applications of the knowledge base are demonstrated: strengthening tool testing by scoping error-focused datasets for a technique; reinforcing digital forensic techniques by cataloguing available mitigations for weaknesses (a systematic approach to performing Error Mitigation Analysis); bolstering quality assurance by identifying potential weaknesses in a specific digital forensic investigation or in standard processes; structuring the consideration of potential uses of AI in digital forensics; augmenting automation by highlighting relevant CASE ontology classes and identifying ontology gaps; and prioritizing innovation by identifying academic research opportunities. The paper provides the structure and a partial implementation of a knowledge base comprising 104 digital forensic techniques organised under 17 objectives, with detailed descriptions, errors, and mitigations provided for 33 of them. The knowledge base is hosted on an open platform (GitHub) so that crowdsourced contributions can evolve its contents. Tools are also provided to export the machine-readable back-end data into usable formats such as spreadsheets to support many applications, including systematic error mitigation and quality assurance documentation.
{"title":"SOLVE-IT: A proposed digital forensic knowledge base inspired by MITRE ATT&CK","authors":"Christopher Hargreaves , Harm van Beek , Eoghan Casey","doi":"10.1016/j.fsidi.2025.301864","DOIUrl":"10.1016/j.fsidi.2025.301864","url":null,"abstract":"<div><div>This work presents SOLVE-IT (Systematic Objective-based Listing of Various Established (Digital) Investigation Techniques), a digital forensics knowledge base inspired by the MITRE ATT&CK cybersecurity resource. Several applications of the knowledge-base are demonstrated: strengthening tool testing by scoping error-focused data sets for a technique, reinforcing digital forensic techniques by cataloguing available mitigations for weaknesses (a systematic approach to performing Error Mitigation Analysis), bolstering quality assurance by identifying potential weaknesses in a specific digital forensic investigation or standard processes, structured consideration of potential uses of AI in digital forensics, augmenting automation by highlighting relevant CASE ontology classes and identifying ontology gaps, and prioritizing innovation by identifying academic research opportunities. The paper provides the structure and partial implementation of a knowledge base that includes an organised set of 104 digital forensic techniques, organised over 17 objectives, with detailed descriptions, errors, and mitigations provided for 33 of them. The knowledge base is hosted on an open platform (GitHub) to allow crowdsourced contributions to evolve the contents. Tools are also provided to export the machine readable back-end data into usable formats such as spreadsheets to support many applications, including systematic error mitigation and quality assurance documentation.</div></div>","PeriodicalId":48481,"journal":{"name":"Forensic Science International-Digital Investigation","volume":"52 ","pages":"Article 301864"},"PeriodicalIF":2.0,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143679787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Samsung tracking tag application forensics in criminal investigations
Pub Date: 2025-03-01 | DOI: 10.1016/j.fsidi.2025.301875
Hongseok Yang, Sanghyug Han, Mindong Kim, Gibum Kim
With the advancement of Offline Finding Network (OFN) technology, tracking tags are being used in various fields, including locating elderly individuals with dementia, caring for children, and managing lost items. Recently, however, tracking tags have been misused for stalking, surveillance, and debt collection, highlighting the growing importance of digital forensics in proving criminal acts. While there has been some research on Apple AirTag and Tile products, studies focusing on Samsung's tracking tags have been lacking. Therefore, this paper proposes digital forensic techniques that allow law enforcement agencies to analyze Samsung tracking tag applications in order to identify perpetrators and substantiate criminal activities. We analyzed six tags and three applications, recognizing tag identifiers, and confirmed that location data is stored in both plaintext and encrypted forms within SQLite databases and XML files. Additionally, we conducted experiments on five anti-forensics scenarios: 1) deletion of a registered tracking tag, 2) deletion of location data, 3) account logout, 4) service withdrawal, and 5) application synchronization, finding meaningful results to substantiate criminal actions. Furthermore, we developed S.TASER (Smart Tag Parser), a Python-based tool that allows for the identification of deleted tags, recovery of identification data, and visualization of the collected location data per tag. S.TASER's code, experimental scenarios, and raw data are publicly available for further verification. This study aims to contribute to the global digital forensic industry by suggesting additional options for the investigation of, and evidence gathering for, crimes that make use of offline finding networks.
{"title":"Samsung tracking tag application forensics in criminal investigations","authors":"Hongseok Yang, Sanghyug Han, Mindong Kim, Gibum Kim","doi":"10.1016/j.fsidi.2025.301875","DOIUrl":"10.1016/j.fsidi.2025.301875","url":null,"abstract":"<div><div>With the advancement of offline Finding Network (OFN) technology, tracking tags are being utilized in various fields, including locating elderly individuals with dementia, caring for children, and managing lost items. Recently, however, tracking tags have been misused in stalking, surveillance, and debt collection, highlighting the growing importance of digital forensics in proving criminal acts. While there has been some research on Apple AirTag and Tile products, studies focusing on Samsung's tracking tag have been lacking. Therefore, this paper proposes digital forensic techniques for law enforcement agencies to analyze Samsung tracking tag applications to identify perpetrators and substantiate criminal activities. We analyzed six tags and three applications, recognizing tag identifiers, and confirmed that location data is stored in both plaintext and encrypted forms within SQLite databases and XML files. Additionally, we conducted experiments on five different anti-forensics scenarios: 1) deletion of a registered tracking tag, 2) deletion of location data, 3) account logout, 4) service withdrawal, and 5) application synchronization, finding meaningful results to substantiate criminal actions. Furthermore, we developed S.TASER (Smart Tag Parser) based on Python that allows for the identification of deleted tags, recovery of identification data, and visualization of collected location data per tag. S.TASER's code, experimental scenarios, and raw data are publicly available for further verification. This study aims to contribute to the global digital forensic industry by suggesting additional options for investigation and evidence gathering of crimes that make use of Network.</div></div>","PeriodicalId":48481,"journal":{"name":"Forensic Science International-Digital Investigation","volume":"52 ","pages":"Article 301875"},"PeriodicalIF":2.0,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143679885","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Beyond Hamming Distance: Exploring spatial encoding in perceptual hashes
Pub Date: 2025-03-01 | DOI: 10.1016/j.fsidi.2025.301878
Sean McKeown
Forensic analysts are often tasked with analysing large volumes of data in modern investigations, and frequently make use of hashing technologies to identify previously encountered images. Perceptual hashes, which seek to model the semantic (visual) content of images, are typically compared by way of Normalised Hamming Distance, which measures the proportion of bits that differ between two hashes. However, this global measure of difference may overlook structural information, such as the position and relative clustering of those differences. This paper investigates the relationship between localised/positional changes in an image and the extent to which this information is encoded in various perceptual hashes. Our findings indicate that the relative position of bits in the hash does encode useful information. Consequently, we prototype and evaluate three alternative perceptual hashing distance metrics: Normalised Convolution Distance, Hatched Matrix Distance, and 2-D Ngram Cosine Distance. Results demonstrate that there is room for improvement over Hamming Distance. In particular, the worst-case image mirroring transform for DCT-based hashes can be completely mitigated without changing the mechanism for generating the hash. Indeed, perceived hash weaknesses may actually be deficits in the distance metric being used, and large-scale providers could potentially benefit from modifying their approach.
{"title":"Beyond Hamming Distance: Exploring spatial encoding in perceptual hashes","authors":"Sean McKeown","doi":"10.1016/j.fsidi.2025.301878","DOIUrl":"10.1016/j.fsidi.2025.301878","url":null,"abstract":"<div><div>Forensic analysts are often tasked with analysing large volumes of data in modern investigations, and frequently make use of hashing technologies to identify previously encountered images. Perceptual hashes, which seek to model the semantic (visual) content of images, are typically compared by way of Normalised Hamming Distance, counting the ratio of bits which differ between two hashes. However, this global measure of difference may overlook structural information, such as the position and relative clustering of these differences. This paper investigates the relationship between localised/positional changes in an image and the extent to which this information is encoded in various perceptual hashes. Our findings indicate that the relative position of bits in the hash does encode useful information. Consequently, we prototype and evaluate three alternative perceptual hashing distance metrics: Normalised Convolution Distance, Hatched Matrix Distance, and 2-D Ngram Cosine Distance. Results demonstrate that there is room for improvement over Hamming Distance. In particular, the worst-case image mirroring transform for DCT-based hashes can be completely mitigated without needing to change the mechanism for generating the hash. Indeed, perceived hash weaknesses may actually be deficits in the distance metric being used, and large-scale providers could potentially benefit from modifying their approach.</div></div>","PeriodicalId":48481,"journal":{"name":"Forensic Science International-Digital Investigation","volume":"52 ","pages":"Article 301878"},"PeriodicalIF":2.0,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143679886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A study on the evolution of kernel data types used in memory forensics and their dependency on compilation options
Pub Date: 2025-03-01 | DOI: 10.1016/j.fsidi.2025.301863
Andrea Oliveri, Nikola Nemes, Branislav Andjelic, Davide Balzarotti
Over the years, memory forensics has emerged as a powerful analysis technique for uncovering security breaches that often evade detection. However, differences in the layouts used by operating systems to organize data in memory can undermine its effectiveness. To overcome this problem, forensic tools rely on specialized "maps", called profiles, that describe the location and layout of kernel data types in volatile memory for each OS. To avoid compromising the entire forensic analysis, it is crucial to select a profile that is tailored not only to the OS but also to its specific version.
In this work, for the first time, we conduct a longitudinal measurement study of the evolution of kernel data types across multiple kernel releases and its impact on memory forensics profiles. We analyze 2298 Linux, macOS, and Windows Volatility 3 profiles from 2007 to 2024 to investigate patterns in data type changes across OS releases, with a particular focus on types relevant to forensic analysis. This allows us to identify the fields most commonly affected by modifications and, consequently, the Volatility plugins most vulnerable to these changes. For cases where an exact profile is unavailable, we propose guidelines for choosing the most appropriate alternative profile to modify and use. Additionally, using a tool we developed, we analyze the source code of 77 Linux kernel versions to measure, for the first time, how the evolution of compile-time options influences kernel data types. Our findings show that even options unrelated to memory forensics can significantly alter data structure layouts and the derived profiles, offering crucial insights for forensic analysts navigating kernel configuration changes.
{"title":"A study on the evolution of kernel data types used in memory forensics and their dependency on compilation options","authors":"Andrea Oliveri , Nikola Nemes , Branislav Andjelic , Davide Balzarotti","doi":"10.1016/j.fsidi.2025.301863","DOIUrl":"10.1016/j.fsidi.2025.301863","url":null,"abstract":"<div><div>Over the years, memory forensics has emerged as a powerful analysis technique for uncovering security breaches that often evade detection. However, the differences in layouts used by the operating systems to organize data in memory can undermine its effectiveness. To overcome this problem, forensics tools rely on specialized “maps”, the profiles, that describe the location and layout of kernel data types in volatile memory for each different OS. To avoid compromising the entire forensics analysis, it is crucial to meticulously select the profile to use, which is also tailored to the specific version of the OS.</div><div>In this work, for the first time, we conduct a longitudinal measurement study on kernel data types evolution across multiple kernel releases and its impact on memory forensics profiles. We analyze 2298 Linux, macOS, and Windows Volatility 3 profiles from 2007 to 2024 to investigate patterns in data type changes across different OS releases, with a particular focus on types relevant to forensic analysis. This allowed the identification of fields commonly affected by modifications and, consequently, the Volatility plugins that are more vulnerable to these changes. In cases where an exact profile is unavailable, we propose guidelines for deciding on the most appropriate alternative profile to modify and use. Additionally, using a tool we developed, we analyze the source code of 77 Linux kernel versions to measure, for the first time, how the evolution of compile-time options influences kernel data types. Our findings show that even options unrelated to memory forensics can significantly alter data structure layouts and derived profiles, offering crucial insights for forensic analysts in navigating kernel configuration changes.</div></div>","PeriodicalId":48481,"journal":{"name":"Forensic Science International-Digital Investigation","volume":"52 ","pages":"Article 301863"},"PeriodicalIF":2.0,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143679786","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Video capturing device identification through block-based PRNU matching
Pub Date: 2025-03-01 | DOI: 10.1016/j.fsidi.2025.301873
Jian Li, Fei Wang, Bin Ma, Chunpeng Wang, Xiaoming Wu
This paper addresses the performance of a PRNU-based (photo response non-uniformity) scheme for identifying the device that captured a video. A common concern is that the PRNU in each frame may be misaligned because video stabilization compensates for unintended camera movements. We first derive the expectation of a similarity measure between two PRNUs: a reference and a test. This statistical analysis helps us understand the effect of homogeneous or heterogeneous misalignment of the PRNU on identification performance for video capturing devices. We observe that dividing a test PRNU into several blocks and then matching each block against a part of the reference PRNU can reduce the negative effect of video stabilization. We therefore design a block-based matching algorithm for identifying video capturing devices that improves identification efficiency, especially when only a limited number of test video frames is available. Extensive experimental results show that the proposed block-based matching algorithm outperforms prior art under the same test conditions.
{"title":"Video capturing device identification through block-based PRNU matching","authors":"Jian Li , Fei Wang , Bin Ma , Chunpeng Wang , Xiaoming Wu","doi":"10.1016/j.fsidi.2025.301873","DOIUrl":"10.1016/j.fsidi.2025.301873","url":null,"abstract":"<div><div>This paper addresses the performance of a PRNU-based (photo response non-uniformity) scheme to identify the capturing device of a video. A common concern is PRNU in each frame being misaligned due to the video stabilization process compensating for unintended camera movements. We first derive the expectation of a similarity measure between two PRNUs: a reference and a test. The statistical analysis of the similarity measure helps us to understand the effect of homogeneous or heterogeneous misalignment of PRNU on the performance of identification for video capturing devices. We notice that dividing a test PRNU into several blocks and then matching each block with a part of the reference PRNU can decrease the negative effect of video stabilization. Hence a block-based matching algorithm for identifying video capturing devices is designed to improve the identification efficiency, especially when only a limited number of test video frames is available. Extensive experimental results prove that the proposed block-based matching algorithm can outperform the prior arts under the same test conditions.</div></div>","PeriodicalId":48481,"journal":{"name":"Forensic Science International-Digital Investigation","volume":"52 ","pages":"Article 301873"},"PeriodicalIF":2.0,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143679883","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Unmixing the mix: Patterns and challenges in Bitcoin mixer investigations
Pub Date: 2025-03-01 | DOI: 10.1016/j.fsidi.2025.301876
Pascal Tippe, Christoph Deckers
This paper investigates the operational patterns and forensic traceability of Bitcoin mixing services, which pose significant challenges to anti-money laundering efforts. We analyze blockchain data using Neo4j to identify unique mixing patterns and potential deanonymization techniques. Our research includes a comprehensive survey of 20 currently available mixing services, examining their features such as input/output address policies, delay options, and security measures. We also analyze three legal cases from the U.S. involving Bitcoin mixers to understand investigative techniques used by law enforcement. We conduct two test transactions and use graph analysis to identify distinct transaction patterns associated with specific mixers, including peeling chains and multi-input transactions. We simulate scenarios where investigators have partial knowledge about transactions, demonstrating how this information can be leveraged to trace funds through mixers. Our findings reveal that while mixers significantly obfuscate transaction trails, certain patterns and behaviors can still be exploited for forensic analysis. We examine current investigative approaches for identifying users and operators of mixing services, primarily focusing on methods that associate addresses with entities and utilize off-chain attacks. Additionally, we discuss the limitations of our approach and propose potential improvements that can aid investigators in applying effective techniques. This research contributes to the growing field of cryptocurrency forensics by providing a comprehensive analysis of mixer operations and investigative techniques. Our insights can assist law enforcement agencies in developing more effective strategies to tackle the challenges posed by Bitcoin mixers in cybercrime investigations.
{"title":"Unmixing the mix: Patterns and challenges in Bitcoin mixer investigations","authors":"Pascal Tippe, Christoph Deckers","doi":"10.1016/j.fsidi.2025.301876","DOIUrl":"10.1016/j.fsidi.2025.301876","url":null,"abstract":"<div><div>This paper investigates the operational patterns and forensic traceability of Bitcoin mixing services, which pose significant challenges to anti-money laundering efforts. We analyze blockchain data using Neo4j to identify unique mixing patterns and potential deanonymization techniques. Our research includes a comprehensive survey of 20 currently available mixing services, examining their features such as input/output address policies, delay options, and security measures. We also analyze three legal cases from the U.S. involving Bitcoin mixers to understand investigative techniques used by law enforcement. We conduct two test transactions and use graph analysis to identify distinct transaction patterns associated with specific mixers, including peeling chains and multi-input transactions. We simulate scenarios where investigators have partial knowledge about transactions, demonstrating how this information can be leveraged to trace funds through mixers. Our findings reveal that while mixers significantly obfuscate transaction trails, certain patterns and behaviors can still be exploited for forensic analysis. We examine current investigative approaches for identifying users and operators of mixing services, primarily focusing on methods that associate addresses with entities and utilize off-chain attacks. Additionally, we discuss the limitations of our approach and propose potential improvements that can aid investigators in applying effective techniques. This research contributes to the growing field of cryptocurrency forensics by providing a comprehensive analysis of mixer operations and investigative techniques. Our insights can assist law enforcement agencies in developing more effective strategies to tackle the challenges posed by Bitcoin mixers in cybercrime investigations.</div></div>","PeriodicalId":48481,"journal":{"name":"Forensic Science International-Digital Investigation","volume":"52 ","pages":"Article 301876"},"PeriodicalIF":2.0,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143679889","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}