Evading Anti-Phishing Models: A Field Note Documenting an Experience in the Machine Learning Security Evasion Competition 2022
Yang Gao, Benjamin Ampel, S. Samtani
Digital Threats: Research and Practice, 2023-06-15. DOI: https://doi.org/10.1145/3603507

Although machine learning-based anti-phishing detectors have shown promising results in phishing website detection, they remain vulnerable to evasion attacks. The Machine Learning Security Evasion Competition 2022 (MLSEC 2022) gave researchers and practitioners the opportunity to deploy evasion attacks against anti-phishing machine learning models in a real-world setting. In this field note, we share our experience participating in MLSEC 2022. We manipulated the source code of ten phishing HTML pages provided by the competition, using obfuscation techniques to evade the anti-phishing models. Our evasion attacks, which employed a benign overlap strategy, achieved third place in the competition with 46 of a possible 80 points. The results of our MLSEC 2022 performance can offer valuable insights for research seeking to robustify machine learning-based anti-phishing detectors.
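The abstract names two manipulations: blending benign content into the phishing page ("benign overlap") and obfuscating suspicious strings. The competition entries themselves are not reproduced here; the sketch below only illustrates the general idea under assumed, simplified transformations (the function names and the sample page are hypothetical, not the authors' code).

```python
import base64

def benign_overlap(phishing_html: str, benign_html: str) -> str:
    """Blend benign page content into a phishing page so that a
    feature-based detector sees mostly benign material. The benign
    text is wrapped in an invisible <div>, so the rendered page is
    unchanged while the raw HTML now overlaps a benign page."""
    hidden = '<div style="display:none">{}</div>'.format(benign_html)
    return phishing_html.replace("</body>", hidden + "</body>")

def obfuscate_keyword(html: str, keyword: str) -> str:
    """Replace a suspicious keyword with a script that decodes it at
    render time, hiding the literal string from static analysis."""
    encoded = base64.b64encode(keyword.encode()).decode()
    script = '<script>document.write(atob("{}"));</script>'.format(encoded)
    return html.replace(keyword, script)

page = "<html><body><p>Verify your password now</p></body></html>"
page = obfuscate_keyword(page, "password")
page = benign_overlap(page, "<p>Welcome to our weather service.</p>")
```

After both transformations the literal word "password" no longer appears in the HTML, while a browser would still render it; a static lexical detector sees the injected benign text instead.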
Asm2Seq: Explainable Assembly Code Functional Summary Generation for Reverse Engineering and Vulnerability Analysis
Scarlett Taviss, Steven H. H. Ding, Mohammad Zulkernine, P. Charland, Sudipta Acharya
Digital Threats: Research and Practice, 2023-05-18. DOI: https://doi.org/10.1145/3592623

Reverse engineering is the process of understanding the inner workings of a software system without access to its source code. It is critical for firmware security validation, software vulnerability research, and malware analysis, but it often requires substantial manual effort. Recently, data-driven solutions have been proposed to reduce this effort by identifying code clones at the assembly or source level. However, security analysts must still understand the matched assembly or source code to develop an understanding of the functionality, and such approaches assume that a matched candidate always exists. This research bridges the gap by introducing the problem of assembly code summarization. Given assembly code as input, we propose a machine learning-based system that produces human-readable summaries of its functionality in the context of code vulnerability analysis. We generate the first assembly-code-to-function-summary dataset and propose to leverage the encoder-decoder architecture. With the attention mechanism, it is possible to understand which aspects of the assembly code had the largest impact on generating the summary. Our experiments show that the proposed solution achieves high accuracy and a high Bilingual Evaluation Understudy (BLEU) score. Finally, we performed case studies on real-life CVE vulnerability cases to better understand the proposed method's performance and practical implications.
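Generated summaries are scored with BLEU, which the abstract mentions but does not define. As a reminder of what that metric measures, here is a minimal sentence-level BLEU in plain Python (geometric mean of smoothed modified n-gram precisions times a brevity penalty); it is a textbook sketch, not the paper's evaluation code, and the example sentences are invented.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU: geometric mean of modified n-gram
    precisions, scaled by a brevity penalty. Add-one smoothing keeps
    the score nonzero when a higher-order n-gram has no match."""
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum((cand & ref).values())   # clipped n-gram matches
        total = max(sum(cand.values()), 1)
        precisions.append((overlap + 1) / (total + 1))  # smoothed
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    bp = min(1.0, math.exp(1 - len(reference) / max(len(candidate), 1)))
    return bp * geo_mean

cand = "copies the source buffer without checking its length".split()
ref = "copies the source buffer without checking the length".split()
score = bleu(cand, ref)
```

A candidate that differs from the reference by one token scores below 1.0 but well above 0, which is why BLEU is a useful graded signal for summary quality.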
Improving Automated Labeling for ATT&CK Tactics in Malware Threat Reports
Eva Domschot, Ramyaa Ramyaa, Michael R. Smith
Digital Threats: Research and Practice, 2023-05-17. DOI: https://doi.org/10.1145/3594553

Once novel malware is detected, threat reports are written by the security companies that discover it. These reports often vary in the terminology used to describe the malware's behavior, making it difficult to compare reports of the same malware from different companies. To aid the automated discovery of novel malware, it was recently proposed that novel malware could be detected by identifying behaviors, under the assumption that a core set of behaviors is present in most, if not all, malware variants. However, there is a lack of malware datasets labeled with behaviors. Motivated by the need to label malware with a common set of behaviors, this work examines automating the process of labeling malware with the behaviors identified in threat reports, despite the variability of terminology. To do so, we examine several techniques from the natural language processing (NLP) domain. We find that most state-of-the-art word embedding methods require large amounts of data and are trained on generic text corpora, missing the nuances of information security. To address this, we use simple feature selection techniques. We find that simple feature selection techniques generally outperform the word embedding methods and achieve a 6% increase in F0.5-score over prior work when predicting MITRE ATT&CK tactics in threat reports. Our work indicates that feature selection, which is commonly overlooked in favor of sophisticated NLP methods, is beneficial for information security tasks, where more sophisticated NLP methodologies fail to pick out relevant security terms.
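The abstract does not specify which feature selection statistic the authors used, so the following is only a generic illustration of the approach: scoring terms by a chi-squared statistic for association with a tactic label, over a tiny invented corpus of report sentences (the data and label names are hypothetical).

```python
def chi2_scores(docs, labels, target):
    """Score each term by a one-degree-of-freedom chi-squared statistic
    measuring its association with the target label. Each doc is a set
    of terms; high-scoring terms are strong (dis)criminators."""
    n = len(docs)
    vocab = {t for doc in docs for t in doc}
    scores = {}
    for term in vocab:
        a = sum(1 for doc, l in zip(docs, labels) if term in doc and l == target)
        b = sum(1 for doc, l in zip(docs, labels) if term in doc and l != target)
        c = sum(1 for doc, l in zip(docs, labels) if term not in doc and l == target)
        d = n - a - b - c
        num = n * (a * d - b * c) ** 2
        den = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = num / den if den else 0.0
    return scores

# Hypothetical mini-corpus of report sentences labeled with a tactic.
docs = [{"scheduled", "task", "persists"},
        {"registry", "run", "key"},
        {"exfiltrates", "data", "over", "dns"},
        {"sends", "data", "to", "c2"}]
labels = ["persistence", "persistence", "exfiltration", "exfiltration"]
scores = chi2_scores(docs, labels, "persistence")
```

Terms that split cleanly along the label boundary (here, "data") receive the highest scores, which is the property that lets a small labeled corpus surface security-specific vocabulary that generic embeddings miss.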
Oleg Boyarchuk, Sebastiano Mariani, Stefano Ortolani, G. Vigna
Throughout its eight-year history, Emotet has caused substantial damage. This threat reappeared at the beginning of 2022 following a take-down by law enforcement in November 2021. Emotet is arguably one of the most notorious advanced persistent threats, causing substantial damage during its earlier phases and continuing to pose a danger to organizations everywhere. In this article, we present a longitudinal study of several waves of Emotet-based attacks that we observed in VMware’s customer telemetry. By analyzing Emotet’s software development life cycle, we were able to dissect how it quickly changes its command and control (C2) infrastructure, obfuscates its configuration, adapts and tests its evasive execution chains, deploys different attack vectors at different stages, laterally propagates, and continues to evolve using numerous tactics and techniques.
{"title":"Keeping Up with the Emotets: Tracking a Multi-infrastructure Botnet","authors":"Oleg Boyarchuk, Sebastiano Mariani, Stefano Ortolani, G. Vigna","doi":"10.1145/3594554","DOIUrl":"https://doi.org/10.1145/3594554","url":null,"abstract":"Throughout its eight-year history, Emotet has caused substantial damage. This threat reappeared at the beginning of 2022 following a take-down by law enforcement in November 2021. Emotet is arguably one of the most notorious advanced persistent threats, causing substantial damage during its earlier phases and continuing to pose a danger to organizations everywhere. In this article, we present a longitudinal study of several waves of Emotet-based attacks that we observed in VMware’s customer telemetry. By analyzing Emotet’s software development life cycle, we were able to dissect how it quickly changes its command and control (C2) infrastructure, obfuscates its configuration, adapts and tests its evasive execution chains, deploys different attack vectors at different stages, laterally propagates, and continues to evolve using numerous tactics and techniques.","PeriodicalId":202552,"journal":{"name":"Digital Threats: Research and Practice","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130468258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
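Tracking how quickly a botnet rotates its C2 infrastructure, as the article does across attack waves, amounts to diffing the server lists extracted from successive configuration dumps. A minimal sketch of that comparison follows; the addresses are RFC 5737 documentation examples, not real Emotet infrastructure, and the churn metric is an illustrative choice rather than the article's.

```python
def c2_churn(old_wave: set, new_wave: set) -> dict:
    """Compare the C2 server lists extracted from two configuration
    dumps and quantify how much of the infrastructure was rotated."""
    added, dropped = new_wave - old_wave, old_wave - new_wave
    return {
        "added": sorted(added),
        "dropped": sorted(dropped),
        "kept": sorted(old_wave & new_wave),
        # Fraction of all observed servers that changed between waves.
        "churn": len(added | dropped) / len(old_wave | new_wave),
    }

# Hypothetical C2 lists from two observed waves.
wave1 = {"192.0.2.10", "192.0.2.11", "198.51.100.5"}
wave2 = {"192.0.2.11", "203.0.113.7", "203.0.113.8"}
report = c2_churn(wave1, wave2)
```

A churn value near 1.0 over a short interval is the kind of signal that indicates aggressive infrastructure rotation between waves.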
Zero Trust Architecture: Risk Discussion
Alan Levine, B. Tucker
Digital Threats: Research and Practice, 2023-03-31. DOI: https://doi.org/10.1145/3573892

Implemented well, Zero Trust Architecture (ZTA) promises to mitigate cyber risk for organizations of all sizes, risk postures, and cybersecurity maturity states. However, ZTA development, deployment, and operation present challenges that may hinder full adoption and sustained effectiveness, and may create new risk. Cyber risk should be evaluated by organizations as they make their decision for or against ZTA. Then, as organizations work toward…
Towards a Greater Understanding of Coordinated Vulnerability Disclosure Policy Documents
T. Walshe, Andrew C. Simpson
Digital Threats: Research and Practice, 2023-03-23. DOI: https://doi.org/10.1145/3586180

Bug bounty programmes and vulnerability disclosure programmes, collectively referred to as Coordinated Vulnerability Disclosure (CVD) programmes, open up an organisation's assets to the inquisitive gaze of (often eager) white-hat hackers. Motivated by the question "What information do organisations convey to hackers through public CVD policy documents?", we aim to better understand the information available to hackers wishing to participate in the search for vulnerabilities. We consider three key issues. First, to address the differences in the legal language communicated to hackers, it is necessary to understand the formal constraints by which hackers must abide. Second, it is beneficial to understand the variation in the informal constraints communicated to hackers through a variety of institutional elements. Third, for organisations wishing to better understand the commonplace elements of current policy documents, we offer a broad analysis of the components frequently included therein and identify gaps in programme policies. We report the results of a quantitative study, leveraging deep learning-based natural language processing models, that provides insights into the policy documents accompanying the CVD programmes of thousands of organisations, covering both stand-alone programmes and those hosted on 13 bug bounty platforms. We found that organisations often inadequately convey the formal constraints applicable to hackers, requiring hackers to have a deep understanding of the laws that underpin safe and legal security research. Furthermore, a lack of standardisation across similar policy components is prevalent and may lead to a decreased understanding of the informal constraints placed upon hackers when searching for and disclosing vulnerabilities. Analysis of the institutional elements included in the policy documents reveals insufficient inclusion of many key components: legal information and information pertaining to restrictions on the backgrounds of hackers are absent from a majority of the policies analysed. Finally, to assist ongoing research, we provide novel annotated policy datasets that include human-labelled annotations at both the sentence and paragraph level, covering a broad range of CVD programme backgrounds.
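The study labels policy sentences with institutional components using deep NLP models. As a much simpler stand-in that conveys the task, the sketch below tags sentences by keyword lookup; the component names, keyword lists, and example sentences are all invented for illustration and bear no relation to the paper's taxonomy or models.

```python
# Hypothetical component keywords -- a simplistic stand-in for a
# trained sentence classifier over CVD policy text.
COMPONENT_KEYWORDS = {
    "safe_harbor": {"legal action", "good faith", "authorized"},
    "scope": {"in scope", "out of scope", "domains"},
    "reward": {"bounty", "reward", "payment"},
}

def label_sentence(sentence: str) -> list:
    """Return the policy components whose keywords occur in the sentence."""
    text = sentence.lower()
    return sorted(component for component, keywords in COMPONENT_KEYWORDS.items()
                  if any(k in text for k in keywords))

policy = [
    "Research conducted in good faith will not lead to legal action.",
    "Only the domains listed below are in scope.",
    "We do not operate a paid bounty programme.",
]
labels = [label_sentence(s) for s in policy]
```

Even this crude baseline shows why sentence-level annotation is the natural granularity: each component tends to live in its own sentence of the policy document.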
ANDROIDGYNY: Reviewing clustering techniques for Android malware family classification
Thalita Scharr Rodrigues Pimenta, Fabrício Ceschin, A. Grégio
Digital Threats: Research and Practice, 2023-03-14. DOI: https://doi.org/10.1145/3587471

Thousands of malicious applications (apps) are created daily, modified with the aid of automation tools, and released on the World Wide Web. Several techniques have been applied over the years to identify whether an APK is malicious or not. These techniques aim to identify unknown malware mainly by calculating the similarity of a sample to previously grouped, already known families of malicious apps. High accuracy rates would thus enable several countermeasures: from quicker detection to the development of vaccines and aid in the reverse engineering of new variants. However, most of the literature consists of limited experiments, either short-term and offline or based exclusively on well-known families of malicious apps. In this paper, we explore the use of malware phylogeny, a term borrowed from biology that refers to the genealogical study of the relationships between elements and families. We also survey the literature on clustering techniques applied to mobile malware classification and discuss how researchers have been setting up their experiments.
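Family classification by similarity to known groups, as described above, can be sketched with set similarity over extracted app features. The following toy example uses Jaccard similarity on permission sets with greedy single-linkage grouping; the threshold, feature choice, and sample names are illustrative assumptions, not any surveyed technique in particular.

```python
def jaccard(a: set, b: set) -> float:
    """Similarity between two apps' feature sets (e.g., permissions)."""
    return len(a & b) / len(a | b) if a or b else 1.0

def cluster(apps: dict, threshold: float = 0.5) -> list:
    """Greedy single-linkage grouping: an app joins the first family
    containing a member at least `threshold`-similar to it."""
    families = []
    for name, feats in apps.items():
        for family in families:
            if any(jaccard(feats, apps[member]) >= threshold for member in family):
                family.append(name)
                break
        else:
            families.append([name])
    return families

# Hypothetical permission profiles for four APKs.
apps = {
    "sample_a": {"SEND_SMS", "READ_CONTACTS", "INTERNET"},
    "sample_b": {"SEND_SMS", "READ_CONTACTS", "INTERNET", "WAKE_LOCK"},
    "sample_c": {"CAMERA", "RECORD_AUDIO"},
    "sample_d": {"CAMERA", "RECORD_AUDIO", "INTERNET"},
}
families = cluster(apps)
```

The two SMS-stealer-like samples group together and the two spyware-like samples group together, illustrating how similarity to an existing family can label an unknown variant.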
Introduction to the Special Issue on the Digital Threats of Hardware Security
Aydin Aysu, S. Graham
Digital Threats: Research and Practice, 2023-03-07. DOI: https://doi.org/10.1145/3585011

Digital threats to computing and network security continue their relentless advance, with malware growing in sophistication and frequently targeting the lower levels of the computing stack. Recently, these threats have evolved and started to target hardware vulnerabilities. Such attacks are hard to detect at higher abstraction levels and even harder to mitigate, given the challenges of changing the hardware infrastructure. To be effective, defensive measures must also consider the physical effects of computing at the lower levels of the hardware stack.
Breaking Captcha System with Minimal Exertion through Deep Learning: Real-time Risk Assessment on Indian Government Websites
Rajat Subhra Bhowmick, Rahul Indra, Isha Ganguli, Jayanta Paul, J. Sil
Digital Threats: Research and Practice, 2023-02-23. DOI: https://doi.org/10.1145/3584974

Captchas are used to prevent computer bots from launching spam attacks and from automatically extracting data available on websites. Government websites mostly contain sensitive data related to the citizens and assets of the country, so vulnerability in their captcha systems raises a major security challenge. The proposed work focuses on the real-time captcha systems used by government websites of India and identifies their risk levels. To effectively analyze captcha security, we approach the problem from an attacker's perspective, for whom building an effective solver from scratch, with limited feature-engineering knowledge of text and image processing, is a challenge. Neural network models are useful for automated feature extraction, and a simple model can be trained with a minimal number of manually annotated real captchas. Along with popular text captchas, government websites of India use text instruction-based captchas. We analyze an effective neural network pipeline for solving text captchas. Text instruction captchas are relatively new, and this work provides novel end-to-end neural network architectures for breaking different types of them. The proposed models achieve more than 80% accuracy and have a maximum inference time of 1.063 seconds on a desktop GPU. The study proposes an ecosystem and procedure for rating the overall risk of a captcha system used on a website. We observe that, given the importance of the information available on these government websites, the effort required for an attacker to solve their captcha systems is alarmingly low.
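The study's risk-rating procedure combines how easily a site's captcha is solved with how sensitive the protected data is. The sketch below shows one way such a rating could be composed; the weights, the 0-1 inputs, and the bucket cutoffs are invented for illustration and do not reproduce the article's actual procedure.

```python
def captcha_risk(solver_accuracy: float, seconds_per_solve: float,
                 data_sensitivity: float) -> str:
    """Combine solver accuracy, solving speed, and data sensitivity
    (each roughly on a 0-1 scale) into a coarse risk bucket.
    Weights and cutoffs are illustrative assumptions."""
    speed = 1.0 / (1.0 + seconds_per_solve)  # faster solves -> higher risk
    score = 0.5 * solver_accuracy + 0.2 * speed + 0.3 * data_sensitivity
    if score >= 0.7:
        return "high"
    if score >= 0.4:
        return "medium"
    return "low"

# A solver like the one described: >80% accuracy, ~1.063 s per captcha,
# protecting highly sensitive citizen data.
rating = captcha_risk(solver_accuracy=0.8, seconds_per_solve=1.063,
                      data_sensitivity=0.9)
```

With an accurate, fast solver guarding highly sensitive data, any reasonable weighting lands in the top bucket, which matches the study's alarming conclusion.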
OATs’inside: Retrieving Object Behaviors From Native-based Obfuscated Android Applications
Pierre Graux, Jean-François Lalande, Valérie Viet Triem Tong, Pierre Wilke
Digital Threats: Research and Practice, 2023-02-22. DOI: https://doi.org/10.1145/3584975

Analyzing Android applications is essential for reviewing proprietary code and understanding malware behaviors. However, Android applications use obfuscation techniques to slow down this process, and these techniques are increasingly based on native code. In this article, we propose OATs’inside, a new analysis tool that focuses on high-level behaviors to transparently circumvent native obfuscation techniques. The targeted high-level behaviors are object-level behaviors, i.e., actions performed on Java objects (e.g., field accesses, method calls), regardless of whether these actions are performed from Java or native code. Our system uses a hybrid approach based on dynamic monitoring and trace-based symbolic execution to output a control flow graph (CFG) for each method of the analyzed application. CFGs are composed of Java-like actions enriched with condition expressions and dataflows between actions, giving an understandable representation of any code, even code that is fully native. OATs’inside spares users the need to dive into low-level instructions, which are difficult to reverse engineer. We extensively compare the functionality of OATs’inside against state-of-the-art tools to highlight its benefits when observing native operations. Our experiments are conducted on a real smartphone: we discuss the performance impact of OATs’inside and demonstrate its practical use on applications containing anti-debugging techniques provided by the OWASP foundation. We also evaluate the robustness of OATs’inside on unit tests obfuscated with the Tigress obfuscator.