Software Supply Chain (SSC) security is a critical concern for both users and developers. Recent incidents, such as the SolarWinds Orion compromise, demonstrated the widespread impact that distributing compromised software can have. The reliance on open-source components, which constitute a significant portion of modern software, further exacerbates this risk. To enhance SSC security, the Software Bill of Materials (SBOM) has been promoted as a tool to increase transparency and verifiability in software composition. However, despite its promise, SBOMs are not without limitations. Current SBOM generation tools often suffer from inaccuracies in identifying components and dependencies, leading to erroneous or incomplete representations of the SSC. Although existing studies expose these limitations, their impact on the vulnerability detection capabilities of security tools is still unknown. In this paper, we perform the first security analysis of the vulnerability detection capabilities of tools that receive SBOMs as input. We comprehensively evaluate SBOM generation tools by providing their outputs to vulnerability identification software. Based on our results, we identify the root causes of these tools' ineffectiveness and propose PIP-sbom, a novel pip-inspired solution that addresses their shortcomings. PIP-sbom provides improved accuracy in component identification and dependency resolution. Compared to the best-performing state-of-the-art tools, PIP-sbom increases average precision and recall by 60% and reduces the number of false positives tenfold.
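The pip-metadata-driven approach that PIP-sbom's name alludes to can be illustrated with a short sketch: enumerate installed distributions through Python's packaging metadata and emit a CycloneDX-style component list with package URLs (purls), the identifiers that vulnerability scanners typically match against advisory databases. This is an illustration of the general idea, not the paper's implementation; `generate_sbom` and the field selection are assumptions.

```python
import json
from importlib import metadata

def generate_sbom():
    components = []
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if not name:
            continue
        components.append({
            "type": "library",
            "name": name,
            "version": dist.version,
            # purl (package URL): the identifier vulnerability scanners
            # typically match against advisory databases
            "purl": f"pkg:pypi/{name.lower()}@{dist.version}",
        })
    return {"bomFormat": "CycloneDX", "specVersion": "1.5",
            "components": sorted(components, key=lambda c: c["name"])}

if __name__ == "__main__":
    print(json.dumps(generate_sbom(), indent=2)[:400])
```

Note that this sketch only sees what is installed in the current environment; the accuracy problems the paper studies arise precisely when tools instead guess components from manifest files or file heuristics.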
"The Impact of SBOM Generators on Vulnerability Assessment in Python: A Comparison and a Novel Approach" by Giacomo Benedetti, Serena Cofano, Alessandro Brighente, Mauro Conti. arXiv:2409.06390, arXiv - CS - Cryptography and Security, 2024-09-10.
Adrian Brodzik, Tomasz Malec-Kruszyński, Wojciech Niewolski, Mikołaj Tkaczyk, Krzysztof Bocianiak, Sok-Yen Loui
Linux-based cloud environments have become lucrative targets for ransomware attacks, which employ various encryption schemes at unprecedented speeds. Addressing the urgency of real-time ransomware protection, we propose leveraging the extended Berkeley Packet Filter (eBPF) to collect system call information about active processes and to perform inference directly at the kernel level. In this study, we implement two Machine Learning (ML) models in eBPF: a decision tree and a multilayer perceptron. We benchmark their latency and accuracy against user-space counterparts, and our findings underscore the efficacy of this approach.
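To see why ML inference inside eBPF is feasible at all, note that a trained decision tree compiles down to branches on integer thresholds, with no floating point (which eBPF programs cannot use). A minimal sketch, with an illustrative tree and feature layout rather than the one trained in the paper:

```python
# Feature vector: syscall counts observed for a process in a time window,
# e.g. [n_open, n_read, n_write, n_rename, n_unlink]. All-integer logic
# like this translates directly into eBPF bytecode.
def classify(f):
    # A read-encrypt-overwrite loop shows up as heavy writes combined
    # with many renames/unlinks of the original files.
    if f[2] > 500:              # write count
        if f[3] + f[4] > 50:    # rename + unlink count
            return 1            # ransomware-like
        return 0
    return 0

assert classify([10, 900, 800, 60, 10]) == 1   # aggressive encryptor pattern
assert classify([10, 20, 5, 0, 0]) == 0        # idle process
```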
"Ransomware Detection Using Machine Learning in the Linux Kernel". arXiv:2409.06452, arXiv - CS - Cryptography and Security, 2024-09-10.
The rise of deep learning (DL) has led to a surging demand for training data, which incentivizes the creators of DL models to trawl through the Internet for training materials. Meanwhile, users often have limited control over whether their data (e.g., facial images) are used to train DL models without their consent, which has engendered pressing concerns. This work proposes MembershipTracker, a practical data provenance tool that can empower ordinary users to take agency in detecting the unauthorized use of their data in training DL models. We view tracing data provenance through the lens of membership inference (MI). MembershipTracker consists of a lightweight data marking component to mark the target data with small and targeted changes, which can be strongly memorized by the model trained on them; and a specialized MI-based verification process to audit whether the model exhibits strong memorization on the target samples. Overall, MembershipTracker only requires the users to mark a small fraction of data (0.005% to 0.1% in proportion to the training set), and it enables the users to reliably detect the unauthorized use of their data (average 0% FPR@100% TPR). We show that MembershipTracker is highly effective across various settings, including industry-scale training on the full-size ImageNet-1k dataset. We finally evaluate MembershipTracker under multiple classes of countermeasures.
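The verification step can be sketched as a loss-comparison audit: a model that strongly memorized the marked samples shows much lower loss on them than on fresh reference samples from the same distribution. The function names, perturbation form, and decision margin below are illustrative assumptions, not the paper's exact procedure:

```python
import random

def mark(sample, strength=0.05, seed=0):
    # "Data marking": a small, targeted, user-specific perturbation that a
    # training run can strongly memorize.
    rng = random.Random(seed)
    return [x + strength * rng.uniform(-1, 1) for x in sample]

def audit(loss_on_targets, loss_on_references, margin=0.5):
    # Flag unauthorized use when the model's average loss on marked targets
    # is far below its loss on fresh reference samples (memorization signal).
    avg_t = sum(loss_on_targets) / len(loss_on_targets)
    avg_r = sum(loss_on_references) / len(loss_on_references)
    return (avg_r - avg_t) > margin

# A model that memorized the marked data: near-zero target loss.
assert audit([0.01, 0.02, 0.01], [1.2, 0.9, 1.1]) is True
# A model never trained on the data: comparable losses, no flag.
assert audit([1.0, 1.1], [1.05, 0.95]) is False
```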
"Catch Me if You Can: Detecting Unauthorized Data Use in Deep Learning Models" by Zitao Chen, Karthik Pattabiraman. arXiv:2409.06280, arXiv - CS - Cryptography and Security, 2024-09-10.
Safeguarding the intellectual property of machine learning models has emerged as a pressing concern in AI security. Model watermarking is a powerful technique for protecting ownership of machine learning models, yet its reliability has recently been challenged by watermark removal attacks. In this work, we investigate why existing watermark embedding techniques, particularly those based on backdooring, are vulnerable. Through an information-theoretic analysis, we show that the resilience of watermarking against erasure attacks hinges on the choice of trigger-set samples, and that the current use of out-of-distribution trigger sets is inherently vulnerable to white-box adversaries. Based on this discovery, we propose a novel model watermarking scheme, In-distribution Watermark Embedding (IWE), to overcome the limitations of existing methods. To further minimize the gap to clean models, we analyze the role of logits as watermark information carriers and propose a new approach to better conceal watermark information within the logits. Experiments on real-world datasets including CIFAR-100 and Caltech-101 demonstrate that our method robustly defends against various adversaries with negligible accuracy loss (< 0.1%).
"On the Weaknesses of Backdoor-based Model Watermarking: An Information-theoretic Perspective" by Aoting Hu, Yanzhi Chen, Renjie Xie, Adrian Weller. arXiv:2409.06130, arXiv - CS - Cryptography and Security, 2024-09-10.
We introduce the notion of a conditional encryption scheme as an extension of public key encryption. In addition to the standard public key algorithms ($\mathsf{KG}$, $\mathsf{Enc}$, $\mathsf{Dec}$) for key generation, encryption and decryption, a conditional encryption scheme for a binary predicate $P$ adds a new conditional encryption algorithm $\mathsf{CEnc}$. The conditional encryption algorithm $c=\mathsf{CEnc}_{pk}(c_1,m_2,m_3)$ takes as input the public encryption key $pk$, a ciphertext $c_1 = \mathsf{Enc}_{pk}(m_1)$ for an unknown message $m_1$, a control message $m_2$ and a payload message $m_3$, and outputs a conditional ciphertext $c$. Intuitively, if $P(m_1,m_2)=1$ then the conditional ciphertext $c$ should decrypt to the payload message $m_3$. On the other hand, if $P(m_1,m_2) = 0$ then the ciphertext should not leak any information about the control message $m_2$ or the payload message $m_3$, even if the attacker already has the secret decryption key $sk$. We formalize the notion of conditional encryption secrecy and provide concretely efficient constructions for a set of predicates relevant to password typo correction. Our practical constructions utilize the Paillier partially homomorphic encryption scheme as well as Shamir Secret Sharing. We prove that our constructions are secure and demonstrate how to use conditional encryption to improve the security of personalized password typo correction systems such as TypTop. We implement a C++ library for our practically efficient conditional encryption schemes and evaluate the performance empirically. We also update the implementation of TypTop to utilize conditional encryption for enhanced security guarantees and evaluate the performance of the updated implementation.
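For the equality predicate, a folklore construction from Paillier's additive homomorphism gives exactly the intended behavior: $\mathsf{CEnc}$ outputs an encryption of $r(m_1-m_2)+m_3$ for a random blinding scalar $r$, which decrypts to $m_3$ when the messages match and to a blinded value otherwise. The sketch below uses toy parameters and is not claimed to be the paper's exact construction:

```python
import math, random

p, q = 1789, 1931                  # toy primes; real deployments use ~1024-bit
n, n2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)               # valid decryption helper because g = n + 1

def rand_unit():
    # Random element of Z_n^*
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            return r

def enc(m):
    # Paillier: Enc(m) = (n+1)^m * r^n mod n^2
    return (pow(n + 1, m, n2) * pow(rand_unit(), n, n2)) % n2

def dec(c):
    return ((pow(c, lam, n2) - 1) // n) * mu % n

def cenc_eq(c1, m2, m3):
    # Homomorphically form Enc(m1 - m2), blind it by a random scalar r,
    # then add the payload: the result encrypts r*(m1 - m2) + m3.
    diff = (c1 * enc(n - m2)) % n2
    return (pow(diff, rand_unit(), n2) * enc(m3)) % n2

c1 = enc(42)
assert dec(cenc_eq(c1, 42, 777)) == 777   # predicate holds: payload revealed
assert dec(cenc_eq(c1, 41, 777)) != 777   # predicate fails: blinded result
```

Note the party running `cenc_eq` never learns $m_1$; everything is computed on the ciphertext.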
"Conditional Encryption with Applications to Secure Personalized Password Typo Correction" by Mohammad Hassan Ameri, Jeremiah Blocki. arXiv:2409.06128, arXiv - CS - Cryptography and Security, 2024-09-10.
As statistical analyses become more central to science, industry, and society, there is a growing need to ensure the correctness of their results. Approximate correctness can be verified by replicating the entire analysis, but can we verify without replication? Building on a recent line of work, we study proof-systems that allow a probabilistic verifier to ascertain that the results of an analysis are approximately correct, while drawing fewer samples and using less computational resources than would be needed to replicate the analysis. We focus on distribution testing problems: verifying that an unknown distribution is close to having a claimed property. Our main contribution is an interactive protocol between a verifier and an untrusted prover, which can be used to verify any distribution property that can be decided in polynomial time given a full and explicit description of the distribution. If the distribution is at statistical distance $\varepsilon$ from having the property, then the verifier rejects with high probability. This soundness property holds against any polynomial-time strategy that a cheating prover might follow, assuming the existence of collision-resistant hash functions (a standard assumption in cryptography). For distributions over a domain of size $N$, the protocol consists of $4$ messages and the communication complexity and verifier runtime are roughly $\widetilde{O}\left(\sqrt{N} / \varepsilon^2 \right)$. The verifier's sample complexity is $\widetilde{O}\left(\sqrt{N} / \varepsilon^2 \right)$, and this is optimal up to $\mathrm{polylog}(N)$ factors (for any protocol, regardless of its communication complexity). Even for simple properties, approximately deciding whether an unknown distribution has the property can require quasi-linear sample complexity and running time. For any such property, our protocol provides a quadratic speedup over replicating the analysis.
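For contrast, the replication baseline the protocol is compared against amounts to estimating the distribution empirically and measuring its distance to the claim, which is what drives the quasi-linear sample cost. A toy sketch of that baseline for total variation (statistical) distance, illustrative only and not the interactive protocol:

```python
from collections import Counter
import random

def empirical_tv(samples, claimed):
    # claimed: dict mapping each domain element to its claimed probability.
    # TV distance = (1/2) * sum of absolute probability differences.
    counts = Counter(samples)
    m = len(samples)
    support = set(counts) | set(claimed)
    return 0.5 * sum(abs(counts[x] / m - claimed.get(x, 0.0)) for x in support)

random.seed(1)
uniform = {i: 0.25 for i in range(4)}                 # claimed distribution, N = 4
close = [random.randrange(4) for _ in range(20000)]   # genuinely uniform samples
far = [0] * 20000                                     # point mass, TV = 0.75
assert empirical_tv(close, uniform) < 0.05
assert empirical_tv(far, uniform) == 0.75
```

Getting a trustworthy estimate this way needs roughly linear-in-$N$ samples; the paper's verifier gets away with roughly $\sqrt{N}$ by outsourcing the heavy work to the untrusted prover.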
"How to Verify Any (Reasonable) Distribution Property: Computationally Sound Argument Systems for Distributions" by Tal Herman, Guy Rothblum. arXiv:2409.06594, arXiv - CS - Cryptography and Security, 2024-09-10.
Non-Fungible Tokens (NFTs) have emerged as a revolutionary method for managing digital assets, providing transparency and secure ownership records on a blockchain. In this paper, we present a theoretical framework for leveraging NFTs to manage UAV (Unmanned Aerial Vehicle) flight data. Our approach focuses on ensuring data integrity, ownership transfer, and secure data sharing among stakeholders. This framework utilizes cryptographic methods, smart contracts, and access control mechanisms to enable a tamper-proof and privacy-preserving management system for UAV flight data.
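The tamper evidence such a framework needs from the data layer can be sketched as a hash chain over flight-log records, with the head hash being the value one would anchor in the NFT's on-chain metadata. Field names and the JSON encoding are illustrative assumptions:

```python
import hashlib, json

def append_record(chain, record):
    # Each entry commits to the previous entry's hash, so altering any
    # record invalidates every later hash.
    prev = chain[-1]["hash"] if chain else "0" * 64
    body = json.dumps({"prev": prev, "record": record}, sort_keys=True)
    chain.append({"prev": prev, "record": record,
                  "hash": hashlib.sha256(body.encode()).hexdigest()})
    return chain

def verify_chain(chain):
    prev = "0" * 64
    for entry in chain:
        body = json.dumps({"prev": prev, "record": entry["record"]},
                          sort_keys=True)
        if entry["prev"] != prev or \
           entry["hash"] != hashlib.sha256(body.encode()).hexdigest():
            return False
        prev = entry["hash"]
    return True

log = []
append_record(log, {"t": 0, "lat": 52.1, "lon": 21.0, "alt": 80})
append_record(log, {"t": 1, "lat": 52.2, "lon": 21.1, "alt": 95})
assert verify_chain(log)
log[0]["record"]["alt"] = 500          # tampering breaks the chain
assert not verify_chain(log)
```

On-chain, only the final `log[-1]["hash"]` would need to live in the token's metadata; the bulky flight data can stay off-chain.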
"DroneXNFT: An NFT-Driven Framework for Secure Autonomous UAV Operations and Flight Data Management" by Khaoula Hidawi. arXiv:2409.06507, arXiv - CS - Cryptography and Security, 2024-09-10.
Jinhong Yu, Yi Chen, Di Tang, Xiaozhong Liu, XiaoFeng Wang, Chen Wu, Haixu Tang
Open source software (OSS) is integral to modern product development, and any vulnerability within it potentially compromises numerous products. While developers strive to apply security patches, pinpointing these patches among extensive OSS updates remains a challenge. Security patch localization (SPL) recommendation methods are the leading approaches to address this. However, existing SPL models often falter when a commit lacks a clear association with its corresponding CVE, and they do not consider scenarios in which a vulnerability has multiple patches proposed over time before it is fully resolved. To address these challenges, we introduce LLM-SPL, a recommendation-based SPL approach that leverages the capabilities of a Large Language Model (LLM) to locate the security patch commit for a given CVE. More specifically, we propose a joint learning framework in which the outputs of the LLM serve as additional features to aid our recommendation model in prioritizing security patches. Our evaluation on a dataset of 1,915 CVEs associated with 2,461 patches demonstrates that LLM-SPL excels in ranking patch commits, surpassing the state-of-the-art method in terms of Recall while significantly reducing manual effort. Notably, for vulnerabilities requiring multiple patches, LLM-SPL improves Recall by 22.83%, improves NDCG by 19.41%, and reduces manual effort by over 25% when checking up to the top 10 rankings. The dataset and source code are available at https://anonymous.4open.science/r/LLM-SPL-91F8.
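The use of LLM outputs as extra recommendation features can be sketched as follows; the feature names, weights, and `llm_score` interface are hypothetical stand-ins (LLM-SPL learns the combination jointly rather than fixing weights):

```python
def rank_commits(cve_id, commits, llm_score):
    # llm_score: callable (cve_id, commit) -> relevance in [0, 1],
    # standing in for the LLM component (hypothetical interface).
    def score(c):
        features = [
            c["msg_mentions_cve"],       # commit message references the CVE
            c["touches_vuln_file"],      # commit modifies the flagged file
            llm_score(cve_id, c),        # LLM output as an additional feature
        ]
        weights = [0.3, 0.2, 0.5]        # illustrative fixed weights
        return sum(w * f for w, f in zip(weights, features))
    return sorted(commits, key=score, reverse=True)

commits = [
    {"id": "a1", "msg_mentions_cve": 0, "touches_vuln_file": 1},
    {"id": "b2", "msg_mentions_cve": 1, "touches_vuln_file": 1},
]
ranked = rank_commits("CVE-2024-0001", commits,
                      llm_score=lambda cve, c: 0.9 if c["id"] == "b2" else 0.1)
assert [c["id"] for c in ranked] == ["b2", "a1"]
```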
"LLM-Enhanced Software Patch Localization". arXiv:2409.06816, arXiv - CS - Cryptography and Security, 2024-09-10.
The correct adoption of cryptography APIs is challenging for mainstream developers, often resulting in widespread API misuse. Meanwhile, cryptography misuse detectors have demonstrated inconsistent performance and remain largely inaccessible to most developers. We investigated the extent to which ChatGPT can detect cryptography misuses and compared its performance with that of state-of-the-art static analysis tools. Our investigation, mainly based on the CryptoAPI-Bench benchmark, demonstrated that ChatGPT is effective in identifying cryptography API misuses and, with the use of prompt engineering, can even outperform leading static cryptography misuse detectors.
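A concrete instance of the misuse class under study: ECB mode and hard-coded keys are canonical CryptoAPI-Bench cases. The checker below is a deliberately naive pattern matcher for illustration, far weaker than either ChatGPT or the static analyzers compared in the paper:

```python
import re

# Illustrative misuse signatures; real detectors use data-flow analysis,
# not regexes.
MISUSE_PATTERNS = {
    "ECB mode": re.compile(r"MODE_ECB|ECBBlockCipher|/ECB/"),
    "hard-coded key": re.compile(r"key\s*=\s*b?[\"'][0-9A-Za-z+/=]{8,}[\"']"),
}

def scan(source: str):
    return sorted(name for name, pat in MISUSE_PATTERNS.items()
                  if pat.search(source))

snippet = '''
key = b"0123456789abcdef"
cipher = AES.new(key, AES.MODE_ECB)
'''
assert scan(snippet) == ["ECB mode", "hard-coded key"]
```

The gap between such syntactic matching and true misuse detection (keys flowing from config files, modes chosen at runtime) is exactly where both LLMs and serious static analyzers have to earn their keep.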
"ChatGPT's Potential in Cryptography Misuse Detection: A Comparative Analysis with Static Analysis Tools" by Ehsan Firouzi, Mohammad Ghafari, Mike Ebrahimi. arXiv:2409.06561, arXiv - CS - Cryptography and Security, 2024-09-10.
Android apps that collect data from users must comply with legal frameworks to ensure data protection. This requirement has become even more important since the European Union's General Data Protection Regulation (GDPR) took effect in 2018. Moreover, with the proposed Cyber Resilience Act on the horizon, stakeholders will soon need to assess software against even more stringent security and privacy standards. Effective privacy assessments require groups with diverse expertise to collaborate as a cohesive unit. This paper motivates the need for an automated approach that enhances understanding of data protection in Android apps and improves communication between the various parties involved in privacy assessments. We propose the Assessor View, a tool designed to bridge the knowledge gap between these parties, facilitating more effective privacy assessments of Android applications.
"Advancing Android Privacy Assessments with Automation" by Mugdha Khedkar, Michael Schlichtig, Eric Bodden. arXiv:2409.06564, arXiv - CS - Cryptography and Security, 2024-09-10.