Due to the vast testing space, the increasing demand for effective and efficient testing of deep neural networks (DNNs) has led to the development of various DNN test case prioritization techniques. However, the fact that DNNs can deliver high-confidence predictions for incorrectly predicted examples, known as the over-confidence problem, causes these methods to fail to reveal high-confidence errors. To address this limitation, in this work we propose FAST, a method that boosts existing prioritization methods through guided FeAture SelecTion. FAST is based on the insight that certain features may introduce noise that affects the model's output confidence, thereby contributing to high-confidence errors. It quantifies the importance of each feature for the model's correct predictions, and then dynamically prunes the information from the noisy features during inference to derive a new probability vector for uncertainty estimation. With the help of FAST, high-confidence errors and correctly classified examples become more distinguishable, resulting in higher APFD (Average Percentage of Fault Detection) values for test prioritization and higher generalization ability for model enhancement. We conduct extensive experiments across a diverse set of model structures on multiple benchmark datasets to validate the effectiveness, efficiency, and scalability of FAST compared with state-of-the-art prioritization techniques.
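The prune-and-rescore idea behind FAST can be sketched in a few lines. The following toy Python example is illustrative only, not FAST's actual implementation: the linear "model", the importance measure (average confidence drop in the correct class when a feature is masked), and the pruning quantile are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a classifier: linear logits over 8 features, 3 classes.
n_features, n_classes = 8, 3
W = rng.normal(size=(n_features, n_classes))

def predict_proba(x, mask=None):
    """Softmax probabilities; `mask` zeroes out pruned (noisy) features."""
    if mask is not None:
        x = x * mask
    logits = x @ W
    e = np.exp(logits - logits.max())
    return e / e.sum()

def feature_importance(X_val, y_val):
    """Score each feature by the average confidence drop in the correct
    class when that feature alone is masked (illustrative measure)."""
    scores = np.zeros(n_features)
    for x, y in zip(X_val, y_val):
        base = predict_proba(x)[y]
        for f in range(n_features):
            mask = np.ones(n_features)
            mask[f] = 0.0
            scores[f] += base - predict_proba(x, mask)[y]
    return scores / len(X_val)

def prioritize(X_test, keep_mask):
    """Rank test inputs most-uncertain-first using 1 - max probability."""
    uncertainty = np.array([1.0 - predict_proba(x, keep_mask).max()
                            for x in X_test])
    return np.argsort(-uncertainty)

X_val = rng.normal(size=(20, n_features))
y_val = rng.integers(0, n_classes, size=20)
importance = feature_importance(X_val, y_val)
# Prune (mask) the quarter of features that help correct predictions least.
keep_mask = (importance >= np.quantile(importance, 0.25)).astype(float)
order = prioritize(rng.normal(size=(10, n_features)), keep_mask)
```

The resulting `order` would feed directly into an APFD computation: faults found early in the ranking yield a higher score.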
"FAST: Boosting Uncertainty-based Test Prioritization Methods for Neural Networks via Feature Selection" by Jialuo Chen, Jingyi Wang, Xiyue Zhang, Youcheng Sun, Marta Kwiatkowska, Jiming Chen, Peng Cheng. arXiv - CS - Software Engineering, published 2024-09-13. DOI: https://doi.org/arxiv-2409.09130
Altaf Allah Abbassi, Houssem Ben Braiek, Foutse Khomh, Thomas Reid
The industry increasingly relies on deep learning (DL) technology for manufacturing inspections, which are challenging to automate with rule-based machine vision algorithms. DL-powered inspection systems derive defect patterns from labeled images, combining human-like agility with the consistency of a computerized system. However, finite labeled datasets often fail to encompass all natural variations, necessitating Continuous Training (CT) to regularly adjust models with recent data. Effective CT requires fresh labeled samples from the original distribution; otherwise, self-generated labels can lead to silent performance degradation. To mitigate this risk, we develop a robust CT-based maintenance approach that updates DL models using reliable data selections through a two-stage filtering process. The initial stage filters out low-confidence predictions, as the model inherently discredits them. The second stage uses variational auto-encoders and histograms to generate image embeddings that capture latent and pixel characteristics, then rejects inputs with substantially shifted embeddings as drifted data with erroneous over-confidence. A fine-tuning of the original DL model is then executed on the filtered inputs while validating on a mixture of recent production and original datasets. This strategy mitigates catastrophic forgetting and ensures the model adapts effectively to new operational conditions. Evaluations on industrial inspection systems for popsicle stick prints and glass bottles, using critical real-world datasets, showed that less than 9% of erroneous self-labeled data is retained after filtering and used for fine-tuning, improving model performance on production data by up to 14% without compromising results on the original validation data.
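The two-stage filter can be sketched as follows. This is a minimal, hypothetical Python version: the paper's VAE/histogram embedding shift test is replaced by a simple per-dimension z-score against a reference distribution, and all thresholds and names are assumptions of the sketch, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

def two_stage_filter(probs, embeds, ref_embeds, conf_thresh=0.9, z_thresh=3.0):
    """Stage 1: drop low-confidence predictions.
    Stage 2: drop inputs whose embedding lies far from the reference
    distribution (z-score stand-in for the VAE/histogram shift test)."""
    keep = probs.max(axis=1) >= conf_thresh                 # stage 1
    mu = ref_embeds.mean(axis=0)
    sigma = ref_embeds.std(axis=0) + 1e-8
    z = np.abs((embeds - mu) / sigma).max(axis=1)           # stage 2
    keep &= z <= z_thresh
    return np.flatnonzero(keep)

# Toy data: 100 production samples, 4-class probabilities, 5-d embeddings.
probs = rng.dirichlet(np.ones(4) * 0.3, size=100)
ref = rng.normal(size=(500, 5))
emb = rng.normal(size=(100, 5))
emb[:10] += 6.0                     # simulate 10 drifted inputs
selected = two_stage_filter(probs, emb, ref)
```

Only the surviving indices would then be used for fine-tuning, with validation on a mix of recent production and original data to guard against catastrophic forgetting.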
"Trimming the Risk: Towards Reliable Continuous Training for Deep Learning Inspection Systems". arXiv - CS - Software Engineering, published 2024-09-13. DOI: https://doi.org/arxiv-2409.09108
As autonomous driving systems (ADS) have transformed our daily life, the safety of ADS is of growing significance. While various testing approaches have emerged to enhance ADS reliability, a crucial gap remains in understanding the causes of accidents. Such post-accident analysis is paramount and beneficial for enhancing ADS safety and reliability. Existing cyber-physical system (CPS) root cause analysis techniques are mainly designed for drones and cannot handle the unique challenges introduced by the more complex physical environments and deep learning models deployed in ADS. In this paper, we address the gap by offering a formal definition of the ADS root cause analysis problem and introducing ROCAS, a novel ADS root cause analysis framework featuring cyber-physical co-mutation. Our technique uniquely leverages both physical and cyber mutation to precisely identify the accident-triggering entity and pinpoint the misconfiguration of the target ADS responsible for an accident. We further design a differential analysis to identify the responsible module, reducing the search space for the misconfiguration. We study 12 categories of ADS accidents and demonstrate the effectiveness and efficiency of ROCAS in narrowing down the search space and pinpointing the misconfiguration. We also present detailed case studies on how the identified misconfiguration helps explain the rationale behind accidents.
"ROCAS: Root Cause Analysis of Autonomous Driving Accidents via Cyber-Physical Co-mutation" by Shiwei Feng, Yapeng Ye, Qingkai Shi, Zhiyuan Cheng, Xiangzhe Xu, Siyuan Cheng, Hongjun Choi, Xiangyu Zhang. arXiv - CS - Software Engineering, published 2024-09-12. DOI: https://doi.org/arxiv-2409.07774
In open-source software (OSS), software vulnerabilities have significantly increased. Although researchers have investigated the perspectives of vulnerability reporters and the security practices of OSS contributors, the perspectives of OSS maintainers on vulnerability management and platform security features remain understudied. In this paper, we investigate the perspectives of OSS maintainers who maintain projects listed in the GitHub Advisory Database. We explore this area by conducting two studies: identifying aspects through a listing survey ($n_1=80$) and gathering insights from semi-structured interviews ($n_2=22$). Of the 37 identified aspects, we find that supply chain mistrust and lack of automation for vulnerability management are the most challenging, and that barriers to adopting platform security features include a lack of awareness and the perception that they are not necessary. Surprisingly, we find that despite having been previously vulnerable, some maintainers still allow public vulnerability reporting, or ignore reports altogether. Based on our findings, we discuss implications for OSS platforms and how the research community can better support OSS vulnerability management efforts.
"A Mixed-Methods Study of Open-Source Software Maintainers On Vulnerability Management and Platform Security Features" by Jessy Ayala, Yu-Jye Tung, Joshua Garcia. arXiv - CS - Software Engineering, published 2024-09-12. DOI: https://doi.org/arxiv-2409.07669
An Guo, Yuan Zhou, Haoxiang Tian, Chunrong Fang, Yunjian Sun, Weisong Sun, Xinyu Gao, Anh Tuan Luu, Yang Liu, Zhenyu Chen
Autonomous driving systems (ADSs) have undergone remarkable development and are increasingly employed in safety-critical applications. However, recently reported data on fatal accidents involving ADSs suggests that the desired level of safety has not yet been fully achieved. Consequently, there is a growing need for more comprehensive and targeted testing approaches to ensure safe driving. Scenarios from real-world accident reports provide valuable resources for ADS testing, including critical scenarios and high-quality seeds. However, existing scenario reconstruction methods from accident reports often exhibit limited accuracy in information extraction. Moreover, due to the diversity and complexity of road environments, matching accident information with the simulation map data for reconstruction poses significant challenges. In this paper, we design and implement SoVAR, a tool for automatically generating road-generalizable scenarios from accident reports. SoVAR utilizes well-designed prompts with linguistic patterns to guide a large language model in extracting accident information from textual data. Subsequently, it formulates and solves accident-related constraints in conjunction with the extracted accident information to generate accident trajectories. Finally, SoVAR reconstructs accident scenarios on various map structures and converts them into test scenarios to evaluate its capability to detect defects in industrial ADSs. We experiment with SoVAR, using accident reports from the National Highway Traffic Safety Administration's database to generate test scenarios for the industrial-grade ADS Apollo. The experimental findings demonstrate that SoVAR can effectively generate generalized accident scenarios across different road structures. Furthermore, the results confirm that SoVAR identified 5 distinct safety violation types that contributed to the crash of Baidu Apollo.
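The trajectory-generation step can be illustrated with a much-simplified kinematic constraint: given headings, speeds, and a crash point extracted from a report, back-solve straight-line start positions so both vehicles' trajectories meet at the crash. The function name and all numbers below are hypothetical, chosen for the sketch; SoVAR's actual constraint formulation is richer.

```python
import numpy as np

def place_vehicles(crash_point, headings_deg, speeds, t_crash):
    """Back-solve initial positions for constant-velocity trajectories:
    start = crash_point - speed * t_crash * heading_unit_vector."""
    starts = []
    for h, v in zip(headings_deg, speeds):
        d = np.array([np.cos(np.radians(h)), np.sin(np.radians(h))])
        starts.append(np.asarray(crash_point) - v * t_crash * d)
    return starts

# e.g. ego heading east at 10 m/s, NPC heading north at 8 m/s,
# colliding at the origin 5 s into the scenario.
ego0, npc0 = place_vehicles((0.0, 0.0), [0.0, 90.0], [10.0, 8.0], 5.0)
```

Placements like these could then be mapped onto different road structures (intersections, merges) to generalize one reported accident into many test scenarios.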
"SoVAR: Building Generalizable Scenarios from Accident Reports for Autonomous Driving Testing". arXiv - CS - Software Engineering, published 2024-09-12. DOI: https://doi.org/arxiv-2409.08081
Despite the immense potential of AI-powered medical devices to revolutionize healthcare, concerns regarding their safety in life-critical applications remain. While the European regulatory framework provides a comprehensive approach to medical device software development, it falls short in addressing AI-specific considerations. This article proposes a model to bridge this gap by extending the general idea of the AI lifecycle with regulatory activities relevant to AI-enabled medical systems.
"Towards regulatory compliant lifecycle for AI-based medical devices in EU: Industry perspectives" by Tuomas Granlund, Vlad Stirbu, Tommi Mikkonen. arXiv - CS - Software Engineering, published 2024-09-12. DOI: https://doi.org/arxiv-2409.08006
Hoare-style inference rules for program constructs permit the copying of expressions and tests from program text into logical contexts. It is known that this requires care even for sequential programs but further issues arise for concurrent programs because of potential interference to the values of variables. The "rely-guarantee" approach does tackle the issue of recording acceptable interference and offers a way to provide safe inference rules. This paper shows how the algebraic presentation of rely-guarantee ideas can clarify and formalise the conditions for safely re-using expressions and tests from program text in logical contexts for reasoning about programs.
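A standard illustration of the problem (textbook rely-guarantee material, not taken from this paper): the sequential conditional rule copies the test b into the precondition of the branch, which becomes unsound under interference unless the rely condition keeps b stable.

```latex
% Sequential rule: after the test succeeds, b may be assumed in the branch.
\frac{\{P \land b\}\; S \;\{Q\}}
     {\{P\}\; \mathbf{if}\ b\ \mathbf{then}\ S\ \mathbf{fi}\;\{Q\}}
% Under interference, the environment may falsify b between its evaluation
% and the start of S. With a rely condition R, the rule is sound only if
% b is stable under R, i.e. an environment step satisfying R preserves b:
\frac{\{P \land b\}\; S \;\{Q\} \qquad (b \land R) \Rightarrow b'}
     {\{P\}\; \mathbf{if}\ b\ \mathbf{then}\ S\ \mathbf{fi}\;\{Q\}}
% where b' denotes b evaluated in the post-state of the environment step.
```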
"Handling expression evaluation under interference" by Ian J. Hayes, Cliff B. Jones, Larissa A. Meinicke. arXiv - CS - Software Engineering, published 2024-09-12. DOI: https://doi.org/arxiv-2409.07741
Researchers have investigated the bug bounty ecosystem from the lens of platforms, programs, and bug hunters. The perspectives of bug bounty report reviewers, especially those who historically lack a security background and have little to no funding for bug hunters, are currently understudied. In this paper, we primarily investigate the perspective of open-source software (OSS) maintainers who have used huntr, a bug bounty platform that pays bounties to bug hunters who find security bugs in GitHub projects, and who have had valid vulnerabilities patched as a result. We address this area by conducting three studies: identifying characteristics through a listing survey ($n_1=51$), ranking their importance with Likert-scale survey data ($n_2=90$), and conducting semi-structured interviews to dive deeper into real-world experiences ($n_3=17$). As a result, we categorize 40 identified characteristics into benefits, challenges, helpful features, and wanted features. We find that private disclosure and project visibility are the most important benefits, while hunters focused on money or CVEs and pressure to review are the most challenging to overcome. Surprisingly, lack of communication with bug hunters is the least challenging, and CVE creation support is the second-least helpful feature for OSS maintainers when reviewing bug bounty reports. We present recommendations to make the bug bounty review process more accommodating to open-source maintainers and identify areas for future work.
"A Deep Dive Into How Open-Source Project Maintainers Review and Resolve Bug Bounty Reports" by Jessy Ayala, Steven Ngo, Joshua Garcia. arXiv - CS - Software Engineering, published 2024-09-12. DOI: https://doi.org/arxiv-2409.07670
Timothy Huo, Ana Catarina Araújo, Jake Imanaka, Anthony Peruma, Rick Kazman
The widespread use of smartphones and tablets has made society heavily reliant on mobile applications (apps) for accessing various resources and services. These apps often handle sensitive personal, financial, and health data, making app security a critical concern for developers. While there is extensive research on software security topics like malware and vulnerabilities, less is known about the practical security challenges mobile app developers face and the guidance they seek. In this study, we mine Stack Overflow for questions on mobile app security, which we analyze using quantitative and qualitative techniques. The findings reveal that Stack Overflow is a major resource for developers seeking help with mobile app security, especially for Android apps, and identify seven main categories of security questions: Secured Communications, Database, App Distribution Service, Encryption, Permissions, File-Specific, and General Security. Insights from this research can inform the development of tools, techniques, and resources by the research and vendor community to better support developers in securing their mobile apps.
"Mobile App Security Trends and Topics: An Examination of Questions From Stack Overflow". arXiv - CS - Software Engineering, published 2024-09-12. DOI: https://doi.org/arxiv-2409.07926
Nowadays, companies are highly exposed to cyber security threats. In many industrial domains, protective measures are being deployed and actively supported by standards. However, the overall process remains largely dependent on a document-driven approach or partial modelling, which impacts both the efficiency and effectiveness of the cybersecurity process, starting from the risk analysis step. In this paper, we report on our experience in applying a model-driven approach to the initial risk analysis step in connection with later security testing. Our work relies on a common metamodel, which is used to map, synchronise, and ensure information traceability across different tools. We validate our approach using different scenarios relying on domain modelling, system modelling, risk assessment, and security testing tools.
"Building a Cybersecurity Risk Metamodel for Improved Method and Tool Integration" by Christophe Ponsard. arXiv - CS - Software Engineering, published 2024-09-12. DOI: https://doi.org/arxiv-2409.07906