2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE): Latest Papers

TCD: Statically Detecting Type Confusion Errors in C++ Programs
Changwei Zou, Yulei Sui, Hua Yan, Jingling Xue
For performance reasons, C++, albeit unsafe, is often the programming language of choice for developing software infrastructures. A serious type of security vulnerability in C++ programs is type confusion, which may lead to program crashes and control flow hijack attacks. While existing mitigation solutions almost exclusively rely on dynamic analysis techniques, which suffer from low code coverage and high overhead, static analysis has rarely been investigated. This paper presents TCD, a static type confusion detector built on top of a precise demand-driven field-, context-, and flow-sensitive pointer analysis. Unlike existing pointer analyses, TCD is type-aware: it not only preserves the type information in the pointed-to objects but also handles complex language features of C++ such as multiple inheritance and placement new, making it possible to reason about type casting in C++ programs. We have implemented TCD in LLVM and evaluated it using seven C++ applications (totaling 526,385 lines of C++ code) from Qt, a widely adopted C++ toolkit for creating GUIs and cross-platform software. TCD has found five type confusion bugs, including one reported previously in prior work and four new ones, in under 7.3 hours, with a low false positive rate of 28.2%.
DOI: https://doi.org/10.1109/ISSRE.2019.00037; published 2019-10-01
Citations: 4
Generic and Robust Localization of Multi-dimensional Root Causes
Zeyan Li, Dan Pei, Cheng Luo, Yiwei Zhao, Yongqian Sun, Kaixin Sui, Xiping Wang, Dapeng Liu, Xing Jin, Qi Wang
Operators of online software services periodically collect various measures with many attributes. When a measure becomes abnormal, indicating service problems such as reliability degradation, operators would like to rapidly and accurately localize the root cause attribute combinations within a huge multi-dimensional search space. Unfortunately, previous approaches are not generic or robust in that they suffer from impractical root cause assumptions, handle only directly collected measures but not derived ones, handle only anomalies with significant magnitudes but not those that are insignificant yet important, require manual parameter fine-tuning, or are too slow. This paper proposes a generic and robust multi-dimensional root cause localization approach, Squeeze, that overcomes all of the above limitations, the first in the literature to do so. Through our novel bottom-up then top-down searching strategy and the techniques based on our proposed generalized ripple effect and generalized potential score, Squeeze is able to reach a good trade-off between search speed and accuracy in a generic and robust manner. Case studies in several banks and an Internet company show that Squeeze can localize root causes much more rapidly and accurately than the traditional manual analysis. Furthermore, our extensive experiments on semi-synthetic datasets show that the F1-score of Squeeze outperforms previous approaches by 0.4 on average, while its localization time is only about 10 seconds.
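To make the size of the search space and the F1-score metric concrete, here is a minimal Python sketch. It is not the Squeeze algorithm: it only enumerates attribute combinations by brute force and scores a predicted root cause set against a ground truth. The attribute names (`province`, `isp`), their values, and the helper names are hypothetical and chosen purely for illustration.

```python
from itertools import product


def attribute_combinations(attributes):
    """Enumerate every attribute combination; a missing key means 'any value'."""
    names = list(attributes)
    # Each attribute is either fixed to one of its values or left as a wildcard (None).
    choices = [list(attributes[name]) + [None] for name in names]
    combos = []
    for values in product(*choices):
        combo = {n: v for n, v in zip(names, values) if v is not None}
        if combo:  # skip the all-wildcard combination
            combos.append(combo)
    return combos


def f1_score(predicted, actual):
    """F1-score of a predicted root cause set against the true root cause set."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


# Hypothetical attributes: 2 provinces and 3 ISPs.
attributes = {"province": ["Beijing", "Shanghai"], "isp": ["A", "B", "C"]}
search_space = attribute_combinations(attributes)
print(len(search_space))  # (2 + 1) * (3 + 1) - 1 = 11

# One of two predictions is correct, and one of two true causes is found.
predicted = {("province=Beijing",), ("isp=A",)}
actual = {("province=Beijing",), ("isp=B",)}
print(f1_score(predicted, actual))  # 0.5
```

The count grows multiplicatively with every added attribute, which is why exhaustive enumeration like this stops being practical on real multi-dimensional data.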
DOI: https://doi.org/10.1109/ISSRE.2019.00015; published 2019-10-01
Citations: 21
Supervised Representation Learning Approach for Cross-Project Aging-Related Bug Prediction
Xiaohui Wan, Zheng Zheng, Fangyun Qin, Yu Qiao, Kishor S. Trivedi
Software aging, which is caused by Aging-Related Bugs (ARBs), tends to occur in long-running systems and may lead to performance degradation and an increasing failure rate during software execution. ARB prediction can help developers discover and remove ARBs, thus alleviating the impact of software aging. However, ARB-prone files make up only a small percentage of all analyzed files, so it is usually difficult to gather sufficient ARB data within a single project. To overcome the limited availability of training data, several researchers have recently developed cross-project models for ARB prediction. A key point for cross-project models is to learn a good representation for instances in different projects. Nevertheless, most of the previous approaches neither consider the reconstruction property of the new representation nor encode the source samples' label information when learning the representation. To address these shortcomings, we propose a Supervised Representation Learning Approach (SRLA), which is based on a double encoding-layer autoencoder, to perform cross-project ARB prediction. Moreover, we present a transfer cross-validation framework to select the hyper-parameters of cross-project models. Experiments on three large open-source projects demonstrate the effectiveness and superiority of our approach compared with the state-of-the-art approach TLAP.
DOI: https://doi.org/10.1109/ISSRE.2019.00025; published 2019-10-01
Citations: 14
Learning Marked Markov Modulated Poisson Processes for Online Predictive Analysis of Attack Scenarios
L. Carnevali, Francesco Santoni, E. Vicario
Runtime predictive analysis of quantitative models can support software reliability in various application scenarios. The spread of logging technologies promotes approaches where such models are learned from observed events. We consider a system visiting transient states of a hidden process until reaching a final state and producing observations with stochastic arrival times and types conditioned by visited states, and we abstract it as a marked Markov modulated Poisson Process (MMMPP) with left-to-right structure. We present an Expectation-Maximization (EM) algorithm that learns the MMMPP parameters from observation sequences acquired in repeated execution of the transient behavior, and we use the model at runtime to infer the current state of the process from actual observed events and to dynamically evaluate the remaining time to the final state. The approach is illustrated using synthetic datasets generated from a stochastic attack tree from the literature, enriched with an observation model associating each state with expected statistics of observation types and arrival times. Accuracy of prediction is evaluated under different variability of hidden-state sojourn durations and of the observation arrival process, and compared against previous literature that mainly exploits either the timing or the types of observed events.
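The kind of process described above can be sketched with a small simulator. This is only a generative illustration under stated assumptions, not the paper's EM learning algorithm: it assumes exponentially distributed sojourn times, a strictly left-to-right chain of transient states, and made-up rates and mark names (`scan`, `login_fail`, `priv_esc`). The function name `simulate_mmmpp` is likewise hypothetical.

```python
import random


def simulate_mmmpp(exit_rates, arrival_rates, mark_probs, rng):
    """Simulate one run of a left-to-right marked Markov modulated Poisson process.

    exit_rates[i]    -- rate of leaving hidden state i (assumed exponential sojourn)
    arrival_rates[i] -- Poisson rate of observations while in state i
    mark_probs[i]    -- dict mapping observation type (mark) to its probability in state i
    Returns (events, total_time), where events is a list of (time, state, mark).
    """
    now = 0.0
    events = []
    for state, (mu, lam, probs) in enumerate(zip(exit_rates, arrival_rates, mark_probs)):
        leave_at = now + rng.expovariate(mu)
        # Poisson arrivals: exponential gaps until the state is left.
        arrival = now + rng.expovariate(lam)
        while arrival < leave_at:
            mark = rng.choices(list(probs), weights=list(probs.values()))[0]
            events.append((arrival, state, mark))
            arrival += rng.expovariate(lam)
        now = leave_at
    return events, now  # 'now' is the time at which the final state is reached


# Hypothetical three-stage scenario; all numbers are illustrative assumptions.
rng = random.Random(0)
events, total_time = simulate_mmmpp(
    exit_rates=[0.5, 0.2, 1.0],
    arrival_rates=[2.0, 5.0, 1.0],
    mark_probs=[
        {"scan": 0.9, "login_fail": 0.1},
        {"scan": 0.2, "login_fail": 0.8},
        {"priv_esc": 1.0},
    ],
    rng=rng,
)
for time, state, mark in events[:5]:
    print(f"t={time:.2f} state={state} mark={mark}")
print(f"final state reached at t={total_time:.2f}")
```

An observer sees only the times and marks, not the `state` column; inferring the hidden state and the remaining time `total_time - t` from those observations is the prediction task the abstract describes.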
DOI: https://doi.org/10.1109/ISSRE.2019.00028; published 2019-10-01
Citations: 1
A Safety Analysis Method for Perceptual Components in Automated Driving
Rick Salay, Matt Angus, K. Czarnecki
The use of machine learning (ML) is increasing in many sectors of safety-critical software development, in particular for the perceptual components of automated driving (AD) functionality. Although some traditional safety engineering techniques such as FTA and FMEA are applicable to ML components, the unique characteristics of ML create challenges. In this paper, we propose a novel safety analysis method called Classification Failure Mode Effects Analysis (CFMEA), which is specialized to assess classification-based perception in AD. Specifically, it defines a systematic way to assess the risk due to classification failure under adversarial attacks or varying degrees of classification uncertainty across the perception-control linkage. We first present the theoretical and methodological foundations for CFMEA, and then demonstrate it by applying it to an AD case study using semantic segmentation perception trained with the Cityscapes driving dataset. Finally, we discuss how CFMEA results could be used to improve an ML model.
DOI: https://doi.org/10.1109/ISSRE.2019.00013; published 2019-10-01
Citations: 14
Mirage: Towards a Metasploit-Like Framework for IoT
Romain Cayre, V. Nicomette, G. Auriol, E. Alata, M. Kaâniche, G. Marconato
Internet of Things (IoT) devices are nowadays widely used in individual homes and factories, and securing these new systems has become a priority. However, conducting security audits of these connected objects based on experimental evaluation is a challenging task: it requires the use of heterogeneous hardware components, leading to a set of specialised software tools that are generally incompatible with each other and often complex to use. In this paper, we present a security audit and penetration testing framework called Mirage. This framework, written in Python, is dedicated to the analysis of wireless communications commonly used by IoT devices, and provides a generic, modular, unified and low-level audit environment that is easy to adapt to new protocols. The paper describes the software architecture of Mirage, its goals and main features, and presents a concrete example of a security audit performed with this framework.
DOI: https://doi.org/10.1109/ISSRE.2019.00034; published 2019-10-01
Citations: 7
Propheticus: Machine Learning Framework for the Development of Predictive Models for Reliable and Secure Software
João R. Campos, M. Vieira, E. Costa
The growing complexity of software calls for innovative solutions that support the deployment of reliable and secure software. Machine Learning (ML) has shown its applicability to various complex problems and is frequently used in the dependability domain to support both systems design and verification activities. However, using ML is complex and highly dependent on the problem at hand, increasing the probability of mistakes that compromise the results. In this paper, we introduce Propheticus, an ML framework that can be used to create predictive models for reliable and secure software systems. Propheticus attempts to abstract the complexity of ML whilst being easy to use and accommodating the needs of the users. To demonstrate its use, we present two case studies (vulnerability prediction and online failure prediction) that show how it can considerably ease and expedite a thorough ML workflow.
DOI: https://doi.org/10.1109/ISSRE.2019.00026; published 2019-10-01
Citations: 8
How to Explain a Patch: An Empirical Study of Patch Explanations in Open Source Projects
Jingjing Liang, Yaozong Hou, Shurui Zhou, Junjie Chen, Y. Xiong, Gang Huang
Bugs are inevitable in software development and maintenance processes. Recently, much research effort has been devoted to automatic program repair, aiming to reduce the effort of debugging. However, since it is difficult to ensure that the generated patches meet all quality requirements such as correctness, developers still need to review the patches. In addition, current techniques produce only patches without explanations, making it difficult for developers to understand them. Therefore, we believe a more desirable approach should generate not only the patch but also an explanation of the patch. To generate a patch explanation, it is important to first understand how patches are explained in practice. In this paper, we explore how developers explain their patches by manually analyzing 300 merged bug-fixing pull requests from six projects on GitHub. Our contribution is twofold. First, we build a patch explanation model, which summarizes the elements in a patch explanation and their corresponding expressive forms. Second, we conduct a quantitative analysis to understand the distributions of elements and the correlation between elements and their expressive forms.
DOI: https://doi.org/10.1109/ISSRE.2019.00016; published 2019-10-01
Citations: 5
Benefits and Challenges of Model-Based Software Engineering: Lessons Learned Based on Qualitative and Quantitative Findings
K. Goseva-Popstojanova, Thomas Kyanko, Noble Nkwocha
Even though Model-based Software Engineering (MBSwE) techniques and Autogenerated Code (AGC) have been increasingly used to produce complex software systems, there is only anecdotal knowledge about the state of the practice. Furthermore, there is a lack of empirical studies that explore the potential quality improvements due to the use of these techniques. This paper presents in-depth qualitative findings about development and Software Assurance (SWA) practices and a detailed quantitative analysis of software bug reports of a NASA mission that used MBSwE and AGC. The mission's flight software is a combination of handwritten code and AGC developed by two different approaches: one based on state chart models (AGC-M) and another on specification dictionaries (AGC-D). The empirical analysis of fault proneness is based on 380 closed bug reports created by software developers. Our main findings include: (1) MBSwE and AGC provide some benefits, but also impose challenges. (2) SWA done only at the model level is not sufficient. AGC should also be tested, and the models and AGC should always be kept in sync. AGC must not be changed manually. (3) Fixes made to address an individual bug report were spread both across multiple modules and across multiple files. On average, each bug report led to fixes in 1.4 modules, that is, 3.4 files. (4) Most bug reports led to changes in more than one type of file. The majority of changes to auto-generated source code files were made in conjunction with changes to either files with state chart models or XML files derived from dictionaries. (5) For newly developed files, AGC-M and handwritten code were of similar quality, while AGC-D files were the least fault prone.
DOI: 10.1109/ISSRE.2019.00048 (https://doi.org/10.1109/ISSRE.2019.00048). Published 2019-10-01 in 2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE).
Citations: 1
Title Page iii
DOI: 10.1109/issre.2019.00002 (https://doi.org/10.1109/issre.2019.00002). Published 2019-10-01 in 2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE).
Citations: 0