
Latest publications: 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)

Synthesis of Infinite-State Systems with Random Behavior
Andreas Katis, Grigory Fedyukovich, Jeffrey Chen, D. Greve, Sanjai Rayadurgam, M. Whalen
Diversity in the exhibited behavior of a given system is a desirable characteristic in a variety of application contexts. Synthesis of conformant implementations often proceeds by discovering witnessing Skolem functions, which are traditionally deterministic. In this paper, we present a novel Skolem extraction algorithm to enable synthesis of witnesses with random behavior and demonstrate its applicability in the context of reactive systems. The synthesized solutions are guaranteed by design to meet the given specification, while exhibiting a high degree of diversity in their responses to external stimuli. Case studies demonstrate how our proposed framework unveils a novel application of synthesis in model-based fuzz testing to generate fuzzers of competitive performance to general-purpose alternatives, as well as the practical utility of synthesized controllers in robot motion planning problems.
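The contrast between deterministic and randomized Skolem witnesses can be illustrated with a toy example (ours, not the paper's algorithm): for the specification "for every input x there exists an output y with y > x", both functions below are valid witnesses, but only the second exhibits behavioral diversity.

```python
import random

# Toy specification: for every input x, a valid output y must satisfy y > x.
def spec(x, y):
    return y > x

# Deterministic Skolem witness: the same input always yields the same output.
def det_witness(x):
    return x + 1

# Randomized witness: still satisfies the spec by construction, but shows
# diverse responses to the same external stimulus.
def rand_witness(x, rng=None):
    rng = rng or random.Random()
    return x + rng.randint(1, 100)

# Both witnesses meet the specification on every input.
for x in range(5):
    assert spec(x, det_witness(x))
    assert spec(x, rand_witness(x))
```

This correct-by-construction diversity is what makes such witnesses attractive as fuzzers: every generated response is valid, yet responses vary.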
DOI: 10.1145/3324884.3416586 · Published: 2020-09-01
Citations: 2
Automated Third-Party Library Detection for Android Applications: Are We There Yet?
Xian Zhan, Lingling Fan, Tianming Liu, Sen Chen, Li Li, Haoyu Wang, Yifei Xu, Xiapu Luo, Yang Liu
Third-party libraries (TPLs) have become a significant part of the Android ecosystem. Developers can employ various TPLs with different functionalities to facilitate their app development. Unfortunately, the popularity of TPLs also brings new challenges and even threats. TPLs may carry malicious or vulnerable code, which can infect popular apps to pose threats to mobile users. Besides, the code of third-party libraries can constitute noise in some downstream tasks (e.g., malware and repackaged app detection). Thus, researchers have developed various tools to identify TPLs. However, no existing work has studied these TPL detection tools in detail; different tools focus on different applications and differ in performance, but little is known about them. To better understand existing TPL detection tools and dissect TPL detection techniques, we conduct a comprehensive empirical study to fill the gap by evaluating and comparing all publicly available TPL detection tools based on four criteria: effectiveness, efficiency, code obfuscation-resilience capability, and ease of use. We reveal their advantages and disadvantages based on a systematic and thorough empirical study. Furthermore, we also conduct a user study to evaluate the usability of each tool. The results show that LibScout outperforms the others regarding effectiveness, LibRadar takes less time than the others and is also regarded as the easiest to use, and LibPecker performs best in defending against code obfuscation techniques. We further summarize the lessons learned from different perspectives, including users, tool implementation, and researchers. Besides, we enhance these open-sourced tools by fixing their limitations to improve their detection ability. We also build an extensible framework that integrates all existing available TPL detection tools, providing an online service for the research community. We make the evaluation dataset and enhanced tools publicly available. We believe our work provides a clear picture of existing TPL detection techniques and also gives a road map for future directions.
DOI: 10.1145/3324884.3416582 · Published: 2020-09-01
Citations: 38
Discovering UI Display Issues with Visual Understanding
Zhe Liu
GUI complexity poses a great challenge to GUI implementation. According to our pilot study of crowdtesting bug reports, display issues such as text overlap, blurred screens, and missing images often occur during GUI rendering on different devices due to software or hardware compatibility issues. They negatively influence app usability, resulting in poor user experience. To detect these issues, we propose a novel approach, OwlEye, based on deep learning for modelling the visual information of GUI screenshots. OwlEye can therefore detect GUIs with display issues and also locate the detailed region of the issue in the given GUI to guide developers in fixing the bug. We manually construct a large-scale labelled dataset of 4,470 GUI screenshots with UI display issues. We develop a heuristics-based data augmentation method and a GAN-based data augmentation method to boost the performance of OwlEye. At present, the evaluation demonstrates that OwlEye achieves 85% precision and 84% recall in detecting UI display issues, and 90% accuracy in localizing these issues.
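The reported numbers follow the standard definitions of precision and recall for a detector; the sketch below recomputes them from hypothetical confusion-matrix counts (the counts are ours, chosen only to match the reported magnitudes, not data from the paper).

```python
def precision_recall(tp, fp, fn):
    # precision: fraction of GUIs flagged by the detector that truly have
    # display issues; recall: fraction of GUIs with display issues that
    # the detector actually flagged.
    return tp / (tp + fp), tp / (tp + fn)

# Hypothetical counts (not from the paper), for illustration only:
p, r = precision_recall(tp=85, fp=15, fn=16)
# p = 85/100 = 0.85 and r = 85/101 ≈ 0.84, the magnitudes OwlEye reports.
```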
DOI: 10.1145/3324884.3418917 · Published: 2020-09-01
Citations: 4
Automatic Extraction of Cause-Effect-Relations from Requirements Artifacts
Julian Frattini, Maximilian Junker, M. Unterkalmsteiner, D. Méndez
Background: The detection and extraction of causality from natural language sentences have shown great potential in various fields of application. The field of requirements engineering is eligible for multiple reasons: (1) requirements artifacts are primarily written in natural language, (2) causal sentences convey essential context about the subject of requirements, and (3) extracted and formalized causality relations are usable for a (semi-)automatic translation into further artifacts, such as test cases. Objective: We aim to understand the value of interactive causality extraction based on syntactic criteria in the context of requirements engineering. Method: We developed a prototype of a system for automatic causality extraction and evaluated it by applying it to a set of publicly available requirements artifacts, determining whether the automatic extraction reduces the manual effort of requirements formalization. Result: During the evaluation we analyzed 4457 natural language sentences from 18 requirements documents, 558 of which were causal (12.52%). For the best-evaluated requirements document, the approach automatically extracted 48.57% of the cause-effect graphs on average, which demonstrates its feasibility. Limitation: The feasibility of the approach has been proven in theory, but scaling it up for practical use remains unexplored. Evaluating the applicability of automatic causality extraction for a requirements engineer is left for future research. Conclusion: A syntactic approach to causality extraction is viable in the context of requirements engineering and can aid a pipeline towards the automatic generation of further artifacts from requirements artifacts.
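For a flavour of what a syntactic criterion can look like, here is a naive cue-phrase baseline of our own (not the authors' prototype): a sentence is flagged as causal if it contains a common causal cue phrase.

```python
import re

# Common causal cue phrases; the list is illustrative, not exhaustive.
CAUSAL_CUES = re.compile(
    r"\b(because|due to|as a result|therefore|if|when|since|leads? to|causes?)\b",
    re.IGNORECASE,
)

def is_causal(sentence: str) -> bool:
    """Flag a requirements sentence as causal if it contains a cue phrase."""
    return CAUSAL_CUES.search(sentence) is not None

assert is_causal("If the sensor fails, the system shall raise an alarm.")
assert not is_causal("The system shall log all user actions.")
```

A real extractor would additionally parse the sentence to split it into cause and effect spans; the cue-phrase test only decides whether that parsing step is worth attempting.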
DOI: 10.1145/3324884.3416549 · Published: 2020-09-01
Citations: 8
A Deep Multitask Learning Approach for Requirements Discovery and Annotation from Open Forum
Mingyang Li, Lin Shi, Ye Yang, Qing Wang
The ability to rapidly learn and adapt to evolving user needs is key to modern business success. Existing methods use text mining and machine learning techniques to analyze user comments and feedback, and are often constrained by heavy reliance on manually codified rules or insufficient training data. Multitask learning (MTL) is an effective approach with many successful applications, and has the potential to address these limitations of requirements analysis tasks. In this paper, we propose a deep MTL-based approach, DEMAR, to address these limitations when discovering requirements from massive issue reports and annotating the sentences in support of automated requirements analysis. DEMAR consists of three main phases: (1) the data augmentation phase, for data preparation, allowing data sharing beyond single-task learning; (2) the model construction phase, for constructing the MTL-based model for the requirements discovery and requirements annotation tasks; and (3) the model training phase, enabling eavesdropping via a loss function shared between the two related tasks. Evaluation results from eight open-source projects show that the proposed multitask learning approach outperforms two state-of-the-art approaches (CNC and FRA) and six common machine learning algorithms, with 91% precision and 83% recall for the requirements discovery task, and 83% overall accuracy for the requirements annotation task. The proposed approach provides a novel and effective way to jointly learn two related requirements analysis tasks. We believe that it also sheds light on further directions for exploring multitask learning in solving other software engineering problems.
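The "eavesdropping by shared loss function" idea can be sketched as two task heads over one shared encoder, trained on the sum of the per-task losses. This is a generic MTL sketch with made-up dimensions, not DEMAR's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared encoder weights and two task-specific heads (hypothetical sizes).
W_shared = rng.normal(size=(8, 4))  # sentence features -> shared representation
W_task1 = rng.normal(size=(4, 2))   # head for requirements discovery (2 classes)
W_task2 = rng.normal(size=(4, 3))   # head for requirements annotation (3 classes)

def forward(x):
    h = np.tanh(x @ W_shared)       # shared representation used by both tasks
    return h @ W_task1, h @ W_task2

def joint_loss(logits1, y1, logits2, y2):
    # Cross-entropy for each task; summing them lets gradients from either
    # task update the shared encoder -- the "eavesdropping" between tasks.
    def xent(logits, y):
        z = logits - logits.max()
        p = np.exp(z) / np.exp(z).sum()
        return -np.log(p[y])
    return xent(logits1, y1) + xent(logits2, y2)

x = rng.normal(size=8)              # one (hypothetical) sentence's features
l1, l2 = forward(x)
loss = joint_loss(l1, 0, l2, 2)
```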
DOI: 10.1145/3324884.3416627 · Published: 2020-09-01
Citations: 9
Closer to the Edge: Testing Compilers More Thoroughly by Being Less Conservative About Undefined Behaviour
Karine Even-Mendoza, Cristian Cadar, A. Donaldson
Randomised compiler testing techniques require a means of generating programs that are free from undefined behaviour (UB) in order to reliably reveal miscompilation bugs. Existing program generators such as Csmith heavily restrict the form of generated programs in order to achieve UB-freedom. We hypothesise that the idiomatic nature of such programs limits the test coverage they can offer. Our idea is to generate less restricted programs that are still UB-free: programs that get closer to the edge of UB, but that do not quite cross it. We present preliminary support for our idea via a prototype tool, Csmithedge, which uses simple dynamic analysis to determine where Csmith has been too conservative in its use of the safe math wrappers that guarantee UB-freedom for arithmetic operations. By eliminating redundant wrappers, Csmithedge was able to discover two new miscompilation bugs in GCC that could not be found via intensive testing using regular Csmith, and to achieve substantial differences in code coverage on GCC compared with regular Csmith.
CCS CONCEPTS: • Software and its engineering → Compilers; Software verification and validation.
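For context, safe math wrappers replace raw C arithmetic with guarded versions so that no executed operation triggers UB (e.g. signed overflow). A minimal Python model of the idea follows; the fall-back-to-one-operand convention is an assumption for illustration, not Csmith's exact code.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def safe_add_int32(a, b):
    # Model of a safe wrapper: perform the addition only when the
    # mathematical result fits in a signed 32-bit int (where C's plain `+`
    # would otherwise be undefined behaviour); fall back to one operand
    # otherwise. The fallback value chosen here is an assumption.
    if INT_MIN <= a + b <= INT_MAX:
        return a + b
    return a

assert safe_add_int32(1, 2) == 3
assert safe_add_int32(INT_MAX, 1) == INT_MAX  # plain `+` would be UB in C
```

Csmithedge's observation is that many such guards are redundant at runtime (the overflow branch never fires), so removing them yields less idiomatic, closer-to-the-edge test programs that are still UB-free.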
DOI: 10.1145/3324884.3418933 · Published: 2020-09-01
Citations: 9
Applying Learning Techniques to Oracle Synthesis
F. Molina
Software reliability is a primary concern in the construction of software, and thus a fundamental component of the definition of software quality. Analyzing software reliability requires a specification of the intended behavior of the software under analysis. Unfortunately, software often lacks such specifications. This issue seriously diminishes the analyzability of software with respect to its reliability. Thus, novel techniques to capture the intended software behavior in the form of specifications would allow us to exploit them for automated reliability analysis. Our research focuses on the application of learning techniques to automatically distinguish correct from incorrect software behavior. The aim is to decrease the developer's effort in specifying oracles by generating them from actual software behaviors instead.
DOI: 10.1145/3324884.3415287 · Published: 2020-09-01
Citations: 1
Attend and Represent: A Novel View on Algorithm Selection for Software Verification
Cedric Richter, H. Wehrheim
Today, a plethora of different software verification tools exists. Software developers with a concrete verification task at hand thus face the problem of algorithm selection. Existing algorithm selectors for software verification typically use handpicked program features together with either (1) manually designed selection heuristics or (2) machine-learned strategies. While the first approach is not transferable to other selection problems, the second lacks interpretability, i.e., insight into the reasons for choosing particular tools. In this paper, we propose a novel approach to algorithm selection for software verification. Our approach employs representation learning together with an attention mechanism. Representation learning circumvents feature engineering, i.e., avoids the handpicking of program features. Attention permits a form of interpretability of the learned selectors. We have implemented our approach and experimentally evaluated and compared it with existing approaches. The evaluation shows that representation learning not only outperforms manual feature engineering, but also enables transferability of the learning model to other selection tasks.
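How attention lends interpretability can be seen in miniature: softmax-normalized relevance scores over parts of a program sum to one, and the largest weight indicates what drove the tool choice. The scores and aspect names below are hypothetical, not from the paper.

```python
import math

def softmax(xs):
    # Numerically stable softmax: shift by the max before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# Hypothetical relevance scores a learned model might assign to program
# aspects when selecting a verifier for one task:
scores = {"loops": 2.0, "pointers": 0.5, "arrays": 1.0}
weights = dict(zip(scores, softmax(list(scores.values()))))

# The largest weight points at the aspect that drove the selection --
# the kind of insight attention adds over a black-box selector.
most_attended = max(weights, key=weights.get)
```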
DOI: 10.1145/3324884.3416633 · Published: 2020-09-01
Citations: 7
Speeding up GUI Testing by On-Device Test Generation
N. P. Borges, J. Rau, A. Zeller
When generating GUI tests for Android apps, interactions are typically generated on a separate test computer and then executed on an actual Android device. While this approach is efficient in the sense that apps and interactions execute quickly, the communication overhead between the test computer and the device slows down testing considerably. In this work, we present DD-2, a test generator for Android that tests other apps on the device itself using Android accessibility services. In our experiments, DD-2 proved to be 3.2 times faster than its computer-device counterpart, while sharing the same source code.
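The claimed speedup comes from removing the per-action round trip between the test computer and the device. A toy cost model makes the arithmetic concrete; the latencies below are assumptions for illustration, not measurements from the paper.

```python
def total_test_time_ms(n_actions, gen_ms, exec_ms, comm_ms):
    """Toy cost model: each GUI action is generated, (optionally) shipped
    over the test-computer/device channel, then executed on the device."""
    return n_actions * (gen_ms + exec_ms + comm_ms)

# Illustrative per-action latencies (assumed, not from the paper):
# generation 5 ms, on-device execution 50 ms, channel round trip 120 ms.
remote = total_test_time_ms(1000, gen_ms=5, exec_ms=50, comm_ms=120)
on_device = total_test_time_ms(1000, gen_ms=5, exec_ms=50, comm_ms=0)
speedup = remote / on_device  # communication dominates the remote setup
```

Under these assumed numbers the on-device generator wins roughly threefold, purely by dropping `comm_ms` — the same mechanism DD-2 exploits via accessibility services.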
DOI: 10.1145/3324884.3415302
Citations: 2
Retrieve and Refine: Exemplar-based Neural Comment Generation
Bolin Wei, Yongming Li, Ge Li, Xin Xia, Zhi Jin
Code comment generation, which aims to automatically generate natural language descriptions for source code, is a crucial task in the field of automatic software development. Traditional comment generation methods use manually crafted templates or information retrieval (IR) techniques to generate summaries for source code. In recent years, neural network-based methods, which leverage the acclaimed encoder-decoder deep learning framework to learn comment generation patterns from a large-scale parallel code corpus, have achieved impressive results. However, these emerging methods only take code-related information as input. Software reuse is common in the process of software development, meaning that comments of similar code snippets are helpful for comment generation. Inspired by the IR-based and template-based approaches, in this paper we propose a neural comment generation approach in which we use the existing comments of similar code snippets as exemplars to guide comment generation. Specifically, given a piece of code, we first use an IR technique to retrieve a similar code snippet and treat its comment as an exemplar. Then we design a novel seq2seq neural network that takes the given code, its AST, its similar code, and its exemplar as input, and leverages the information from the exemplar to assist in the target comment generation based on the semantic similarity between the source code and the similar code. We evaluate our approach on a large-scale Java corpus, which contains about 2M samples, and experimental results demonstrate that our model outperforms the state-of-the-art methods by a substantial margin.
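The first stage of the pipeline — retrieving a similar snippet and taking its comment as the exemplar — can be sketched with a simple token-overlap retriever. The Jaccard similarity used here is one plausible IR choice, not necessarily the metric the authors use:

```python
import re

def tokenize(code):
    """Split code into a set of identifier-like tokens."""
    return set(re.findall(r"[A-Za-z_]\w*", code))

def retrieve_exemplar(query_code, corpus):
    """Return the comment of the most token-similar corpus snippet.

    corpus: list of (code, comment) pairs. Similarity is Jaccard overlap
    of token sets -- a stand-in for the paper's IR component.
    """
    q = tokenize(query_code)
    def jaccard(code):
        o = tokenize(code)
        return len(q & o) / len(q | o) if (q | o) else 0.0
    _, best_comment = max(corpus, key=lambda pair: jaccard(pair[0]))
    return best_comment

corpus = [
    ("def add(a, b): return a + b", "Add two numbers."),
    ("def open_file(path): return open(path)", "Open a file."),
]
exemplar = retrieve_exemplar("def add_three(a, b, c): return a + b + c", corpus)
```

The retrieved `exemplar` would then be fed, alongside the code and its AST, into the seq2seq model described in the abstract.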
DOI: 10.1145/3324884.3416578
Citations: 21
Journal: 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)