2019 26th Asia-Pacific Software Engineering Conference (APSEC): Latest Papers, Page 3

Integrating Static Program Analysis Tools for Verifying Cautions of Microcontroller

2019 26th Asia-Pacific Software Engineering Conference (APSEC)

Pub Date : 2019-12-01 DOI: 10.1109/APSEC48747.2019.00021

Thuy Nguyen, Toshiaki Aoki, Takashi Tomita, Junpei Endo

Microcontrollers are usually supplied with hardware manuals, in which information that requires special attention is emphasized as cautions. Currently, these cautions are verified manually, as no single tool can directly handle this task. This research aims to automate the verification of these cautions as much as possible. First, we investigate two sections of the hardware manual of a popular microcontroller that contain a considerable number of required cautions, in order to obtain the typical cautions of microcontrollers. Second, we analyze these cautions and categorize them into several groups. We then propose a semi-automatic approach that uses the assertion-based method and integrates two existing static program analysis tools (i.e., Cobra and the Eva plugin of Frama-C) to verify the cautions. To show the applicability of this approach, we conduct two experiments, one with benchmark source code and one with industrial source code provided by Aisin comCruise Co., Ltd. The results show that this approach detects all violations in the benchmark program and misses only one expected violation in the industrial project.

Citations: 4

More Secure Collaborative APIs Resistant to Flush+Reload and Flush+Flush Attacks on ARMv8-A

2019 26th Asia-Pacific Software Engineering Conference (APSEC)

Pub Date : 2019-12-01 DOI: 10.1109/APSEC48747.2019.00062

Jingquan Ge, Neng Gao, Chenyang Tu, Ji Xiang, Zeyi Liu

With the popularity of smart devices such as mobile phones and tablets, the security of the widely used ARMv8-A processor has received more and more attention. Flush+Reload and Flush+Flush cache attacks have become two of the most important security threats due to their low noise and high resolution. Researchers have proposed many defense methods to resist these attacks, but the existing methods have various shortcomings. Runtime defense methods that use hardware performance counters cannot detect attacks fast enough, cannot effectively detect Flush+Flush, or suffer from a high false positive rate. Static code analysis schemes are powerless against obfuscation techniques. Approaches that permanently reduce the timer resolution can only be used in browser products and cannot be applied at the system level. In this paper, we design two more secure collaborative APIs, a flush operation API and a high resolution time API, which can resist Flush+Reload and Flush+Flush attacks. When the flush operation API is called, the high resolution time API temporarily reduces its resolution and then restores it automatically. Moreover, the flush operation API can also detect and handle suspected Flush+Reload and Flush+Flush attacks. The attack and performance comparison experiments show that the two APIs we designed are safer and that the performance losses are acceptable.
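
The collaboration between the two APIs can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `CollaborativeTimer`, the injected `clock` callable, and the `coarse_step_ns` and `window_ns` parameters are all assumptions made for illustration, and the actual cache flush and the attack detection logic are omitted.

```python
from typing import Callable, Optional


class CollaborativeTimer:
    """Toy model: a flush call temporarily coarsens the timer (hypothetical design)."""

    def __init__(self, clock: Callable[[], int],
                 coarse_step_ns: int = 1000, window_ns: int = 5000) -> None:
        self._clock = clock                    # source of "true" time in nanoseconds
        self._coarse_step_ns = coarse_step_ns  # granularity while degraded (assumed value)
        self._window_ns = window_ns            # how long the degradation lasts (assumed value)
        self._last_flush_ns: Optional[int] = None

    def flush(self, address: int) -> None:
        """Flush operation API: only records when the flush happened.

        A real implementation would also perform the cache maintenance
        operation on `address`; that part is left out of this sketch.
        """
        self._last_flush_ns = self._clock()

    def now_ns(self) -> int:
        """High resolution time API: coarse shortly after a flush, precise otherwise."""
        t = self._clock()
        degraded = (self._last_flush_ns is not None
                    and t - self._last_flush_ns < self._window_ns)
        if degraded:
            return t - (t % self._coarse_step_ns)  # round down to the coarse step
        return t  # resolution is restored automatically once the window has passed
```

Injecting the clock keeps the sketch deterministic and easy to test; the rounding step is what denies an attacker the fine-grained timing that Flush+Reload and Flush+Flush rely on.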

Citations: 2

Anatomizing Android Malwares

2019 26th Asia-Pacific Software Engineering Conference (APSEC)

Pub Date : 2019-12-01 DOI: 10.1109/APSEC48747.2019.00067

Anand Tirkey, R. Mohapatra, L. Kumar

Android OS, being the popular choice of the majority of users, also faces the constant risk of breaches of confidentiality, integrity and availability (CIA). Effective mitigation efforts need to be identified in order to protect and uphold the CIA triad model within the Android ecosystem. In this paper, we propose a novel method of Android malware classification using object-oriented software metrics and machine learning algorithms. First, Android apps are decompiled and object-oriented metrics are obtained. The VirusTotal service is used to tag each app as either malware or benign. The object-oriented metrics and malware tags are combined into a dataset. Eighty different machine-learned models are trained on 5,774 Android apps. We evaluate the performance and stability of these models using their malware classification accuracy and AUC (area under the ROC curve) values. Our method yields an accuracy of 99.83% and an AUC of 1.0.
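
The two evaluation measures named above can be computed as in the following sketch, which uses plain Python and the pairwise (Mann-Whitney) formulation of AUC. The function names and the toy labels and scores are illustrative assumptions, not data from the paper.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """Fraction of apps whose predicted tag matches the true tag."""
    correct = sum(1 for l, p in zip(labels, predictions) if l == p)
    return correct / len(labels)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve.

    Equals the probability that a randomly chosen malware sample (label 1)
    receives a higher score than a randomly chosen benign sample (label 0),
    with ties counted as one half.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = malware, 0 = benign.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
predictions = [1 if s >= 0.5 else 0 for s in scores]  # threshold chosen for illustration

print(accuracy(labels, predictions), auc(labels, scores))
```

An AUC of 1.0, as reported in the abstract, means every malware sample is scored above every benign sample, even though accuracy at one particular threshold can still fall short of 100%.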

Citations: 1

DeepTLE: Learning Code-Level Features to Predict Code Performance before It Runs

2019 26th Asia-Pacific Software Engineering Conference (APSEC)

Pub Date : 2019-12-01 DOI: 10.1109/APSEC48747.2019.00042

Meiling Zhou, Jie Chen, Haiyang Hu, JiaCheng Yu, Zhongjin Li, Hua Hu

With the continuous expansion of the software market and the growing maturity of the software development process, the performance requirements of software users are becoming increasingly prominent. Performance issues are essentially related to the source code. To solve the same problem, different programmers may write completely different "correct" code that has the same functionality but different performance. Most online judge systems for programming use automated grading, which usually relies on test results to quantify the correctness and performance of the submitted source code. However, traditional dynamic testing takes a lot of time, and performance problems are usually discovered only after the fact, even for small-scale programs. Therefore, we propose DeepTLE, which predicts the performance of submitted source code before it runs. DeepTLE automatically learns the semantic and structural features of the source code. To verify the effectiveness of our approach, we applied it to source code collected from a programming competition website to predict, without running the test cases, whether the code would exceed the time limit. Experimental results show that our method saves 96% of the time cost compared with dynamic testing, and that the prediction accuracy reaches 82%.

Citations: 6

Semi-Automatic Repair of Over-Constrained Models for Combinatorial Robustness Testing

2019 26th Asia-Pacific Software Engineering Conference (APSEC)

Pub Date : 2019-12-01 DOI: 10.1109/APSEC48747.2019.00024

Konrad Fögen, H. Lichter

Combinatorial robustness testing is an approach that generates separate test inputs for positive and negative test scenarios. The test model is enriched with semantic information to distinguish valid from invalid values and value combinations. Unfortunately, it is easy to create over-constrained models, in which case some invalid values or invalid value combinations do not appear in the final test suite. In this paper, we extend previous work on manual repair and develop a technique to semi-automatically repair over-constrained models. The technique is evaluated with benchmark models, and the results indicate a small computational overhead.
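
The over-constraining problem can be made concrete with a small Python sketch. It only detects the symptom and does not reproduce the paper's repair technique. It rests on simplifying assumptions: a negative test contains exactly one invalid value and must satisfy every other constraint, and the function name, the model layout, and the toy parameters are all hypothetical.

```python
from itertools import product
from typing import Any, Callable, Dict, List, Tuple

Test = Dict[str, Any]


def missing_invalid_values(
    params: Dict[str, List[Any]],
    invalid: List[Tuple[str, Any]],
    constraints: List[Callable[[Test], bool]],
) -> List[Tuple[str, Any]]:
    """Return the invalid (parameter, value) pairs that no negative test can contain."""
    names = list(params)
    missing = []
    for name, value in invalid:
        found = False
        for combo in product(*(params[n] for n in names)):
            test = dict(zip(names, combo))
            if test[name] != value:
                continue
            # All other parameters must carry valid values ...
            others_valid = all((n, v) not in invalid
                               for n, v in test.items() if n != name)
            # ... and the remaining constraints of the model must still hold.
            if others_valid and all(c(test) for c in constraints):
                found = True
                break
        if not found:
            missing.append((name, value))
    return missing


# Toy model (hypothetical): the age constraint also rules out the invalid age -1.
params = {"age": [-1, 30], "country": ["DE", "??"]}
invalid = [("age", -1), ("country", "??")]
constraints = [lambda t: t["country"] != "DE" or t["age"] >= 18]

print(missing_invalid_values(params, invalid, constraints))
```

In this toy model the invalid age can only be paired with the valid country `"DE"`, but the constraint excludes that pairing, so the invalid age never reaches the test suite. That is the kind of conflict a repair step would have to relax.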

Citations: 3

A Cloud-Based Solution for Testing Applications' Compatibility and Portability on Fragmented Android Platform

2019 26th Asia-Pacific Software Engineering Conference (APSEC)

Pub Date : 2019-12-01 DOI: 10.1109/APSEC48747.2019.00030

Ammar Lanui, T. Chiew

Testing is a vital activity in software development. ISO/IEC has defined a standard for system and software quality models, ISO/IEC 25010:2011, that serves as a guideline and scope for testing any application. Testing mobile applications according to this standard, however, is more challenging than testing other types of software. For example, the diversity of Android devices and the many versions of the Android operating system have led to heavy fragmentation of the Android platform. This fragmentation hinders the testing of Android applications, especially with respect to portability and compatibility. Existing solutions either neglect portability and compatibility issues or lack the flexibility to fulfill the needs of different organizations. We propose a cloud testing model that addresses the fragmentation of the Android platform and provides automated application testing services on actual devices. The model can be configured in public, private or hybrid setups to suit individual organizations' needs and budgets. A prototype was built based on the model. Ten Android testers used the prototype and the Android Emulator to perform mobile application testing. Results show that the model has the potential to manage the challenging portability and compatibility testing on the Android platform in a flexible and scalable manner.

Citations: 4

A Language-Based Multi-View Approach for Combining Functional and Security Models

2019 26th Asia-Pacific Software Engineering Conference (APSEC)

Pub Date : 2019-12-01 DOI: 10.1109/APSEC48747.2019.00064

Hui Zhao, F. Mallet, L. Apvrille

Design flaws in and attacks on Cyber-Physical Systems (CPSs) can lead to severe consequences. Thus, security and safety (S&S) issues should be taken into account together with functional design as early as possible in the development process. However, a "one-size-fits-all" modeling language or design tool is rare. One way to solve this issue is to integrate models of different natures into one system model, but this requires unified semantics across modeling languages. We explore a model-based approach for systems engineering that facilitates the composition of several heterogeneous artifacts (called views) into a sound and consistent system model. Rather than trying to extend either SysML or SysML-sec into a more expressive language to add the missing features, we extract proper subsets of both languages to build a view adequate for conducting a security and safety analysis of Capella (SysML-based) functional models. Our language is generic enough to extract proper subsets of languages and combine them to build views for different experts. Moreover, it maintains global consistency between the different views.

Citations: 0

How Compact Will My System Be? A Fully-Automated Way to Calculate LoC Reduced by Clone Refactoring

2019 26th Asia-Pacific Software Engineering Conference (APSEC)

Pub Date : 2019-12-01 DOI: 10.1109/APSEC48747.2019.00046

Tasuku Nakagawa, Yoshiki Higo, Junnosuke Matsumoto, S. Kusumoto

A code clone (in short, clone) is a code fragment that is identical or similar to other code fragments in the source code. The presence of clones is known as a bad smell, that is, a symptom of source code that should be refactored. One motivation for refactoring (merging) clones is to reduce the size of the source code. An existing study proposed a technique to estimate the lines of code (LoC) reduced by merging clones; however, the existing technique has two issues: (1) it does not consider the refactorability of clones, even though some clones are difficult or even impossible to merge due to the limitations of programming languages; (2) when multiple clones overlap, it assumes that only one of them can be merged. Because of these issues, the estimated reducible LoC occasionally differs from the actual number. In this research, we therefore propose a new technique to calculate reducible LoC. The proposed technique is free from the two issues and calculates reducible LoC fully automatically. It repeats a loop of (a) detecting clones, (b) merging them, (c) compiling the edited source files, and (d) testing them. After the loop finishes, the reducible LoC is calculated from the edited source files. This paper also compares the proposed technique with the existing one. In the comparison, we confirmed that the reducible LoC calculated with refactorability taken into account is 25% of the reducible LoC estimated without it. We also confirmed that the proposed technique was able to merge clones that were not counted by the existing technique.
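
The detect, merge, compile, and test loop described above can be sketched in Python as follows. This is a toy model under stated assumptions, not the authors' tool: source code is a list of lines, `propose_merges` stands in for clone detection plus merging, `builds_and_passes` stands in for compilation plus testing, and all names are hypothetical.

```python
from typing import Callable, Iterable, List

Source = List[str]  # one entry per line of code


def reducible_loc(
    source: Source,
    propose_merges: Callable[[Source], Iterable[Source]],
    builds_and_passes: Callable[[Source], bool],
) -> int:
    """Keep applying merges that shrink the code and still build and pass tests."""
    current = list(source)
    progress = True
    while progress:
        progress = False
        for candidate in propose_merges(current):        # (a) detect, (b) merge
            if len(candidate) < len(current) and builds_and_passes(candidate):  # (c), (d)
                current = candidate
                progress = True
                break  # re-detect clones on the edited source
    return len(source) - len(current)


def toy_propose(src: Source) -> Iterable[Source]:
    """Toy 'merge': drop a line that repeats an earlier line."""
    seen = set()
    for i, line in enumerate(src):
        if line in seen:
            yield src[:i] + src[i + 1:]
        seen.add(line)


def toy_check(src: Source) -> bool:
    """Toy 'compile and test': both increments are needed, so that clone is not refactorable."""
    return src.count("x += 1") == 2


program = ["a = f()", "x += 1", "a = f()", "x += 1", "log(a)", "log(a)"]
print(reducible_loc(program, toy_propose, toy_check))        # with the refactorability check
print(reducible_loc(program, toy_propose, lambda s: True))   # ignoring refactorability
```

The second call, which accepts every merge, reports a larger reduction than the first. This mirrors the abstract's observation that estimates made without considering refactorability overstate the reducible LoC.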

Citations: 1

Neural Comment Generation for Source Code with Auxiliary Code Classification Task

2019 26th Asia-Pacific Software Engineering Conference (APSEC)

Pub Date : 2019-12-01 DOI: 10.1109/APSEC48747.2019.00076

Minghao Chen, Xiaojun Wan

Code comments help developers understand programs and read and navigate source code, resulting in more efficient software maintenance. Unfortunately, much code is not adequately commented, or its comments are missing altogether, so developers have to spend additional time reading source code. In this paper, we propose a new approach to automatically generating comments for source code. Following the intuition behind the traditional sequence-to-sequence (Seq2Seq) model for machine translation, we propose a tree-to-sequence (Tree2Seq) model for code comment generation, which leverages an encoder to capture the structural information of source code. More importantly, code classification is introduced as an auxiliary task to aid the Tree2Seq model, and we build a multi-task learning model to achieve this goal. We evaluate our models on a benchmark dataset with automatic metrics such as BLEU, ROUGE, and METEOR. Experimental results show that our proposed Tree2Seq model outperforms the traditional Seq2Seq model with attention, and that our proposed multi-task learning model outperforms the state-of-the-art approaches by a substantial margin.

Citations: 7

SoReady: An Extension of the Test and Defect Coverage-Based Analytics Model for Pull-Based Software Development

2019 26th Asia-Pacific Software Engineering Conference (APSEC)

Pub Date : 2019-12-01 DOI: 10.1109/APSEC48747.2019.00011

Sharifah Mashita Syed-Mohamad, Nur Asyraf Md Akhir

Pull-based software development is a distributed development model that offers an opportunity to review a pull request before it is merged into the main repository. A pull request addresses new features, bug fixes, and maintenance issues submitted by either integrators or contributors. Many empirical studies have examined how pull requests are evaluated but, to our knowledge, limited research exists on assessing the release readiness of pull requests. Studies have also reported that the failure rate of pull requests increases rapidly when many forks are created. Two questions are therefore worth exploring: does code review really contribute to code quality, and how can the release readiness of pull requests be determined? In our previous work, the test and defect coverage-based analytics model (TDCAM) was shown to be suitable for determining the release readiness of rapidly evolving software, a characteristic that pull-based software development shares. In this paper, TDCAM is extended to include pull request coverage indicators. The proposed model, named SoReady, and the visualization analysis presented herein enabled five developers in a commercial setting to make informed, evidence-based decisions about the test status of each pull request and the overall reliability of open-source software through a prototype dashboard.

Citations: 0