Pub Date: 2024-01-09 | DOI: 10.1007/s10515-023-00408-7
Donies Samet, Farah Barika Ktata, Khaled Ghedira
Security is a major challenge in mobile agent systems because of the strong mutual dependence between agents and the platform. According to recent studies, most current mobile agent platforms have significant security limitations when facing Denial of Service (DOS) attacks. Current security solutions, whether provided by the mobile agent platforms themselves or by the literature, focus essentially on individual attacks; they are mainly based on static models that lack precise permission definitions and are not detailed enough to counter collaborative DOS attacks executed by multiple agents or users. This paper presents a security framework that adds security defenses to mobile agent platforms. The proposed framework implements a standard security model described using MA-UML (Mobile Agent-Unified Modeling Language) notations. It lets the administrator of an agents' place define a precise, fine-grained authorization policy to defend against DOS attacks. Authorization enforcement in the proposed framework is dynamic: authorization decisions are based on run-time parameters such as the amount of activity of an agent. We conducted an experiment on a mobile agent system for e-marketplaces. Since we focus essentially on the availability criterion, the performance of the proposed framework on a place is evaluated against DOS and DDOS attacks and measured in terms of execution duration, i.e., the availability of the place.
Title: A security framework for mobile agent systems (Automated Software Engineering, vol. 31, no. 1)
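The dynamic authorization described above bases decisions on run-time parameters such as an agent's amount of activity. As a rough illustration only (the class name, quota values, and sliding-window policy below are hypothetical sketches, not the paper's actual model), such an activity-based check might look like:

```python
import time

class ActivityQuota:
    """Toy dynamic authorization check: deny an agent once its request
    rate inside a sliding time window exceeds a configured quota."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window = window_seconds
        self.history = {}  # agent_id -> timestamps of recent requests

    def authorize(self, agent_id, now=None):
        now = time.monotonic() if now is None else now
        events = self.history.setdefault(agent_id, [])
        # Keep only the requests that fall inside the sliding window.
        events[:] = [t for t in events if now - t < self.window]
        if len(events) >= self.max_requests:
            return False  # agent exceeds its activity quota: deny
        events.append(now)
        return True
```

Because the quota is evaluated per agent at every request, a place could throttle a single flooding agent without affecting well-behaved ones, which is the spirit of the run-time policy the abstract describes.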
Pub Date: 2024-01-08 | DOI: 10.1007/s10515-023-00410-z
Zeeshan Anwar, Hammad Afzal
Various development tools have been introduced, and the choice of a suitable tool depends on the particular context: the type of application to be developed, the development process, the application domain, and so on. The real challenge is to deliver new features at the right time with a faster development cycle, and selecting suitable development tools helps developers save time and effort. In this research, we explore software engineering repositories (such as StackOverflow) to collect feedback from developers about development tools: which features of a tool are most important, which features are missing, and which features require changes. The answers to these questions can be found by mining community question-answering (CQA) sites, and we use this user feedback to drive innovation of new features in development tools. Our research applies techniques from Big Data, Data Mining, Deep Learning, and Transformers, including the Generative Pre-trained Transformer. The major steps are (i) data collection from CQA sites such as StackOverflow, (ii) data preprocessing, (iii) categorizing the data into topics using topic modeling, (iv) sentiment analysis to extract positive and negative aspects of features, and (v) ranking of users and their feedback. The output of this research categorizes users' feedback into ideas that help organizations decide which features are required, which are not, which are difficult or confusing, and which new features should be introduced in a new release.
Title: Mining crowd sourcing repositories for open innovation in software engineering (Automated Software Engineering, vol. 31, no. 1)
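Steps (iv) and (v) of the pipeline above can be illustrated with a deliberately tiny sketch: keyword-based sentiment scoring of feedback comments, then ranking features by aggregate sentiment. The word lists and feature names are invented for illustration; the actual work uses deep learning and transformer models rather than keyword matching.

```python
# Hypothetical sentiment lexicons; a stand-in for a trained model.
POSITIVE = {"great", "fast", "useful", "love"}
NEGATIVE = {"slow", "confusing", "missing", "crash"}

def sentiment(text):
    """Net sentiment of a comment: positive hits minus negative hits."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def rank_features(feedback):
    """feedback: list of (feature, comment) pairs.
    Returns features sorted by aggregate sentiment, most criticized first,
    so maintainers see the most problematic features at the top."""
    scores = {}
    for feature, comment in feedback:
        scores[feature] = scores.get(feature, 0) + sentiment(comment)
    return sorted(scores, key=scores.get)
```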
Pub Date: 2024-01-02 | DOI: 10.1007/s10515-023-00412-x
Yun Li, Yanmei Zhang, Yanru Ding, Shujuan Jiang, Guan Yuan
Title: Correction to: A class integration test order generation approach based on Sarsa algorithm (Automated Software Engineering, vol. 31, no. 1)
Pub Date: 2023-12-27 | DOI: 10.1007/s10515-023-00405-w
Zohreh Mafi, Seyed-Hassan Mirian-Hosseinabadi
The large number of unit tests produced in test-driven development (TDD) and the iterative execution of these tests extend regression test execution time in TDD. This study aims to reduce that test execution time. We propose a TDD-based approach that creates traceable code elements and connects them to relevant test cases to support regression test selection during the TDD process. Our hybrid technique combines textual and syntactic program differences, exploiting the nature of TDD to select related test cases, and uses a change detection algorithm to detect program changes. We report our experience with a tool called RichTest, which implements this technique. To evaluate the work, seven TDD projects were developed. The results indicate that the RichTest plugin significantly decreases both the number of test executions and regression testing time, even when overhead time is taken into account. The selected test suite remains effective at fault detection because the selected test cases are related to the modified partitions; moreover, since the test cases cover the entire modified partitions, the selection algorithm is safe. The approach is designed specifically for the TDD method. Although the idea is applicable to any programming language, it is currently implemented as an Eclipse plugin for Java.
Title: Regression test selection in test-driven development (Automated Software Engineering, vol. 31, no. 1)
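The core of any selection scheme like the one above is a traceability map from code elements to the tests that exercise them, plus change detection. A minimal sketch (the element names and the hash-based diff are illustrative assumptions, not RichTest's actual text/syntax comparison):

```python
import hashlib

def changed_elements(old_version, new_version):
    """Detect added or modified elements by comparing body hashes;
    a crude stand-in for a combined textual/syntactic program diff."""
    old_hash = {name: hashlib.sha1(body.encode()).hexdigest()
                for name, body in old_version.items()}
    return [name for name, body in new_version.items()
            if old_hash.get(name) != hashlib.sha1(body.encode()).hexdigest()]

def select_tests(trace_map, changed):
    """Given a map from code elements to the test cases that exercise
    them, select only the tests affected by the changed elements."""
    selected = set()
    for element in changed:
        selected.update(trace_map.get(element, ()))
    return sorted(selected)
```

Selection is safe in the sense the abstract uses: every test linked to a modified element is selected, so no fault-revealing test for the change is skipped.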
Pub Date: 2023-12-21 | DOI: 10.1007/s10515-023-00407-8
Muneera Bano, Rashina Hoda, Didar Zowghi, Christoph Treude
The recent surge in the integration of Large Language Models (LLMs) like ChatGPT into qualitative research in software engineering, much like in other professional domains, demands a closer inspection. This vision paper seeks to explore the opportunities of using LLMs in qualitative research to address many of its legacy challenges as well as potential new concerns and pitfalls arising from the use of LLMs. We share our vision for the evolving role of the qualitative researcher in the age of LLMs and contemplate how they may utilize LLMs at various stages of their research experience.
Title: Large language models for qualitative research in software engineering: exploring opportunities and challenges (Automated Software Engineering, vol. 31, no. 1)
Class integration test order generation is a key step in integration testing; studying this problem can help find unknown bugs and improve the efficiency of software testing. The challenge is to order the classes to be integrated so as to minimize the cost of the required stubs, a requirement that existing approaches for generating class integration test orders do not satisfy well. Given the excellent performance of reinforcement learning on sequential decision problems, this paper proposes a class integration test order generation approach based on the Sarsa algorithm, a data-driven, model-free reinforcement learning algorithm. The approach uses stubbing complexity as the indicator of stubbing cost and uses it to measure the quality of a class integration test order. The Sarsa algorithm is used to train the agent, and three indicators (test return, dependency complexity, and the number of cycles) are combined in the design of the reward function to evaluate the merit of the current action. By recording an action path of the agent from its initial state to its termination state, a class integration test order is obtained.
Title: A class integration test order generation approach based on Sarsa algorithm (Automated Software Engineering, vol. 31, no. 1)
Authors: Yun Li, Yanmei Zhang, Yanru Ding, Shujuan Jiang, Guan Yuan
Pub Date: 2023-12-13 | DOI: 10.1007/s10515-023-00406-9
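The on-policy Sarsa rule at the heart of such an approach is compact. In the sketch below, a state could be the tuple of classes integrated so far and an action the next class to integrate; the reward shown is simply a negative stub cost, a simplified stand-in for the paper's composite reward of test return, dependency complexity, and cycle count.

```python
def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """One on-policy Sarsa step:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * Q(s', a') - Q(s, a)).
    q is a dict mapping (state, action) pairs to value estimates."""
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (r + gamma * q.get((s_next, a_next), 0.0) - old)
```

Repeatedly running episodes with this update and recording the agent's action path from the empty integration state to termination yields a candidate class integration test order.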
Pub Date: 2023-12-06 | DOI: 10.1007/s10515-023-00404-x
Xingzi Yu, Wei Tang, Tianlei Xiong, Wengang Chen, Jie He, Bin Yang, Zhengwei Qi
The lack of flexibility and safety in C language development has long been criticized for harming both the development cycle and software quality in the embedded systems domain. TypeScript, an optionally-typed dynamic language, offers the flexibility and safety that developers want. With the advancement of Ahead-of-Time (AOT) compilation technologies for TypeScript and JavaScript, it has become feasible to write embedded applications in TypeScript. Although writing AOT-compiled programs in TypeScript is now possible, implementing a compiler toolchain for this purpose requires substantial effort. To simplify the design of languages and compilers, this paper presents a new compiler toolchain design methodology called TS^-, which advocates generating target intermediate-language code (such as C) from TypeScript rather than constructing higher-level compiler tools and type systems on top of the intermediate language. TS^- not only simplifies the design of the system but also gives developers a quasi-native TypeScript development experience. The paper also presents Ts2Wasm, a prototype that implements TS^- and compiles a subset of TypeScript to WebAssembly (WASM). The tests in the TypeScript repository show that Ts2Wasm supports 3.8× as many features as the intermediate language (AssemblyScript). Regarding performance, Ts2Wasm offers a significant speed-up of 1.4× to 19× and, in most cases, imposes over 65% less memory overhead than Node.js.
Title: Enhancing embedded systems development with TS^- (Automated Software Engineering, vol. 31, no. 1)
Pub Date: 2023-11-21 | DOI: 10.1007/s10515-023-00403-y
Christina Antοniou, Nick Bassiliades
The most popular medium for specifying requirements is natural language; its disadvantage is ambiguity. Boilerplates are syntactic patterns that limit the ambiguity introduced by using natural language to specify system/software requirements, and they are also a useful aid for inexperienced engineers defining requirements. Linguistic boilerplates constrain natural language syntactically, while a domain-specific ontology constrains requirements semantically: in requirements specification, an ontology restricts the vocabulary to entities, properties, and property relationships that are semantically related, which helps authors avoid or at least make fewer mistakes. This work combines boilerplates with an ontology. Usually, the attributes of boilerplates are completed with the help of the ontology; the contribution of this paper is that the whole boilerplate is stored in the ontology, with both its attributes and its fixed elements being part of the ontology. This combination supports semantically and syntactically correct requirement construction. The paper proposes a tool based on a domain-specific ontology and a set of predefined generic linguistic boilerplates for requirements engineering. We created a domain-specific ontology and a minimal set of boilerplates for an ATM (Automated Teller Machine) and carried out an experiment, in the form of a user-evaluated case study for the ATM domain, to obtain evidence of the effectiveness and efficiency of our method. The novelty of our methodology is a requirements-definition tool that integrates boilerplate templates with an ontology: we exploit the boilerplate language syntax, mapping boilerplates to Resource Description Framework triples, which also have a linguistic nature.
Title: Α tool for requirements engineering using ontologies and boilerplates (Automated Software Engineering, vol. 31, no. 1)
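The interplay of boilerplate and ontology described above can be sketched in a few lines: a template with typed slots is filled only with values the ontology admits, so an ill-typed requirement is rejected before it is written. The boilerplate wording, slot names, and ontology entries below are invented for illustration, not taken from the paper's ATM ontology.

```python
import re

# Hypothetical mini-ontology for an ATM domain: slot type -> allowed values.
ONTOLOGY = {
    "system": {"ATM"},
    "action": {"eject", "validate", "print"},
    "entity": {"card", "PIN", "receipt"},
}

# Hypothetical boilerplate with typed slots in angle brackets.
BOILERPLATE = "The <system> shall <action> the <entity>."

def instantiate(boilerplate, values, ontology):
    """Fill boilerplate slots, rejecting any value outside the ontology."""
    def fill(match):
        slot = match.group(1)
        value = values[slot]
        if value not in ontology[slot]:
            raise ValueError(f"{value!r} is not a valid {slot} in the ontology")
        return value
    return re.sub(r"<(\w+)>", fill, boilerplate)
```

Here the boilerplate fixes the syntax while the ontology fixes the semantics, which is the combination the paper's tool builds on.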
Pub Date: 2023-11-21 | DOI: 10.1007/s10515-023-00402-z
Rong Cao, Liang Bao, Panpan Zhangsun, Chase Wu, Shouxin Wei, Ren Sun, Ran Li, Zhe Zhang
As software systems become increasingly large and complex, automated parameter tuning of software systems (PTSS) has become a focus of research, and many tuning algorithms have been proposed recently. However, due to the lack of a unified platform for comparing and reproducing existing tuning algorithms, choosing an appropriate algorithm for a given software system remains a significant challenge, for several reasons: diverse experimental conditions, a lack of evaluations across different tasks, and the excessive evaluation cost of tuning algorithms. In this paper, we propose an extensible and efficient benchmark, PTSSBench, which provides a unified platform for comparative studies of different tuning algorithms via surrogate models and actual systems. We demonstrate the usability and efficiency of PTSSBench through comparative experiments with six state-of-the-art tuning algorithms, from both a holistic and a task-oriented perspective. The experimental results show the necessity and effectiveness of parameter tuning for software systems and indicate that the PTSS problem remains open. Moreover, PTSSBench supports extensive runs and in-depth analyses of parameter tuning algorithms, providing an efficient and effective way for researchers to develop new tuning algorithms and for users to choose appropriate tuning algorithms for their systems.
{"title":"PTSSBench: a performance evaluation platform in support of automated parameter tuning of software systems","authors":"Rong Cao, Liang Bao, Panpan Zhangsun, Chase Wu, Shouxin Wei, Ren Sun, Ran Li, Zhe Zhang","doi":"10.1007/s10515-023-00402-z","DOIUrl":"10.1007/s10515-023-00402-z","url":null,"abstract":"<div><p>As software systems become increasingly large and complex, automated parameter tuning of software systems (PTSS) has been the focus of research and many tuning algorithms have been proposed recently. However, due to the lack of a unified platform for comparing and reproducing existing tuning algorithms, it remains a significant challenge for a user to choose an appropriate algorithm for a given software system. There are multiple reasons for this challenge, including diverse experimental conditions, lack of evaluations for different tasks, and excessive evaluation costs of tuning algorithms. In this paper, we propose an extensible and efficient benchmark, referred to as PTSSBench, which provides a unified platform for supporting a comparative study of different tuning algorithms via surrogate models and actual systems. We demonstrate the usability and efficiency of PTSSBench through comparative experiments of six state-of-the-art tuning algorithms from a holistic perspective and a task-oriented perspective. The experimental results show the necessity and effectiveness of parameter tuning for software systems and indicate that the PTSS problem remains an open problem. Moreover, PTSSBench allows extensive runs and in-depth analyses of parameter tuning algorithms, hence providing an efficient and effective way for researchers to develop new tuning algorithms and for users to choose appropriate tuning algorithms for their systems. 
The proposed PTSSBench benchmark together with the experimental results is made publicly available online as an open-source project.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2023-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138431481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
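To make the surrogate-model idea concrete: a surrogate replaces costly runs of the real system with a cheap learned (or, here, hand-written) performance predictor, so different tuning algorithms can be compared under the same budget. The sketch below is purely illustrative and is not the PTSSBench API; the parameter names, the surrogate formula, and the two toy tuners are all assumptions made for the example.

```python
# Illustrative sketch (not the actual PTSSBench interface): comparing two
# simple tuners on a hand-written surrogate performance model, the kind of
# cheap evaluation a surrogate-based benchmark enables.
import random


def surrogate_latency(cache_mb, threads):
    """Assumed surrogate: predicts latency (ms) from two parameters,
    with its minimum of 10.0 ms near cache_mb=512, threads=8."""
    return 0.00002 * (cache_mb - 512) ** 2 + 0.5 * (threads - 8) ** 2 + 10.0


def random_search(budget, rng):
    """Sample configurations uniformly at random; keep the best latency."""
    best = float("inf")
    for _ in range(budget):
        cfg = (rng.uniform(0, 1024), rng.randint(1, 16))
        best = min(best, surrogate_latency(*cfg))
    return best


def grid_search(budget):
    """Evaluate an evenly spaced grid of configurations of the same size."""
    best = float("inf")
    steps = int(budget ** 0.5)
    for i in range(steps):
        for j in range(steps):
            cache = 1024 * i / max(steps - 1, 1)
            threads = 1 + round(15 * j / max(steps - 1, 1))
            best = min(best, surrogate_latency(cache, threads))
    return best


rng = random.Random(0)  # fixed seed for reproducible comparison
results = {
    "random": random_search(100, rng),
    "grid": grid_search(100),
}
```

Because every evaluation is a function call rather than a full system run, both tuners can be given identical budgets and re-run many times, which is what enables the "extensive runs and in-depth analyses" the abstract mentions.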
Pub Date : 2023-11-19 DOI: 10.1007/s10515-023-00401-0
Di Wu, Yang Feng, Hongyu Zhang, Baowen Xu
API tutorials are crucial resources as they often provide detailed explanations of how to utilize APIs. Typically, an API tutorial is segmented into a number of consecutive fragments. If a fragment explains API usage, we regard it as a relevant fragment of the API. Recognizing relevant fragments can aid developers in comprehending, learning, and using APIs. Recently, some studies have presented approaches for recognizing relevant fragments, which mainly focus on using API tutorials or Stack Overflow to train the recognition model. API references are also important API learning resources as they contain abundant API knowledge. Considering the similarity between API tutorials and API references (both provide API knowledge), we believe that API knowledge from API references could help recognize relevant tutorial fragments of APIs effectively. However, it is non-trivial to leverage API references to build a supervised-learning-based recognition model: two major problems are the lack of labeled API references and the unavailability of engineered features for API references. We propose a supervised-learning-based approach named RRTR (Recognize Relevant Tutorial fragments using API References) to address these problems. For the lack of labeled API references, RRTR designs heuristic rules to automatically collect relevant and irrelevant API references for APIs. For the unavailable engineered features, we adopt the pre-trained SBERT model (SBERT stands for Sentence-BERT) to automatically learn semantic features for API references. More specifically, we first automatically generate labeled ⟨API, ARE⟩ pairs (ARE stands for an API reference) via our heuristic rules over API references. We then use SBERT to automatically learn semantic features for the collected pairs and train a supervised-learning-based recognition model.
Finally, we can recognize the relevant tutorial fragments of APIs based on the trained model. To evaluate the effectiveness of RRTR, we collected Java and Android API reference datasets containing a total of 20,680 labeled ⟨API, ARE⟩ pairs. Experimental results demonstrate that RRTR outperforms state-of-the-art approaches in terms of F-Measure on both datasets. In addition, we conducted a user study to investigate the practicality of RRTR, and the results further illustrate its effectiveness in practice. In summary, RRTR can effectively recognize relevant tutorial fragments of APIs using API references by solving the problems of missing labels and missing engineered features for API references.
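The pipeline the abstract describes (heuristic labeling of ⟨API, ARE⟩ pairs, then similarity over learned features) can be sketched as follows. This is a toy illustration, not the paper's implementation: the heuristic rule and the example APIs and references are invented for the sketch, and a bag-of-words cosine similarity stands in for the SBERT embeddings and the trained classifier.

```python
# Hypothetical sketch of an RRTR-style pipeline. A bag-of-words vector is a
# stand-in for SBERT sentence embeddings; the labeling rule is assumed.
from collections import Counter
import math


def embed(text):
    """Toy stand-in for SBERT: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def heuristic_label(api, reference):
    """Assumed heuristic rule: a reference is relevant to an API
    if it mentions the API's simple name."""
    simple_name = api.split(".")[-1].lower()
    return 1 if simple_name in reference.lower() else 0


# Step 1: automatically generate labeled <API, ARE> pairs.
apis = ["java.util.ArrayList", "java.io.File"]
refs = [
    "ArrayList is a resizable-array implementation of the List interface.",
    "An abstract representation of file and directory pathnames.",
]
pairs = [(a, r, heuristic_label(a, r)) for a in apis for r in refs]

# Step 2 (stand-in for the trained model): a tutorial fragment is scored
# against each API by its similarity to that API's relevant references.
fragment = "Use ArrayList when you need fast random access to a growing list."
scores = {
    a: max(cosine(embed(fragment), embed(r))
           for a2, r, y in pairs if a2 == a and y == 1)
    for a in apis
}
```

In the paper's actual approach, the bag-of-words featurizer would be replaced by SBERT embeddings and the similarity ranking by a supervised classifier trained on the auto-labeled pairs.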
{"title":"Automatic recognizing relevant fragments of APIs using API references","authors":"Di Wu, Yang Feng, Hongyu Zhang, Baowen Xu","doi":"10.1007/s10515-023-00401-0","DOIUrl":"10.1007/s10515-023-00401-0","url":null,"abstract":"<div><p>API tutorials are crucial resources as they often provide detailed explanations of how to utilize APIs. Typically, an API tutorial is segmented into a number of consecutive fragments.. If a fragment explains API usage, we regard it as a relevant fragment of the API. Recognizing relevant fragments can aid developers in comprehending, learning, and using APIs. Recently, some studies have presented relevant fragments recognition approaches, which mainly focused on using API tutorials or Stack Overflow to train the recognition model. API references are also important API learning resources as they contain abundant API knowledge. Considering the similarity between API tutorials and API references (both provide API knowledge), we believe that using API knowledge from API references could help recognize relevant tutorial fragments of APIs effectively. However, it is non-trivial to leverage API references to build a supervised learning-based recognition model. Two major problems are the lack of labeled API references and the unavailability of engineered features of API references. We propose a supervised learning based approach named RRTR (which stands for Recognize Relevant Tutorial fragments using API References) to address the above problems. For the problem of lacking labeled API references, RRTR designs heuristic rules to automatically collect relevant and irrelevant API references for APIs. Regarding the unavailable engineered features issue, we adopt the pre-trained SBERT model (SBERT stands for Sentence-BERT) to automatically learn semantic features for API references. 
More specifically, we first automatically generate labeled <span>(leftlangle API, ARE rightrangle)</span> pairs (ARE stands for an API reference) via our heuristic rules of API references. We then use SBERT to automatically learn semantic features for the collected pairs and train a supervised learning based recognition model. Finally, we can recognize the relevant tutorial fragments of APIs based on the trained model. To evaluate the effectiveness of RRTR, we collected Java and Android API reference datasets containing a total of 20,680 labeled <span>(leftlangle API, ARE rightrangle)</span> pairs. Experimental results demonstrate that RRTR outperforms state-of-the-art approaches in terms of F-Measure on two datasets. In addition, we conducted a user study to investigate the practicality of RRTR and the results further illustrate the effectiveness of RRTR in practice. The proposed RRTR approach can effectively recognize relevant fragments of APIs with API references by solving the problems of lacking labeled API references and engineered features of API references.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2023-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138138587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}