Software process simulation (SPS) has become an effective tool for software process management and improvement. However, its adoption in industry falls short of the research community's expectations due to the burden of measurement cost and the high demand for domain knowledge. Extracting appropriate metrics from real process-enactment data is one of the major challenges. We aim to provide evidence-based support for the process metrics used in software process (simulation) modeling. A systematic literature review was performed, extending our previous review series, to draw a comprehensive understanding of the metrics for process modeling, following our proposed ontology of metrics in SPS. We identified 131 process modeling studies that collectively involve 1975 raw metrics and classified these metrics into 21 categories using a coding technique. We found that external product and process metrics are used infrequently in SPS modeling, whereas external resource metrics are widely used. We also analyzed the causal relationships between metrics and found that the models exhibit significant diversity: no pairwise relationship between metrics appears in more than 10% of SPS models. In addition, we identified 17 data issues that may be encountered in measurement, together with 10 coping strategies. The results of this study provide process modelers with an evidence-based reference for identifying and using metrics in SPS modeling and further contribute to the body of knowledge on software metrics in the context of process modeling. Furthermore, this study is not limited to process simulation and can be extended to software process modeling in general. Taking simulation metrics as standards and references can further motivate and guide software developers to improve the collection, governance, and application of process data in practice.
This paper presents a procedure for, and an evaluation of, using a semantic similarity metric as a loss function for neural source code summarization. Code summarization is the task of writing natural language descriptions of source code. Neural code summarization refers to automated techniques for generating these descriptions using neural networks. Almost all current approaches involve neural networks, either as standalone models or as part of pretrained large language models such as GPT, Codex, and LLaMA. Yet almost all also use a categorical cross-entropy (CCE) loss function for network optimization. Two problems with CCE are that (1) it computes loss over each word prediction one at a time, rather than evaluating a whole sentence, and (2) it requires a perfect prediction, leaving no room for partial credit for synonyms. In this paper, we extend our previous work on semantic similarity metrics to show a procedure for using semantic similarity as a loss function that alleviates these problems, and we evaluate this procedure in several settings in both metrics-driven and human studies. In essence, we propose to use a semantic similarity metric to calculate loss over the whole output sentence prediction per training batch, rather than just the loss for each word. We also propose to combine our loss with CCE for each word, which streamlines the training process compared to baselines. We evaluate our approach against several baselines and report improvement in the vast majority of conditions.
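To make the idea concrete, the following is a minimal sketch of one plausible way to blend per-word CCE with a whole-sentence semantic similarity signal in PyTorch; it is an illustration under stated assumptions, not the authors' exact implementation. The helpers `embed_sentences` (a frozen sentence-embedding model) and `tokenizer.decode` are hypothetical stand-ins for whatever encoder and vocabulary the training pipeline provides, and the blending weight `alpha` is likewise illustrative.

```python
import torch
import torch.nn.functional as F

def semantic_similarity_loss(logits, targets, ref_texts, tokenizer,
                             embed_sentences, pad_id=0, alpha=0.5):
    """logits: (batch, seq_len, vocab); targets: (batch, seq_len).
    ref_texts: reference summaries as strings, one per sample."""
    # 1. Standard per-word CCE, averaged per sample (padding ignored).
    cce = F.cross_entropy(logits.transpose(1, 2), targets,
                          ignore_index=pad_id, reduction="none")   # (B, T)
    mask = (targets != pad_id).float()
    cce_per_sample = (cce * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1)

    # 2. Decode the batch's greedy predictions. These are detached: the
    #    similarity score reweights the loss rather than carrying gradients.
    pred_ids = logits.argmax(dim=-1).detach()
    pred_texts = [tokenizer.decode(ids) for ids in pred_ids]  # hypothetical decode

    # 3. Whole-sentence semantic similarity between each prediction and
    #    its reference, via cosine similarity of frozen sentence embeddings.
    with torch.no_grad():
        pred_emb = embed_sentences(pred_texts)                 # (B, D)
        ref_emb = embed_sentences(ref_texts)                   # (B, D)
        sim = F.cosine_similarity(pred_emb, ref_emb, dim=-1)   # in [-1, 1]

    # 4. Blend: samples whose full-sentence prediction is already
    #    semantically close to the reference contribute less loss.
    weight = alpha + (1.0 - alpha) * (1.0 - sim.clamp(0, 1))
    return (weight * cce_per_sample).mean()
```

Because the decoded predictions are detached, the sentence-level score acts as a per-sample reweighting of the differentiable CCE term rather than a separate gradient path; this mirrors the abstract's description of computing similarity over whole output sentences per training batch while keeping CCE for each word.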
Testing is one of the most time-consuming and unpredictable processes within the software development life cycle. As a result, many test case optimization (TCO) techniques have been proposed to make this process more scalable. The Object Constraint Language (OCL) was initially introduced as a constraint language to provide additional details for Unified Modeling Language models. However, as OCL continues to evolve, an increasing number of systems are being expressed in this language. Despite this growth, a noticeable research gap exists in the testing of systems whose specifications are expressed in OCL. In our previous work, we verified the effectiveness and efficiency of performing the test case prioritization (TCP) process for these systems. In this study, we extend that work by integrating the test case minimization (TCM) process to determine whether TCM can also benefit the testing process in the context of OCL. The evaluation of TCO approaches often relies on well-established metrics such as the average percentage of fault detection (APFD). However, APFD is not ideally suited to model-based testing (MBT). This paper addresses this limitation by proposing a modification to the APFD metric that enhances its viability for MBT scenarios. We conducted four case studies to evaluate the feasibility of integrating the TCM and TCP processes into our proposed approach. In these studies, we applied the multi-objective optimization algorithm NSGA-II and a genetic algorithm independently to the TCM and TCP processes, with the objective of assessing the effectiveness and efficiency of combining TCM and TCP to enhance the testing phase. Through experimental analysis, the results highlight the benefits of integrating TCM and TCP in the context of OCL-based testing, providing valuable insights for practitioners and researchers aiming to optimize their testing efforts. Specifically, the main contributions of this work are as follows: (1) we introduce the integration of the TCM process into the TCO process for systems expressed in OCL, which further benefits the testing process by reducing redundant test cases while ensuring sufficient coverage; (2) we comprehensively analyze the limitations of the commonly used APFD metric and propose a modified version of APFD to overcome these weaknesses; and (3) we systematically evaluate the effectiveness and efficiency of OCL-based TCO processes on four real-world case studies of differing complexity.
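For background, the discussion of APFD above presumes the standard formulation: for a prioritized suite of n test cases detecting m faults, APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n), where TF_i is the position in the prioritized order of the first test case that reveals fault i. The sketch below computes this classical definition; the modified variant proposed in the paper is not detailed in the abstract, so it is not reproduced here.

```python
def apfd(first_fail_positions, n_tests):
    """Standard APFD for a prioritized test suite.

    first_fail_positions: 1-based position, in the prioritized order, of the
    first test case that detects each fault (one entry per fault).
    n_tests: total number of test cases in the suite.
    """
    m = len(first_fail_positions)
    return 1.0 - sum(first_fail_positions) / (n_tests * m) + 1.0 / (2 * n_tests)

# Example: a suite of 5 tests whose 3 faults are first detected by the
# tests at positions 1, 2, and 4 yields an APFD of about 0.633.
print(apfd([1, 2, 4], n_tests=5))
```

Higher values indicate that faults are detected earlier in the prioritized order, which is why the metric is a natural target for TCP evaluation and why its known weaknesses in MBT scenarios motivate the modification discussed in the abstract.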