Software process simulation (SPS) has become an effective tool for software process management and improvement. However, its adoption in industry falls short of the research community's expectations due to the burden of measurement cost and the high demand for domain knowledge. Extracting appropriate metrics from real process-enactment data is one of the major challenges. We aim to provide evidence-based support for the process metrics used in software process (simulation) modeling. A systematic literature review was performed, extending our previous review series, to draw a comprehensive understanding of the metrics for process modeling, following our proposed ontology of metrics in SPS. We identified 131 process modeling studies that collectively involve 1975 raw metrics and classified these metrics into 21 categories using a coding technique. We found that external product and process metrics are used infrequently in SPS modeling, whereas external resource metrics are widely used. We also analyzed the causal relationships between metrics and found that the models exhibit significant diversity: no pairwise relationship between metrics appears in more than 10% of SPS models. In addition, we identified 17 data issues that may be encountered in measurement, together with 10 coping strategies. The results of this study provide process modelers with an evidence-based reference for identifying and using metrics in SPS modeling and further contribute to the body of knowledge on software metrics in the context of process modeling. Furthermore, this study is not limited to process simulation and can be extended to software process modeling in general. Taking simulation metrics as standards and references can further motivate and guide software developers to improve the collection, governance, and application of process data in practice.
This paper presents a procedure for, and an evaluation of, using a semantic similarity metric as a loss function for neural source code summarization. Code summarization is the task of writing natural language descriptions of source code. Neural code summarization refers to automated techniques for generating these descriptions using neural networks. Almost all current approaches involve neural networks, either as standalone models or as part of pretrained large language models such as GPT, Codex, and LLaMA. Yet almost all also use a categorical cross-entropy (CCE) loss function for network optimization. Two problems with CCE are that (1) it computes loss over each word prediction one at a time, rather than evaluating a whole sentence, and (2) it requires a perfect prediction, leaving no room for partial credit for synonyms. In this paper, we extend our previous work on semantic similarity metrics to show a procedure for using semantic similarity as a loss function that alleviates these problems, and we evaluate this procedure in several settings in both metrics-driven and human studies. In essence, we propose to use a semantic similarity metric to calculate loss over the whole output sentence prediction per training batch, rather than just the loss for each word. We also propose to combine our loss with CCE for each word, which streamlines the training process compared to baselines. We evaluate our approach against several baselines and report improvement in the vast majority of conditions.
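To make the idea concrete, the following is a minimal sketch of one plausible way to blend per-word CCE with a whole-sentence semantic similarity signal in PyTorch; it is an illustration under stated assumptions, not the authors' exact implementation. The helpers `embed_sentences` (a frozen sentence-embedding model) and `tokenizer.decode` are hypothetical stand-ins for whatever encoder and vocabulary the training pipeline provides, and the blending weight `alpha` is likewise illustrative.

```python
import torch
import torch.nn.functional as F

def semantic_similarity_loss(logits, targets, ref_texts, tokenizer,
                             embed_sentences, pad_id=0, alpha=0.5):
    """logits: (batch, seq_len, vocab); targets: (batch, seq_len).
    ref_texts: reference summaries as strings, one per sample."""
    # 1. Standard per-word CCE, averaged per sample (padding ignored).
    cce = F.cross_entropy(logits.transpose(1, 2), targets,
                          ignore_index=pad_id, reduction="none")   # (B, T)
    mask = (targets != pad_id).float()
    cce_per_sample = (cce * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1)

    # 2. Decode the batch's greedy predictions. These are detached: the
    #    similarity score reweights the loss rather than carrying gradients.
    pred_ids = logits.argmax(dim=-1).detach()
    pred_texts = [tokenizer.decode(ids) for ids in pred_ids]  # hypothetical decode

    # 3. Whole-sentence semantic similarity between each prediction and
    #    its reference, via cosine similarity of frozen sentence embeddings.
    with torch.no_grad():
        pred_emb = embed_sentences(pred_texts)                 # (B, D)
        ref_emb = embed_sentences(ref_texts)                   # (B, D)
        sim = F.cosine_similarity(pred_emb, ref_emb, dim=-1)   # in [-1, 1]

    # 4. Blend: samples whose full-sentence prediction is already
    #    semantically close to the reference contribute less loss.
    weight = alpha + (1.0 - alpha) * (1.0 - sim.clamp(0, 1))
    return (weight * cce_per_sample).mean()
```

Because the decoded predictions are detached, the sentence-level score acts as a per-sample reweighting of the differentiable CCE term rather than a separate gradient path; this mirrors the abstract's description of computing similarity over whole output sentences per training batch while keeping CCE for each word.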
Testing is one of the most time-consuming and unpredictable processes within the software development life cycle. As a result, many test case optimization (TCO) techniques have been proposed to make this process more scalable. The Object Constraint Language (OCL) was initially introduced as a constraint language to provide additional details for Unified Modeling Language models. However, as OCL continues to evolve, an increasing number of systems are being expressed in this language. Despite this growth, a noticeable research gap exists in the testing of systems whose specifications are expressed in OCL. In our previous work, we verified the effectiveness and efficiency of performing the test case prioritization (TCP) process for these systems. In this study, we extend that work by integrating the test case minimization (TCM) process to determine whether TCM can also benefit the testing process in the context of OCL. The evaluation of TCO approaches often relies on well-established metrics such as the average percentage of fault detection (APFD). However, APFD is not ideally suited to model-based testing (MBT). This paper addresses this limitation by proposing a modification to the APFD metric that enhances its viability for MBT scenarios. We conducted four case studies to evaluate the feasibility of integrating the TCM and TCP processes into our proposed approach. In these studies, we applied the multi-objective optimization algorithm NSGA-II and a genetic algorithm independently to the TCM and TCP processes, with the objective of assessing the effectiveness and efficiency of combining TCM and TCP to enhance the testing phase. Through experimental analysis, the results highlight the benefits of integrating TCM and TCP in the context of OCL-based testing, providing valuable insights for practitioners and researchers aiming to optimize their testing efforts. Specifically, the main contributions of this work are as follows: (1) we introduce the integration of the TCM process into the TCO process for systems expressed in OCL, which further benefits the testing process by reducing redundant test cases while ensuring sufficient coverage; (2) we comprehensively analyze the limitations of the commonly used APFD metric and propose a modified version of APFD to overcome these weaknesses; and (3) we systematically evaluate the effectiveness and efficiency of OCL-based TCO processes on four real-world case studies of differing complexity.
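For background, the discussion of APFD above presumes the standard formulation: for a prioritized suite of n test cases detecting m faults, APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n), where TF_i is the position in the prioritized order of the first test case that reveals fault i. The sketch below computes this classical definition; the modified variant proposed in the paper is not detailed in the abstract, so it is not reproduced here.

```python
def apfd(first_fail_positions, n_tests):
    """Standard APFD for a prioritized test suite.

    first_fail_positions: 1-based position, in the prioritized order, of the
    first test case that detects each fault (one entry per fault).
    n_tests: total number of test cases in the suite.
    """
    m = len(first_fail_positions)
    return 1.0 - sum(first_fail_positions) / (n_tests * m) + 1.0 / (2 * n_tests)

# Example: a suite of 5 tests whose 3 faults are first detected by the
# tests at positions 1, 2, and 4 yields an APFD of about 0.633.
print(apfd([1, 2, 4], n_tests=5))
```

Higher values indicate that faults are detected earlier in the prioritized order, which is why the metric is a natural target for TCP evaluation and why its known weaknesses in MBT scenarios motivate the modification discussed in the abstract.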