Test case prioritization techniques improve the fault detection rate by adjusting the execution order of test cases. Existing static black-box test case prioritization methods generally do so by increasing the early diversity of execution sequences based on string distance, but they incur high time overhead and are less stable. This paper proposes a novel test case prioritization method, DC-TCP, based on density-based spatial clustering of applications with noise (DBSCAN) and a combination strategy. The combination strategy models the test inputs to generate a mapping model that maps them to consistent types, improving generality; DBSCAN is then used to further refine the classification of test cases; and finally, a firefly search strategy is introduced to improve the effectiveness of sequence merging. Extensive experimental results demonstrate that DC-TCP outperforms several existing static black-box prioritization methods in terms of the average percentage of faults detected and offers advantages in time efficiency.
{"title":"Exploiting DBSCAN and Combination Strategy to Prioritize the Test Suite in Regression Testing","authors":"Zikang Zhang, Jinfu Chen, Yuechao Gu, Zhehao Li, Rexford Nii Ayitey Sosu","doi":"10.1049/2024/9942959","DOIUrl":"10.1049/2024/9942959","url":null,"abstract":"<p>Test case prioritization techniques improve the fault detection rate by adjusting the execution sequence of test cases. For static black-box test case prioritization techniques, existing methods generally improve the fault detection rate by increasing the early diversity of execution sequences based on string distance differences. However, such methods have a high time overhead and are less stable. This paper proposes a novel test case prioritization method (DC-TCP) based on density-based spatial clustering of applications with noise (DBSCAN) and combination policies. By introducing a combination strategy to model the inputs to generate a mapping model, the test inputs are mapped to consistent types to improve generality. The DBSCAN method is then used to refine the classification of test cases further, and finally, the Firefly search strategy is introduced to improve the effectiveness of sequence merging. Extensive experimental results demonstrate that the proposed DC-TCP method outperforms other methods in terms of the average percentage of faults detected and exhibits advantages in terms of time efficiency when compared to several existing static black-box sorting methods.</p>","PeriodicalId":50378,"journal":{"name":"IET Software","volume":"2024 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/2024/9942959","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141096405","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the continuous advancement of autonomous driving technology, visual analysis techniques have become a prominent research topic. The data generated by autonomous driving are large-scale and time-varying, and existing visual analytics methods are not sufficient to deal with such complex data effectively. Time-varying graphs can model and visualize the dynamic relationships in complex systems and can visually describe data trends in autonomous driving systems. To this end, this paper introduces a time-varying graph-based method for visual analysis in autonomous driving. The proposed method employs a graph structure to represent the relative positional relationships between the target and interfering obstacles. By incorporating the time dimension, a time-varying graph model is constructed. The method examines how node characteristics in the graph change across time instances, establishing feature expressions that differentiate target and obstacle motion patterns. The analysis demonstrates that eigenvector centrality in the time-varying graph effectively captures the distinctions in motion patterns between targets and obstacles, and these features can be used for accurate target and obstacle recognition. To evaluate the proposed method, a comparative study is conducted against a traditional visual analysis method, frame differencing, and an advanced method, visual lidar odometry and mapping. Robustness, accuracy, and resource-consumption experiments on the publicly available KITTI dataset are used to analyze and compare the three methods. The experimental results show that the proposed time-varying graph-based method exhibits superior accuracy and robustness. This study offers insights and solution ideas for the deep integration of intelligent networked vehicles with intelligent transportation and provides a reference for advancing intelligent transportation systems and their integration with autonomous driving technologies.
{"title":"An Expository Examination of Temporally Evolving Graph-Based Approaches for the Visual Investigation of Autonomous Driving","authors":"Li Wan, Wenzhi Cheng","doi":"10.1049/2024/5802816","DOIUrl":"10.1049/2024/5802816","url":null,"abstract":"<p>With the continuous advancement of autonomous driving technology, visual analysis techniques have emerged as a prominent research topic. The data generated by autonomous driving is large-scale and time-varying, yet more than existing visual analytics methods are required to deal with such complex data effectively. Time-varying diagrams can be used to model and visualize the dynamic relationships in various complex systems and can visually describe the data trends in autonomous driving systems. To this end, this paper introduces a time-varying graph-based method for visual analysis in autonomous driving. The proposed method employs a graph structure to represent the relative positional relationships between the target and obstacle interferences. By incorporating the time dimension, a time-varying graph model is constructed. The method explores the characteristic changes of nodes in the graph at different time instances, establishing feature expressions that differentiate target and obstacle motion patterns. The analysis demonstrates that the feature vector centrality in the time-varying graph effectively captures the distinctions in motion patterns between targets and obstacles. These features can be utilized for accurate target and obstacle recognition, achieving high recognition accuracy. To evaluate the proposed time-varying graph-based visual analytic autopilot method, a comparative study is conducted against traditional visual analytic methods such as the frame differencing method and advanced visual analytic methods like visual lidar odometry and mapping. Robustness, accuracy, and resource consumption experiments are performed using the publicly available KITTI dataset to analyze and compare the three methods. The experimental results show that the proposed time-varying graph-based method exhibits superior accuracy and robustness. This study offers valuable insights and solution ideas for developing deep integration between intelligent networked vehicles and intelligent transportation. It provides a reference for advancing intelligent transportation systems and their integration with autonomous driving technologies.</p>","PeriodicalId":50378,"journal":{"name":"IET Software","volume":"2024 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/2024/5802816","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140225546","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the increasing number of software projects, within-project defect prediction (WPDP) can no longer meet practical demands, and cross-project defect prediction (CPDP) is playing an increasingly significant role in software engineering. Classic CPDP methods mainly concentrate on applying metric features to predict defects. However, these approaches fail to consider rich semantic information, which usually captures the relationship between software defects and their context. Since traditional methods are unable to exploit this characteristic, their performance is often unsatisfactory. In this paper, a transfer long short-term memory (TLSTM) network model is first proposed. Transfer semantic features are extracted by adding a transfer learning algorithm to the long short-term memory (LSTM) network, and traditional metric features and semantic features are then combined for CPDP. First, abstract syntax trees (ASTs) are generated from the source code. Second, the AST node contents are converted into integer vectors as inputs to the TLSTM model, which then extracts the semantic features of the program. In parallel, transferable metric features are extracted by transfer component analysis (TCA). Finally, the semantic features and metric features are combined and fed into a logistic regression (LR) classifier for training. According to results on several open-source projects from the PROMISE repository, the presented TLSTM model performs better on the f-measure indicator than other machine learning and deep learning models. The TLSTM model built with a single type of feature achieves 0.7% and 2.1% improvements on Log4j-1.2 and Xalan-2.7, respectively. When combined features are used to train the prediction model, we call it transfer long short-term memory for defect prediction (DPTLSTM); it achieves 2.9% and 5% improvements on Synapse-1.2 and Xerces-1.4.4, respectively. Both results demonstrate the superiority of the proposed model on the CPDP task, because the LSTM captures long-term dependencies in sequence data and extracts features that encode source code structure and context information. It can be concluded that (1) the TLSTM model has the advantage of preserving information, better retaining the semantic features related to software defects; and (2) compared with a CPDP model trained on traditional metric features only, the model's performance can be effectively enhanced by combining semantic features and metric features.
{"title":"Cross-Project Defect Prediction Using Transfer Learning with Long Short-Term Memory Networks","authors":"Hongwei Tao, Lianyou Fu, Qiaoling Cao, Xiaoxu Niu, Haoran Chen, Songtao Shang, Yang Xian","doi":"10.1049/2024/5550801","DOIUrl":"10.1049/2024/5550801","url":null,"abstract":"<p>With the increasing number of software projects, within-project defect prediction (WPDP) has already been unable to meet the demand, and cross-project defect prediction (CPDP) is playing an increasingly significant role in the area of software engineering. The classic CPDP methods mainly concentrated on applying metric features to predict defects. However, these approaches failed to consider the rich semantic information, which usually contains the relationship between software defects and context. Since traditional methods are unable to exploit this characteristic, their performance is often unsatisfactory. In this paper, a transfer long short-term memory (TLSTM) network model is first proposed. Transfer semantic features are extracted by adding a transfer learning algorithm to the long short-term memory (LSTM) network. Then, the traditional metric features and semantic features are combined for CPDP. First, the abstract syntax trees (AST) are generated based on the source codes. Second, the AST node contents are converted into integer vectors as inputs to the TLSTM model. Then, the semantic features of the program can be extracted by TLSTM. On the other hand, transferable metric features are extracted by transfer component analysis (TCA). Finally, the semantic features and metric features are combined and input into the logical regression (LR) classifier for training. The presented TLSTM model performs better on the <i>f</i>-measure indicator than other machine and deep learning models, according to the outcomes of several open-source projects of the PROMISE repository. The TLSTM model built with a single feature achieves 0.7% and 2.1% improvement on Log4j-1.2 and Xalan-2.7, respectively. When using combined features to train the prediction model, we call this model a transfer long short-term memory for defect prediction (DPTLSTM). DPTLSTM achieves a 2.9% and 5% improvement on Synapse-1.2 and Xerces-1.4.4, respectively. Both prove the superiority of the proposed model on the CPDP task. This is because LSTM capture long-term dependencies in sequence data and extract features that contain source code structure and context information. It can be concluded that: (1) the TLSTM model has the advantage of preserving information, which can better retain the semantic features related to software defects; (2) compared with the CPDP model trained with traditional metric features, the performance of the model can validly enhance by combining semantic features and metric features.</p>","PeriodicalId":50378,"journal":{"name":"IET Software","volume":"2024 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/2024/5550801","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140233101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In the rapidly evolving landscape of social media, the demand for precise sentiment analysis (SA) on multimodal data has become increasingly pivotal. This paper introduces a sophisticated data lake architecture tailored for efficient multimodal emotion feature extraction, addressing the challenges posed by diverse data types. The proposed framework encompasses a robust storage solution and an innovative SA model, multilevel spatial attention fusion (MLSAF), adept at handling text and visual data concurrently. The data lake architecture comprises five layers, facilitating real-time and offline data collection, storage, processing, standardized interface services, and data mining analysis. The MLSAF model, integrated into the data lake architecture, utilizes a novel approach to SA. It employs a text-guided spatial attention mechanism, fusing textual and visual features to discern subtle emotional interplays. The model’s end-to-end learning approach and attention modules contribute to its efficacy in capturing nuanced sentiment expressions. Empirical evaluations on established multimodal sentiment datasets, MVSA-Single and MVSA-Multi, validate the proposed methodology’s effectiveness. Comparative analyses with state-of-the-art models showcase the superior performance of our approach, with an accuracy improvement of 6% on MVSA-Single and 1.6% on MVSA-Multi. This research significantly contributes to optimizing SA in social media data by offering a versatile and potent framework for data management and analysis. The integration of MLSAF with a scalable data lake architecture presents a strategic innovation poised to navigate the evolving complexities of social media data analytics.
{"title":"Design and Efficacy of a Data Lake Architecture for Multimodal Emotion Feature Extraction in Social Media","authors":"Yuanyuan Fan, Xifeng Mi","doi":"10.1049/2024/6819714","DOIUrl":"10.1049/2024/6819714","url":null,"abstract":"<p>In the rapidly evolving landscape of social media, the demand for precise sentiment analysis (SA) on multimodal data has become increasingly pivotal. This paper introduces a sophisticated data lake architecture tailored for efficient multimodal emotion feature extraction, addressing the challenges posed by diverse data types. The proposed framework encompasses a robust storage solution and an innovative SA model, multilevel spatial attention fusion (MLSAF), adept at handling text and visual data concurrently. The data lake architecture comprises five layers, facilitating real-time and offline data collection, storage, processing, standardized interface services, and data mining analysis. The MLSAF model, integrated into the data lake architecture, utilizes a novel approach to SA. It employs a text-guided spatial attention mechanism, fusing textual and visual features to discern subtle emotional interplays. The model’s end-to-end learning approach and attention modules contribute to its efficacy in capturing nuanced sentiment expressions. Empirical evaluations on established multimodal sentiment datasets, MVSA-Single and MVSA-Multi, validate the proposed methodology’s effectiveness. Comparative analyses with state-of-the-art models showcase the superior performance of our approach, with an accuracy improvement of 6% on MVSA-Single and 1.6% on MVSA-Multi. This research significantly contributes to optimizing SA in social media data by offering a versatile and potent framework for data management and analysis. The integration of MLSAF with a scalable data lake architecture presents a strategic innovation poised to navigate the evolving complexities of social media data analytics.</p>","PeriodicalId":50378,"journal":{"name":"IET Software","volume":"2024 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/2024/6819714","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141096394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Developer question-and-answer communities rely on experts to provide helpful answers, yet these communities face a shortage of experts. To cultivate more experts, a community needs to quantify and analyze how extrinsic motivations influence the ongoing contributions of developers who may become experts in the future (potential experts). Currently, there is a lack of potential-expert-centred research on community incentives. To address this gap, we propose a motivational impact model with hypotheses based on self-determination theory to explore the impact of five extrinsic motivations (badge, status, learning, reputation, and reciprocity) on potential experts. We develop a status-based timeline partitioning method to extract information on the sustained contributions of potential experts from Stack Overflow data and propose a multifactor assessment model to examine the motivational impact model and determine the relationship between potential experts' extrinsic motivations and their sustained contributions. Our results show that (i) badges and reciprocity promote the continuous contributions of potential experts, while reputation and status reduce their contributions; (ii) status significantly affects the impact of reciprocity on potential experts' contributions; and (iii) the influence of extrinsic motivations on potential experts differs from that on active developers in the effects of reputation, learning, and status and in their moderating roles. Based on these findings, we recommend that community managers identify potential experts early and optimize reputation and status incentives to incubate more experts.
{"title":"Unveiling the Dynamics of Extrinsic Motivations in Shaping Future Experts’ Contributions to Developer Q&A Communities","authors":"Yi Yang, Xinjun Mao, Menghan Wu","doi":"10.1049/2024/8354862","DOIUrl":"10.1049/2024/8354862","url":null,"abstract":"<p>Developer question and answering communities rely on experts to provide helpful answers. However, these communities face a shortage of experts. To cultivate more experts, the community needs to quantify and analyze the rules of the influence of extrinsic motivations on the ongoing contributions of those developers who can become experts in the future (potential experts). Currently, there is a lack of potential expert-centred research on community incentives. To address this gap, we propose a motivational impact model with self-determination theory-based hypotheses to explore the impact of five extrinsic motivations (badge, status, learning, reputation, and reciprocity) for potential experts. We develop a status-based timeline partitioning method to count information on the sustained contributions of potential experts from Stack Overflow data and propose a multifactor assessment model to examine the motivational impact model to determine the relationship between potential experts’ extrinsic motivations and sustained contributions. Our results show that (i) badge and reciprocity promote the continuous contributions of potential experts while reputation and status reduce their contributions; (ii) status significantly affects the impact of reciprocity on potential experts’ contributions; (iii) the difference in the influence of extrinsic motivations on potential experts and active developers lies in the influence of reputation, learning, and status and its moderating effect. Based on these findings, we recommend that community managers identify potential experts early and optimize reputation and status incentives to incubate more experts.</p>","PeriodicalId":50378,"journal":{"name":"IET Software","volume":"2024 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/2024/8354862","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139794398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}