
Information and Software Technology: Latest Articles

Wise recommender: LLMs refined by iterative critics
IF 4.3 | Zone 2 (Computer Science) | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Vol. 192, Article 108021 | Pub Date: 2026-04-01 | Epub Date: 2026-01-09 | DOI: 10.1016/j.infsof.2026.108021
Zhisheng Yang , Xiaofei Xu , Ke Deng , Li Li

Context:

Large Language Models (LLMs) have been applied to recommendation tasks, giving rise to the new paradigm of LLM-as-Recommendation Systems (LLM-as-RS). Existing methods fall into two categories: tuning and non-tuning. While tuning strategies offer better task alignment, they are expensive and require specialized training. Non-tuning strategies are easier to deploy but often lack task-specific knowledge, limiting their effectiveness.

Objective:

This study aims to enhance the recommendation quality of non-tuning LLM-based systems by addressing their lack of task awareness.

Method:

We propose a novel approach, Critique-based LLMs as Recommendation Systems (Critic-LLM-RS), which introduces an independent machine learning model—the Recommendation Critic—to provide feedback on LLM-generated recommendations and guide the LLM toward improved recommendation strategies.
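
The feedback loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the prompt wording, the stopping threshold, and the toy LLM and critic below are all assumptions made for illustration. The abstract only states that an independent critic model gives feedback on LLM-generated recommendations.

```python
from typing import Callable, List


def refine_with_critic(
    generate: Callable[[str], List[str]],
    critic_score: Callable[[List[str]], float],
    base_prompt: str,
    max_rounds: int = 3,
    threshold: float = 0.8,
) -> List[str]:
    """Ask the LLM for recommendations, score them with a separate critic,
    and feed the score back into the prompt until it is good enough."""
    prompt = base_prompt
    best_items: List[str] = []
    best_score = float("-inf")
    for _ in range(max_rounds):
        items = generate(prompt)
        score = critic_score(items)
        if score > best_score:
            best_items, best_score = items, score
        if score >= threshold:
            break
        # Hypothetical feedback format: the real system may phrase this differently.
        prompt = (
            f"{base_prompt}\nPrevious recommendations: {items}\n"
            f"Critic score: {score:.2f}. Please revise the list."
        )
    return best_items


# Toy stand-ins (hypothetical): a fake "LLM" that improves once it sees feedback,
# and a critic that rewards overlap with items the user is known to like.
def toy_llm(prompt: str) -> List[str]:
    return ["item_a", "item_b"] if "Critic score" in prompt else ["item_x", "item_b"]


def toy_critic(items: List[str]) -> float:
    liked = {"item_a", "item_b"}
    return len(liked & set(items)) / len(items)


result = refine_with_critic(toy_llm, toy_critic, "Recommend two items.")
print(result)
```

In this sketch the LLM itself is never retrained. Only the prompt changes between rounds, which is what keeps the approach in the non-tuning category.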

Results:

Experiments on multiple real-world datasets demonstrate that Critic-LLM-RS significantly outperforms existing non-tuning approaches, regardless of whether open-source or proprietary LLMs are used.

Conclusion:

Critic-LLM-RS enhances the task adaptability of non-tuning LLMs through a collaborative feedback mechanism, offering a new solution for building efficient and easily deployable recommendation systems.
Citations: 0
VulATMHD: Joint adaptive triplet mining and hybrid distillation for type-aware vulnerability classification
IF 4.3 | Zone 2 (Computer Science) | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Vol. 192, Article 108037 | Pub Date: 2026-04-01 | Epub Date: 2026-01-13 | DOI: 10.1016/j.infsof.2026.108037
Xuanye Wang , Lu Lu

Context:

Vulnerability detection leveraging pre-trained models has achieved notable success, but its coarse-grained outputs fail to provide security engineers with vulnerability type information. Recent type-aware Software Vulnerability Classification (SVC) methods mitigate this gap, but often neglect inter-type semantic relationships and exhibit limited knowledge transfer, resulting in suboptimal learned representations.

Objective:

To address these limitations, this study proposes VulATMHD, a novel type-aware SVC framework that integrates adaptive triplet mining with hybrid distillation.

Methods:

VulATMHD first groups vulnerability types based on Common Weakness Enumeration (CWE) abstract types. It then constructs a multi-teacher architecture, with each teacher assigned to a specific group. Adaptive triplet mining is introduced to guide feature learning, yielding feature representations that are intra-class compact and inter-class separable. Since each teacher is optimized for intra-group performance, VulATMHD further introduces a hybrid distillation strategy to transfer both feature representations and label distributions from the teacher ensemble to a pre-trained student.
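
To make the idea of transferring both feature representations and label distributions concrete, here is a minimal sketch of a generic two-term distillation loss. It is an assumption-laden illustration, not the paper's loss: the mean-squared-error feature term, the temperature-softened KL label term, and the `alpha` and `temperature` parameters are common choices picked here for clarity. The abstract does not specify them.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax with an optional temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: List[float], q: List[float]) -> float:
    """KL(p || q) for two discrete distributions with q > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mse(a: List[float], b: List[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def hybrid_distillation_loss(
    teacher_feat: List[float],
    student_feat: List[float],
    teacher_logits: List[float],
    student_logits: List[float],
    alpha: float = 0.5,        # assumed weighting between the two terms
    temperature: float = 2.0,  # assumed softening temperature
) -> float:
    """Weighted sum of a feature-matching term and a label-distribution term."""
    feature_term = mse(teacher_feat, student_feat)
    label_term = kl_divergence(
        softmax(teacher_logits, temperature),
        softmax(student_logits, temperature),
    )
    return alpha * feature_term + (1 - alpha) * label_term


same = hybrid_distillation_loss([1.0, 2.0], [1.0, 2.0], [2.0, 0.0], [2.0, 0.0])
different = hybrid_distillation_loss([1.0, 2.0], [0.0, 0.0], [2.0, 0.0], [0.0, 2.0])
print(same, different)
```

A student that already matches the teacher incurs zero loss, and any mismatch in either features or predicted label distribution raises it.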

Results:

Empirical evaluations on the BigVul dataset show that, compared to baseline methods, VulATMHD improves Accuracy and weighted F1-score by 4.7%–29.9% and 5.7%–34.1%, respectively. Moreover, VulATMHD is compatible with various pre-trained models, such as CodeBERT, CodeT5+, and GraphCodeBERT.

Conclusion:

The proposed VulATMHD outperforms state-of-the-art SVC methods and exhibits superior robustness and scalability in downstream tasks, highlighting its potential for practical applications.
Citations: 0
xPriMES: Explainable reinforcement learning-guided mutation strategy with dual-environment interaction for evading black-box malware detectors
IF 4.3 | Zone 2 (Computer Science) | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Vol. 192, Article 108019 | Pub Date: 2026-04-01 | Epub Date: 2026-01-05 | DOI: 10.1016/j.infsof.2026.108019
Phan The Duy, Nguyen Manh Cuong, Ha Trieu Yen Vy, Le Tuan Luong, Nguyen Tran Duc Anh, Nghi Hoang Khoa, Van-Hau Pham
Malware continues to evolve, exposing weaknesses in conventional detectors and motivating realistic adversarial evaluations. Prior RL-based evasion methods often rely on partial model access or feature-level perturbations, limiting realism under strict black-box constraints. We propose xPriMES, a dual-environment reinforcement learning framework that generates functionality-preserving binary mutations for malware evasion in black-box settings. A LightGBM surrogate provides continuous confidence feedback for dense reward shaping, while the real target detector supplies binary feedback — used both for episode termination and for issuing the final reward — ensuring learning remains grounded in real evasion outcomes. The agent employs Thompson sampling and SHAP-guided prioritized replay to focus exploration on feature-relevant mutations and accelerate convergence. Experiments on multiple static detectors (LightGBM, RF+CNN, MalConv, CNN, KNN) demonstrate up to 97.4% evasion success, surpassing PSP-Mal under equivalent conditions. Further tests on VirusTotal confirm the transferability and real-world impact of the adversarial samples. These findings show that integrating explainable guidance with surrogate-assisted RL yields interpretable and effective black-box evasion while preserving functionality. We conclude with implications for defensive hardening and discuss limitations related to surrogate fidelity and the focus on static detection.
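Thompson sampling, which the abstract names as the agent's exploration mechanism, can be shown in its textbook form on an abstract multi-armed bandit. The sketch below is a generic Beta-Bernoulli version and makes no claim about how the paper integrates it with the agent. The arms, their success probabilities, and the seed are made-up values for illustration only.

```python
import random
from typing import List


def thompson_sampling(success_probs: List[float], rounds: int, seed: int = 0) -> List[int]:
    """Beta-Bernoulli Thompson sampling over abstract arms.

    Returns how many times each arm was chosen.
    """
    rng = random.Random(seed)
    wins = [1] * len(success_probs)    # Beta prior: alpha = 1
    losses = [1] * len(success_probs)  # Beta prior: beta = 1
    pulls = [0] * len(success_probs)
    for _ in range(rounds):
        # Draw one plausible success rate per arm from its posterior.
        samples = [rng.betavariate(w, l) for w, l in zip(wins, losses)]
        arm = samples.index(max(samples))
        pulls[arm] += 1
        # Simulated binary outcome for the chosen arm.
        if rng.random() < success_probs[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return pulls


pulls = thompson_sampling([0.1, 0.5, 0.9], rounds=2000)
print(pulls)
```

Because each arm is chosen in proportion to the probability that it is the best one, exploration fades naturally as evidence accumulates and most pulls concentrate on the arm with the highest success rate.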
Citations: 0
EdgeSim: Firmware vulnerability detection with control transfer-enhanced binary code similarity detection
IF 4.3 | Zone 2 (Computer Science) | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Vol. 192, Article 108020 | Pub Date: 2026-04-01 | Epub Date: 2026-01-17 | DOI: 10.1016/j.infsof.2026.108020
Li Liu, Shen Wang, Xunzhi Jiang

Context:

The widespread adoption of Internet of Things (IoT) devices has amplified the impact of vulnerabilities in embedded firmware. Binary code similarity detection (BCSD), a static analysis technique that compares functions without source code, plays an important role in firmware vulnerability detection. However, existing control flow graph (CFG)-based methods directly aggregate basic block features to learn structural information, neglecting the rich semantics in control transfers between basic blocks (i.e., CFG edges). This limitation leads to degraded performance under diverse compilation settings.

Objective:

To address the limitations of existing CFG-based similarity detection methods, this paper proposes a novel binary similarity detection method, EdgeSim, which extracts and utilizes control transfer information between basic blocks for the first time in CFG-based BCSD.

Method:

EdgeSim employs a language model to extract semantic features of both basic blocks and the control transfer relationships between them. Basic block semantics are used as node features, while control transfer semantics are incorporated as edge features in CFGs. Furthermore, we design a novel edge feature-enhanced graph neural network (EGNN) to aggregate features of nodes and edges in CFG, leveraging control transfer information between basic blocks to learn more comprehensive graph embeddings of functions.
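
A single round of message passing that uses edge features alongside node features might look like the sketch below. It is a simplified stand-in for the idea, not the EGNN architecture from the paper: the additive message (source features plus edge features), the mean aggregation, the residual update, and the tiny example graph are all assumptions.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise vector addition."""
    return [x + y for x, y in zip(a, b)]


def edge_aware_layer(
    node_feats: Dict[int, Vector],
    edge_feats: Dict[Tuple[int, int], Vector],
) -> Dict[int, Vector]:
    """One message-passing round over a directed graph.

    Each message is the source block's features plus the features of the
    control-transfer edge. Messages are mean-aggregated at the destination
    and added to the destination's own features.
    """
    dim = len(next(iter(node_feats.values())))
    sums = {n: [0.0] * dim for n in node_feats}
    counts = {n: 0 for n in node_feats}
    for (src, dst), edge in edge_feats.items():
        sums[dst] = add(sums[dst], add(node_feats[src], edge))
        counts[dst] += 1
    updated = {}
    for n, h in node_feats.items():
        mean = [s / counts[n] for s in sums[n]] if counts[n] else [0.0] * dim
        updated[n] = add(h, mean)
    return updated


# Hypothetical 3-block CFG: block 0 branches to blocks 1 and 2, block 1 falls into 2.
nodes = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
edges = {(0, 1): [0.5, 0.0], (0, 2): [0.0, 0.5], (1, 2): [0.5, 0.5]}
updated = edge_aware_layer(nodes, edges)
print(updated)
```

Two functions with identical basic blocks but different branch conditions would produce the same node features yet different edge features, so only an edge-aware update like this can tell them apart.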

Results:

Experimental evaluations on datasets covering diverse architectures, optimization levels, and compilers demonstrate that EdgeSim improves Recall@1 by over 25% compared to baseline approaches in one-to-many function search tasks under cross-compilation conditions. Additionally, in real-world firmware vulnerability search experiments, EdgeSim outperforms baselines in identifying all vulnerable functions while maintaining the highest mean reciprocal rank (MRR) and the lowest false positive rate (FPR).
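
For readers unfamiliar with the two ranking metrics reported here, the following sketch computes Recall@k and MRR using their standard definitions. The query results are invented toy data, not figures from the paper.

```python
from typing import List


def recall_at_k(ranked_lists: List[List[str]], targets: List[str], k: int = 1) -> float:
    """Fraction of queries whose true match appears in the top-k results."""
    hits = sum(1 for ranked, t in zip(ranked_lists, targets) if t in ranked[:k])
    return hits / len(targets)


def mean_reciprocal_rank(ranked_lists: List[List[str]], targets: List[str]) -> float:
    """Average of 1/rank of the true match; a missing match contributes 0."""
    total = 0.0
    for ranked, t in zip(ranked_lists, targets):
        if t in ranked:
            total += 1.0 / (ranked.index(t) + 1)
    return total / len(targets)


# Toy search results: the true match sits at ranks 1, 2, 4, and 1.
results = [
    ["f1", "a", "b", "c"],
    ["a", "f2", "b", "c"],
    ["a", "b", "c", "f3"],
    ["f4", "a", "b", "c"],
]
truth = ["f1", "f2", "f3", "f4"]

r1 = recall_at_k(results, truth, k=1)
mrr = mean_reciprocal_rank(results, truth)
print(r1, mrr)
```

Recall@1 only rewards an exact top hit, while MRR also gives partial credit for near misses, which is why the two are usually reported together.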

Conclusion:

The experimental results indicate that integrating control transfer semantics substantially enhances CFG-based function representations. EdgeSim consistently delivers superior performance in binary similarity detection and firmware vulnerability discovery across diverse compilation environments.
Citations: 0
Gated transformer network for multivariate security patch identification with mixture-of-experts
IF 4.3 | Zone 2 (Computer Science) | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Vol. 192, Article 108006 | Pub Date: 2026-04-01 | Epub Date: 2025-12-31 | DOI: 10.1016/j.infsof.2025.108006
Jiajun Tong , Zhixiao Wang , Xiaobin Rui

Context:

Security patch identification is an important task in continuous integration and deployment, helping software developers detect security issues and code vulnerabilities. Recent studies have confirmed that using both commit message and code diff information is beneficial to identification performance. However, existing works still suffer from poor model representation ability and low model robustness, both of which degrade the quality of commit representations and result in poor identification performance.

Objective:

We propose a gated transformer network for multivariate security patch identification with mixture-of-experts.

Method:

To improve the representation capability of the model and the quality of the commit representations, we provide a bi-encoder that uses prior knowledge to enhance the distinctive features of the commit message and the code diff, respectively. To improve the robustness of the model and further improve the quality of commit representations, we design a gated layer that learns the weight of each expert and dynamically assigns weights to different features.
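
The gating idea can be sketched as a softmax over per-expert scores that decides how much each expert's embedding contributes to the fused commit representation. This is an illustrative simplification, not the released code: the linear gate over the concatenated embeddings, the fixed (untrained) gate weights, and the two-dimensional embeddings are assumptions.

```python
import math
from typing import List, Tuple


def softmax(xs: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(
    msg_emb: List[float],
    diff_emb: List[float],
    gate_weights: List[List[float]],
) -> Tuple[List[float], List[float]]:
    """Fuse two expert embeddings with input-dependent gate weights.

    gate_weights holds one weight vector per expert. Each is applied to the
    concatenated embeddings to produce that expert's gate logit.
    """
    concat = msg_emb + diff_emb
    logits = [sum(w * x for w, x in zip(wv, concat)) for wv in gate_weights]
    gates = softmax(logits)
    fused = [gates[0] * m + gates[1] * d for m, d in zip(msg_emb, diff_emb)]
    return fused, gates


# With all-zero gate weights both experts get equal say (gates of 0.5 each).
fused, gates = gated_fusion([1.0, 3.0], [3.0, 1.0], [[0.0] * 4, [0.0] * 4])
print(fused, gates)
```

In a trained model the gate weights would be learned, so a commit with an uninformative message could lean on the code-diff expert and vice versa.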

Results:

Extensive experiments show that our framework effectively improves both the representation ability and the robustness of the model, provides high-quality commit representations, and achieves state-of-the-art performance.

Conclusion:

Our approach uses a bi-encoder in which two experts produce the embedding of each feature, and then explores the difference between them by setting different weights through the gated layer. It improves both the representation ability and the robustness of the model, and is therefore well suited to real-world scenarios. The code and data are available at https://github.com/AppleMax1992/ensemble_commit.
Citations: 0
VulSEG: Enhanced graph-based vulnerability detection system with advanced text embedding
IF 4.3 | Zone 2 (Computer Science) | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Vol. 192, Article 108007 | Pub Date: 2026-04-01 | Epub Date: 2025-12-31 | DOI: 10.1016/j.infsof.2025.108007
Wenjing Cai , Xin Liu , Lipeng Gao
In the field of software security, the detection of vulnerabilities in source code has become increasingly important. Traditional methods based on feature engineering and statistical models are inefficient when dealing with complex code structures and large-scale data, while deep learning approaches have shown significant potential. Many detection methods involve converting source code into images for analysis. Although scalable, convolutional neural networks often fail to fully comprehend the complex structure and semantic relationships in the code, resulting in inadequate capture of high-level semantic features, which affects the accuracy of detection. This study introduces an innovative vulnerability detection framework, VulSEG, which significantly improves detection accuracy while maintaining high scalability. We combine the Program Dependence Graph (PDG), Control Flow Graph (CFG), and Context Dependency Graph (CDG) to create a context-enhanced graph representation. Additionally, we develop a composite feature encoding strategy that integrates Abstract Syntax Tree (AST) encoding with deep semantic security coding (Word2Vec + Complexity- and Security-Weighted TF-IDF, CSW-TF-IDF) to enhance the understanding of code complexity and the accuracy of predicting potential vulnerabilities. By incorporating the Text Convolutional Neural Network (TextCNN) and Bidirectional Long Short-Term Memory (BiLSTM) models, we further enhance feature extraction and long-sequence dependency handling capabilities. The experimental results show that, compared to state-of-the-art methods, our approach improves accuracy by 11.8%.
Citations: 0
Artifact validity under varying agent configurations in LLM-assisted software development: A comparative analysis
IF 4.3 | Zone 2 (Computer Science) | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-04-01 | Epub Date: 2026-01-06 | DOI: 10.1016/j.infsof.2026.108022
Dae-Kyoo Kim

Context:

The integration of large language models (LLMs) into software engineering has advanced toward agent-based automation across the development lifecycle. However, the comparative effectiveness of different multi-agent orchestration strategies remains underexplored.

Objective:

This study examines how three agent configuration strategies – Task-Specialized (TS), Phase-Specialized (PS), and Process-Generalist (PG) – impact the validity of software artifacts generated across key development tasks.

Methods:

Using a unified LLM backend within a structured orchestration framework, we evaluate the three configurations across nine core software engineering tasks – covering requirements analysis, design modeling, implementation, and testing – within three application domains: Tour Reservation System (TORS), Smart Wallet System (SWS), and Food Order and Delivery System (FODS). Artifact validity is measured using structural and semantic criteria.

Result:

No configuration consistently outperforms the others across all tasks. The overall average validity score is 0.56, with zero standard deviation, indicating uniformly constrained performance. Validity is highest in early requirements tasks (0.63–0.85), moderate in implementation and testing (0.61), and lowest in modeling tasks (0.25–0.42). TS agents perform well in modeling tasks due to focused specialization; PS agents benefit from contextual continuity in tasks like operation identification and test design, though performance varies; PG agents offer stable but less tailored performance across the pipeline. All configurations perform best in the TORS domain, which features simple and modular requirements.

Conclusions:

Artifact quality appears to be influenced more by the LLM’s capabilities than by the orchestration strategy alone. However, task- and domain-specific variations suggest that adaptive or hybrid orchestration strategies – tailored to both task type and domain context – can enhance the effectiveness of agent-assisted software development. These findings support the need for more targeted specialization strategies and possibly domain-adapted LLMs.
Citations: 0
FlowRepair: Search-based automated program repair of CPS controllers modeled in Simulink-Stateflow
IF 4.3 Zone 2 (Computer Science) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-04-01 Epub Date: 2026-01-06 DOI: 10.1016/j.infsof.2025.108010
Aitor Arrieta , Pablo Valle , Shaukat Ali

Context:

Stateflow models are widely used in industry to model the high-level control logic of Cyber–Physical Systems (CPSs) in Simulink. Many approaches exist to test Simulink models, but once a fault is detected, repairing it remains a manual process, which increases software development cost. Automated Program Repair (APR) techniques can significantly reduce this cost by automatically generating patches that fix bugs. However, current approaches face scalability issues that limit their applicability in the CPS context.

Objectives:

The goal of this paper is to propose an APR method that scales to Stateflow models.

Method:

We propose an automated search-based approach called FlowRepair, explicitly designed to repair Stateflow models. The novelty of FlowRepair includes: (1) a new algorithm that combines global and local search for patch generation; (2) a definition of novel repair objectives specifically tailored for repairing CPSs; (3) a set of mutation operators to repair Stateflow models automatically; and (4) an evaluation on a new dataset encompassing 19 faulty Stateflow models with real bugs.
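The abstract does not say how FlowRepair combines global and local search, so the following is only a generic sketch of the idea, not the authors' algorithm. It assumes a candidate patch can be encoded as a list of integer parameters (for example, guard thresholds) and that a cost function, here a hypothetical stand-in for running the test suite, returns 0 when the patch is correct. All names (`make_cost`, `mutate`, `global_search`, `local_search`, `repair`) are illustrative.

```python
import random
from typing import Callable, List

Patch = List[int]
Cost = Callable[[Patch], int]


def make_cost(expected: Patch) -> Cost:
    """Toy objective standing in for test execution: 0 means all tests pass."""
    return lambda candidate: sum(abs(c - e) for c, e in zip(candidate, expected))


def mutate(candidate: Patch, rng: random.Random, step: int) -> Patch:
    """Mutation operator: shift one randomly chosen parameter by up to +/- step."""
    child = list(candidate)
    i = rng.randrange(len(child))
    child[i] += rng.randint(-step, step)
    return child


def global_search(faulty: Patch, cost: Cost, rng: random.Random,
                  generations: int = 30, pop_size: int = 20) -> Patch:
    """Coarse exploration: large mutations, keep the lowest-cost candidates."""
    population = [faulty] + [mutate(faulty, rng, 10) for _ in range(pop_size - 1)]
    for _ in range(generations):
        offspring = [mutate(rng.choice(population), rng, 10) for _ in range(pop_size)]
        population = sorted(population + offspring, key=cost)[:pop_size]
    return population[0]


def local_search(candidate: Patch, cost: Cost) -> Patch:
    """Fine-tuning: hill climbing with unit steps until no neighbour improves."""
    best = list(candidate)
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            for delta in (-1, 1):
                neighbour = list(best)
                neighbour[i] += delta
                if cost(neighbour) < cost(best):
                    best, improved = neighbour, True
    return best


def repair(faulty: Patch, cost: Cost, seed: int = 0) -> Patch:
    """Run the global phase first, then refine its best candidate locally."""
    rng = random.Random(seed)
    return local_search(global_search(faulty, cost, rng), cost)
```

Running the two phases in sequence is one of several ways to combine them; interleaving local refinement inside each generation is another. In a real setting the cost would come from simulating the model against its tests, which is far more expensive than this toy function.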

Results:

Our results suggest that (1) FlowRepair can fix bugs in Stateflow models; and (2) FlowRepair surpasses or performs similarly to a baseline APR technique inspired by a well-known CPS program repair approach.

Conclusion:

This paper presents the first APR tool for CPSs whose high-level control program is developed in Simulink-Stateflow. The results show that the approach is effective and scalable to such complex systems.
Citations: 0
CSVD-AES: Cross-project software vulnerability detection based on active learning with metric fusion
IF 4.3 Zone 2 (Computer Science) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-04-01 Epub Date: 2026-01-05 DOI: 10.1016/j.infsof.2026.108015
Zhidan Yuan , Xiang Chen , Juan Zhang , Weiming Zeng

Context:

Previous studies on Cross-Project Software Vulnerability Detection (CSVD) have shown that leveraging a small number of labeled modules from the target project can enhance the performance of CSVD. However, how to systematically select representative modules for labeling has not received sufficient attention. In addition, program modules can be measured using either expert or semantic metrics. There has been insufficient attention given to whether considering both metrics simultaneously helps in selecting representative modules.

Objective:

To address these challenges, we introduce a novel approach, CSVD-AES. This method fuses expert and semantic metrics and employs active learning to select the most representative modules for labeling.

Methods:

CSVD-AES consists of three phases: the code representation phase, the active learning phase, and the model construction phase. In the code representation phase, a self-attention mechanism is used to fuse the metrics. In the active learning phase, an uncertainty sampling strategy is employed to select the most representative modules for labeling. In the model construction phase, the weighted cross-entropy (WCE) loss function is applied to address the class imbalance issue in the labeled modules. The metric fusion helps active learning identify representative modules. Since selecting modules can exacerbate the class imbalance issue in the labeled modules, we employ a sampling balancing strategy during the active learning phase to address this problem.
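Two of the building blocks named above, uncertainty sampling and a weighted cross-entropy loss, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the CSVD-AES implementation: it assumes a binary classifier that outputs a probability of "vulnerable" per module, uses binary entropy as the uncertainty measure (the abstract does not name one), and takes the class weights as plain parameters. The function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def uncertainty(p_vulnerable: float) -> float:
    """Binary entropy of a predicted probability; largest at p = 0.5."""
    if p_vulnerable <= 0.0 or p_vulnerable >= 1.0:
        return 0.0
    q = 1.0 - p_vulnerable
    return -(p_vulnerable * math.log(p_vulnerable) + q * math.log(q))


def select_for_labeling(probs: Dict[str, float], budget: int) -> List[str]:
    """Return the `budget` module names with the most uncertain predictions."""
    ranked = sorted(probs, key=lambda name: uncertainty(probs[name]), reverse=True)
    return ranked[:budget]


def weighted_cross_entropy(labels: Sequence[int], probs: Sequence[float],
                           pos_weight: float, neg_weight: float = 1.0,
                           eps: float = 1e-12) -> float:
    """Mean cross-entropy where each class contributes with its own weight.

    A pos_weight above 1 penalises mistakes on the rare (vulnerable) class more.
    """
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        if y == 1:
            total += -pos_weight * math.log(p)
        else:
            total += -neg_weight * math.log(1.0 - p)
    return total / len(labels)


# Usage: pick two unlabeled modules to send to an annotator.
predictions = {"a.c": 0.95, "b.c": 0.52, "c.c": 0.10, "d.c": 0.40}
to_label = select_for_labeling(predictions, budget=2)
```

Modules whose predicted probability is close to 0.5 are selected first, since the model is least sure about them. A common choice for `pos_weight` is the ratio of negative to positive samples in the labeled set, though the abstract does not state which weighting the authors use.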

Results:

CSVD-AES is evaluated through a comprehensive study on four real-world projects. The results demonstrate that CSVD-AES outperforms five state-of-the-art baselines, achieving AUC improvements ranging from 4.0% to 24.4%. A series of ablation experiments verify the rationality of the CSVD-AES component settings.

Conclusion:

CSVD-AES effectively addresses the challenges in the field of CSVD by combining active learning and metric fusion, significantly advancing the development of this field.
Citations: 0
AI systems’ negative social impact and factors
IF 4.3 Zone 2 (Computer Science) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-04-01 Epub Date: 2026-01-17 DOI: 10.1016/j.infsof.2026.108038
Nafen Haj Ahmad , Linnea Stigholt , Leticia Duboc , Birgit Penzenstadler

Context:

AI technologies are rapidly being integrated into society, offering numerous benefits but also raising significant ethical and social concerns. While some AI systems aim to improve efficiency and decision-making, they can also cause harmful impacts on individuals and society.

Objective:

This study examines both the immediate and systemic negative effects of AI systems, as well as the underlying factors that might contribute to these issues.

Method:

Using a multi-vocal literature review, we analyze 28 AI systems and their associated impacts, including discrimination, psychological and physical harm, and unfair treatment.

Results:

We identify key factors that might have led AI systems to operate in that manner and explain why these impacts may occur. Additionally, we propose initial concrete actions to mitigate these negative effects and promote the development of AI systems that align with ethical and social sustainability principles.

Impact:

By shedding light on these issues, we aim to raise awareness among researchers and developers and to encourage the adoption of more responsible, inclusive, and concrete AI guidelines.
Citations: 0