
Applied Soft Computing: Latest Articles

Cost investigations and ANFIS implementation for M/M/1/L queueing model with general vacation, customer discouragement, and F-policy
IF 6.6 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-17 DOI: 10.1016/j.asoc.2026.114667
Khushbu S. Antala, Sudeep Singh Sanga
This work investigates the M/M/1/L queueing model in real-life scenarios, highlighting its relevance to domains such as call centers, distributed computing systems, healthcare facilities, traffic management centers, and other service-oriented operations. It focuses on the finite-capacity queueing model with general vacations and customer discouragement under the F-policy. Additionally, we consider essential queueing features that account for customers’ behaviors, such as balking and reneging. When customers encounter long queues, they may become discouraged and consequently choose not to join the queue (balking) or leave without receiving service (reneging). The F-policy is an effective strategy for controlling the admission of customers into the system to reduce congestion resulting from excessive customer arrivals. Further, the server takes a vacation when no customer is present in the system. The vacation time is not assumed to be exponentially distributed; instead, we consider a general vacation-time distribution for a more comprehensive analysis. The M/M/1/L queueing model is developed mathematically using the Chapman-Kolmogorov steady-state equations, introducing supplementary variables corresponding to the remaining vacation times. Subsequently, the Laplace-Stieltjes transform and a recursive method are employed to solve these equations and establish the probability distributions. Further, various performance metrics, such as the number of customers in the system, throughput, customer loss, and long-run probabilities, are derived. These performance measures will assist system organizers in making informed decisions. A numerical example is also presented, illustrating the impact of input parameters and customers’ discouraged behavior on various performance metrics. The adaptive neuro-fuzzy inference system, which is built on an artificial neural network and a support vector regression, a machine learning technique, is employed to validate the numerical results. Moreover, a nonlinear cost function is formulated with the service and vacation rates as decision variables. The cost is minimized using quasi-Newton methods and several metaheuristics (particle swarm optimization, bat algorithm, and differential evolution variants), and the optimal cost values obtained by these algorithms are compared. The queueing model’s practical application is demonstrated through its implementation in fog computing systems.
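To make the finite-capacity and discouragement ideas concrete, here is a minimal sketch of a much simpler relative of the model in the abstract: a birth-death M/M/1/L queue with state-dependent balking and reneging, solved by the standard product-form recursion. It omits the general vacations, the F-policy, and the supplementary-variable analysis entirely. The function names, the joining-probability rule `1 / (n + 1)`, and all parameter values are illustrative assumptions, not taken from the paper.

```python
from typing import Callable, List


def steady_state(lam: float, mu: float, capacity: int,
                 join_prob: Callable[[int], float],
                 renege_rate: float = 0.0) -> List[float]:
    """Steady-state probabilities for states 0..capacity of a birth-death queue.

    Assumed simplification: no vacations and no F-policy, only balking and reneging.
    """
    weights = [1.0]
    for n in range(capacity):
        birth = lam * join_prob(n)      # effective arrival rate in state n (balking)
        death = mu + n * renege_rate    # leaving state n + 1: service plus reneging of the n waiting customers
        weights.append(weights[-1] * birth / death)
    total = sum(weights)
    return [w / total for w in weights]


def mean_in_system(pi: List[float]) -> float:
    """Expected number of customers in the system."""
    return sum(n * p for n, p in enumerate(pi))


def throughput(pi: List[float], mu: float) -> float:
    """Service completion rate: the server works whenever the system is non-empty."""
    return mu * (1.0 - pi[0])


# Hypothetical parameters: a customer who finds n others present joins with probability 1 / (n + 1).
pi = steady_state(lam=1.0, mu=2.0, capacity=3, join_prob=lambda n: 1.0 / (n + 1))
print(pi, mean_in_system(pi), throughput(pi, mu=2.0))
```

Performance measures such as these are the kind of quantities that would enter a cost function with the service rate as a decision variable; the paper's actual cost structure is not given in the abstract.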
Applied Soft Computing, Volume 191, Article 114667.
Citations: 0
LightPro: Lightweight multi-channel model protection for edge-oriented deep neural network
IF 6.6 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-17 DOI: 10.1016/j.asoc.2026.114605
Jinyin Chen, Tianxin Zhang, Xiaoming Zhao, Zekai Wang, Haibin Zheng
It is well known that deep neural networks (DNNs) rely on large-scale proprietary data and computing-intensive resources for training, so high-quality DNNs have become one of the most valuable assets in the supply chain. This also attracts model stealers seeking profit, as they can steal functionally equivalent models by iteratively collecting outputs for carefully crafted examples. Extensive studies on model protection against stealing attacks have been carried out; nevertheless, they still face challenges in effectiveness, efficiency, and transferability. To this end, we propose LightPro, a lightweight multi-channel authorization cryptographic protection method for DNNs. LightPro significantly differs from previous work in four aspects, i.e., i) lightweight: its encryption process only processes the input and output layers without complex processing of inner information, e.g., model structure and parameters; ii) robust: it provides defense against various reverse attacks by embedding an additional key-verified encryption channel in the input layer and a dynamic noise layer in the output layer; iii) good trade-off: it introduces a noise layer in the output layer to blur the encryption method of the probability score distribution without changing the score ranking, thereby maintaining high performance on primary tasks with negligible accuracy degradation; iv) transferable: its flexibly mappable encrypted channel added at the input layer can adapt to different types of inputs, thereby facilitating application to different tasks, data, and models. Extensive experiments on 8 datasets and 6 tasks, i.e., image classification (AlexNet, VGG16, GoogLeNet), object detection (SSD, Faster R-CNN, YOLOX), text classification (BERT), speech classification (Deepspeech), signal classification (ResNet), and graph classification (GCN), compared with five baselines against two attacks, show that LightPro achieves state-of-the-art (SOTA) performance in terms of effectiveness, efficiency, generality, and extensibility. For example, it reduced the accuracy of the attack model by 73.02% on average compared to the baselines. Its time cost is reduced to ~1/49 of the baselines, and the mean average precision of the attack model is decreased by 4.21~5.88 times on average.
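The idea of blurring output scores without changing their ranking can be illustrated with a small sketch. This is not LightPro's noise layer, whose construction the abstract does not describe; it is an assumed, deliberately extreme variant that replaces the probability vector with a random one and keeps only the original rank order. The function name `rank_preserving_noise` and the `concentration` parameter are hypothetical.

```python
import numpy as np


def rank_preserving_noise(probs, rng, concentration=1.0):
    """Return a random probability vector with the same class ranking as `probs`.

    Illustrative assumption only: the score magnitudes are discarded and just
    the ordering of the classes survives.
    """
    probs = np.asarray(probs, dtype=float)
    # Random point on the probability simplex (non-negative, sums to 1).
    noise = rng.dirichlet(np.full(probs.size, concentration))
    # Class indices from highest to lowest original score.
    order = np.argsort(-probs, kind="stable")
    out = np.empty_like(probs)
    # The largest random value goes to the top-ranked class, and so on down.
    out[order] = np.sort(noise)[::-1]
    return out


rng = np.random.default_rng(0)
clean = np.array([0.7, 0.1, 0.2])
blurred = rank_preserving_noise(clean, rng)
print(blurred, np.argmax(blurred) == np.argmax(clean))
```

Because the top-1 label is unchanged, accuracy on the primary task is unaffected in this sketch, while the returned scores no longer reveal the model's true confidence values.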
Applied Soft Computing, Volume 191, Article 114605.
Citations: 0
Fuzzy concept-cognitive learning based on attention mechanism and three-way partial order structure
IF 6.6 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-17 DOI: 10.1016/j.asoc.2026.114664
Lufang Zhang, Enliang Yan, Tianyong Hao
Concept cognition plays an important role in the implementation of cognitive intelligence, while the attention mechanism is crucial in human cognition. Fuzzy concept-cognitive learning (CCL) based on the attention mechanism represents a promising research direction. Partial order formal structure analysis (POFSA) is an emerging CCL model. However, its reliance on binary formal contexts makes it difficult to handle fuzzy data, which limits its practical usage since many real-world phenomena are inherently fuzzy and cannot be accurately described using binary logic. This article aims to address the gap in fuzzy CCL within the POFSA domain. We propose a fuzzy CCL method based on the attention mechanism and three-way partial order structure. Specifically: 1) Some fuzzy concept granules are defined based on the fuzzy formal decision context; 2) By introducing the attention mechanism to explore the characteristics of fuzzy data and drawing on the idea of a partial order three-way structure, a fuzzy partial order three-way structure based on the attention mechanism is constructed; 3) Classification experiments are conducted to verify that the model can be applied to other fields. Experiments on 11 datasets show that the proposed method can effectively distinguish new concept categories and has excellent classification performance.
Applied Soft Computing, Volume 191, Article 114664.
Citations: 0
Novel mathematical modeling of IT2 Takagi–Sugeno fuzzy PID controllers with triangular FoUs and their application: A MagLev case study
IF 6.6 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-17 DOI: 10.1016/j.asoc.2026.114663
Debdoot Sain, Tae H. Lee
This study presents two new mathematical models of interval type-2 Takagi–Sugeno fuzzy proportional-integral-derivative (IT2TSFPID) controllers, incorporating triangular footprints of uncertainty (FoUs), specifically the bottom-wide triangular (BWT) and top-wide triangular (TWT) structures. Unlike previous one-dimensional (1D) input-space-based studies that predominantly employed parallelogram-shaped FoUs to model IT2TSFPID controllers, this study is the first to utilize triangular FoUs in their design. The proposed models compute the scaled incremental control effort by algebraically summing the individual scaled incremental contributions from the proportional, integral, and derivative components, each modeled within the interval type-2 (IT2) Takagi–Sugeno (TS) fuzzy framework. A distinctive feature of this work is the use of the modulus of scaled input variables in rule consequents, which reduces the number of tunable parameters and overall modeling complexity. Furthermore, unlike the existing triangular FoU-based IT2 Mamdani fuzzy proportional-integral-derivative (IT2MFPID) controllers, the absence of bias terms in the controller expressions ensures better adherence to the nonlinear PID structure. Notably, the proposed controllers are developed using the centroid defuzzifier, a method not previously applied to IT2TSFPID models. The use of 1D input spaces further allows independent tuning of IT2TSFPID gains, offering improved flexibility in controller design. In addition to modeling, this study provides a detailed analysis of the properties, including bounded-input bounded-output (BIBO) stability, of the proposed controllers, aimed at delivering a comprehensive understanding of their behavior. Finally, rigorous simulation results from a magnetic levitation (MagLev) case study are presented to demonstrate the effectiveness and practical relevance of the proposed controllers, along with extensive comparisons made against the existing work.
Applied Soft Computing, Volume 191, Article 114663.
Citations: 0
Non-exemplar class-incremental learning: Dynamic adversarial sample synthesis and relational knowledge distillation
IF 6.6 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-17 DOI: 10.1016/j.asoc.2026.114601
Yuanlong Yu, Xiaoli Ke
Deep neural networks exhibit robust performance on isolated tasks but are prone to catastrophic forgetting when learning sequential tasks. While various incremental learning paradigms have been developed to address this, they commonly rely on stored old data or generative models, which introduce significant limitations. Storing past data raises privacy and memory concerns, whereas generative models often face training instability and high computational costs. To overcome these challenges, this paper proposes a novel non-exemplar incremental learning framework, termed MRDAS, designed to effectively alleviate catastrophic forgetting without retaining any previous data. Our framework integrates two key mechanisms: (1) Dynamic Adversarial Sample Synthesis (DASS), which generates pseudo-features for old classes to balance the learning between old and new classes; and (2) Multi-layer Relational Knowledge Distillation (MRKD), which transfers inter-sample relational knowledge across multiple network layers to preserve structural information. Extensive experiments on CIFAR-100, ImageNet-Subset, and Tiny-ImageNet demonstrate the superiority of MRDAS. For example, on the 10-phase CIFAR-100 benchmark, MRDAS achieves a final average accuracy of 68.93%, outperforming the strongest baseline by 3.46 percentage points. It also reduces the forgetting rate to 8.85%, which is 7.29% lower than the best comparable method. The proposed approach thus offers a privacy-preserving, stable, and efficient solution for continual learning, providing a practical pathway toward deployable lifelong learning systems that require neither data rehearsal nor complex generative models.
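Relational knowledge distillation, as opposed to matching individual outputs, compares how samples relate to one another in the old and new feature spaces. The sketch below uses a generic distance-wise formulation (mean-normalized pairwise distances, squared error) summed over several layers. It is an assumed stand-in, not the MRKD loss from the paper, whose exact form the abstract does not give; all function names are hypothetical.

```python
import numpy as np


def pairwise_distances(feats: np.ndarray) -> np.ndarray:
    """Euclidean distance between every pair of rows in a (batch, dim) array."""
    diff = feats[:, None, :] - feats[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def relational_loss(old_feats: np.ndarray, new_feats: np.ndarray, eps: float = 1e-12) -> float:
    """Penalize changes in the (scale-normalized) pairwise distance structure."""
    d_old = pairwise_distances(old_feats)
    d_new = pairwise_distances(new_feats)
    off_diag = ~np.eye(d_old.shape[0], dtype=bool)        # ignore self-distances
    d_old = d_old / (d_old[off_diag].mean() + eps)        # normalize by the mean distance
    d_new = d_new / (d_new[off_diag].mean() + eps)
    return float(((d_old - d_new)[off_diag] ** 2).mean())


def multi_layer_relational_loss(old_layers, new_layers) -> float:
    """Sum the relational loss over corresponding layers of the old and new model."""
    return sum(relational_loss(o, n) for o, n in zip(old_layers, new_layers))


# Hypothetical features for a batch of 3 samples at a single layer.
old = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
new = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
print(relational_loss(old, old), relational_loss(old, new))
```

Normalizing by the mean distance makes the loss insensitive to a uniform rescaling of the feature space, so only changes in the relative geometry between samples are penalized.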
Applied Soft Computing, Volume 191, Article 114601.
Citations: 0
RADSCL: Representation augmentation integrated with dual supervised contrastive learning for low-resource text classification
IF 6.6 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-16 DOI: 10.1016/j.asoc.2026.114646
Jun Zhang, Ze Kuang, Fanfan Shen, Jianfeng He, Rui Sun, Yanxiang He
Text classification in low-resource regimes is a challenging task. Data augmentation techniques can significantly alleviate the shortage of training samples in such environments by generating new samples. However, existing data augmentation methods have not yet effectively addressed hard-to-classify samples and insufficient model generalization, which leaves room for improvement in low-resource text classification. To this end, this paper proposes a method that fuses representational data augmentation and dual supervised contrastive learning (RADSCL) for low-resource regimes. Representational data augmentation first uses dynamic span-cutoff to remove context-independent words, which reduces the parameters needed for mixup and lowers the computational cost. Then the meaningless property of PAD word embeddings is exploited to perform weighted mixing with the cut-off text, reducing the importance of certain words in the text that affect classification. Through the proposed representation augmentation method, high-quality hard positive samples can be generated to optimize the decision boundary of the model. On this basis, given that contrastive learning has achieved significant results in enhancing text representation, this paper constructs a dual supervised contrastive learning framework. The framework not only uses supervised contrastive learning to focus on the contrastive relationship between positive and negative samples, but also fully learns the potential semantic relationships between samples of different categories through soft labels. Label distribution contrastive learning builds on supervised contrastive learning by using the distribution information of soft labels to impose effective constraints on the text representation, which improves the performance of the text classification model. Multiple sets of experimental results show that RADSCL outperforms models that combine other data augmentation methods with supervised contrastive learning by a considerable margin on three benchmark datasets. In particular, the average accuracy of RADSCL is improved by 2.76%, 2.66%, 0.91%, and 2.75%, respectively, relative to the fusion models EDA+CE+SCL, Mixup+CE+SCL, ADV+CE+SCL, and AWD+CE+SCL.
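The weighted mixing of token representations with the PAD embedding can be sketched as a simple convex combination. This is a minimal illustration under assumptions: the abstract does not specify the mixing weights or how the cut-off span is chosen, so the function name `mix_with_pad`, the fixed weight `lam`, and the toy embeddings are all hypothetical.

```python
import numpy as np


def mix_with_pad(token_embeddings: np.ndarray, pad_embedding: np.ndarray, lam: float) -> np.ndarray:
    """Blend each token embedding with the PAD embedding.

    lam = 1.0 keeps the original tokens; smaller values pull every token toward
    the (semantically empty) PAD vector, weakening its influence on the classifier.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    # Broadcasting applies the same PAD vector to every token position.
    return lam * token_embeddings + (1.0 - lam) * pad_embedding


# Toy example: 3 tokens with 4-dimensional embeddings and an all-zero PAD vector.
tokens = np.ones((3, 4))
pad = np.zeros(4)
augmented = mix_with_pad(tokens, pad, lam=0.8)
print(augmented.shape)
```

Since the label of the mixed sample stays the same as that of the original text, samples produced this way act as harder positives for the same class rather than as interpolations between classes.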
{"title":"RADSCL: Representation augmentation integrated with dual supervised contrastive learning for low-resource text classification","authors":"Jun Zhang ,&nbsp;Ze Kuang ,&nbsp;Fanfan Shen ,&nbsp;Jianfeng He ,&nbsp;Rui Sun ,&nbsp;Yanxiang He","doi":"10.1016/j.asoc.2026.114646","DOIUrl":"10.1016/j.asoc.2026.114646","url":null,"abstract":"<div><div>Text classification in low-resource regime is a challenging task. Data augmentation techniques can significantly alleviate the issue of insufficient training samples in such environments by generating new samples. However, existing data augmentation methods have not yet effectively solved the problems of hard samples that are hard to classify and insufficient model generalization ability, which makes the performance of text classification in low-resource regime still has room for improvement. To this end, this paper proposes a method that fuses representational data augmentation and dual supervised contrast learning (RADSCL) in low-resource regime. Representational data augmentation first uses dynamic span-cutoff to remove context-independent words, which reduces the parameters needed for mixup and lowers the computational cost. Then the meaningless property of PAD word embeddings is utilized to perform weighted mixing with the cut-off text to reduce the importance of certain words in the text that affect classification. Through the representation augmentation method proposed in this paper, high-quality hard positive samples can be generated to optimize the decision boundary of the model. On this basis, in view of the current contrastive learning has achieved significant results in enhancing text representation, this paper constructs a dual supervised contrastive learning framework. 
The framework not only uses supervised contrastive learning to focus on the contrastive relationship between positive and negative samples, but also fully learns the potential semantic relationship between different categories of samples under the role of soft labels. Label distribution contrastive learning further utilizes the distribution information of soft labels based on supervised contrastive learning to impose effective constraints on the text representation, which effectively improves the performance of the text classification model. Multiple sets of experimental results show that the performance of the RADSCL model outperforms any model that incorporates other data augmentation and supervised contrastive learning models by a considerable margin on three benchmark datasets. In particular, the average accuracy of RADSCL is improved by 2.76 %, 2.66 %, 0.91 %, and 2.75 %, respectively, relative to the fusion models such as EDA+CE+SCL, Mixup+CE+SCL, ADV+CE+SCL, and AWD+CE+SCL.</div></div>","PeriodicalId":50737,"journal":{"name":"Applied Soft Computing","volume":"191 ","pages":"Article 114646"},"PeriodicalIF":6.6,"publicationDate":"2026-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146039870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Data-driven machine learning surrogate models for estimating the wind response of tall buildings
IF 6.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-16 DOI: 10.1016/j.asoc.2026.114638
Magdy Alanani, Marwa Allam, Ahmed Elshaer
As the demand for high-performance structures continues to rise, the need for performance-based wind design (PBWD) methodologies becomes increasingly apparent. PBWD delivers its greatest benefit when combined with an optimization framework. However, optimizing tall building layouts with high-fidelity simulations is computationally prohibitive. Integrating surrogate models therefore offers a promising solution, enabling computationally affordable design optimization. This paper presents a comprehensive assessment of different data-driven models for building a surrogate model of tall buildings subjected to dynamic wind loads, based on their structural system layout. The study investigates the performance of six machine learning models: Ridge Regression (RDG), Decision Trees (DT), Random Forests (RF), Extreme Gradient Boosting (XGB), Support Vector Machines (SVM), and Deep Neural Networks (DNN). A training and testing dataset was prepared using the Finite Element Method (FEM) for a case study building subjected to a wind load time history generated by an experimentally validated Computational Fluid Dynamics (CFD) model. For a valid comparison, each surrogate model is tuned with a grid search under k-fold cross-validation to obtain its optimal parameters and hyperparameters. The investigation involves a detailed analysis of statistical performance metrics on the training and testing datasets. The results show that the DNN model achieves the highest predictive accuracy, reaching an R² of 0.84 for demand-to-capacity ratio prediction under single-angle wind loading, while XGB provides competitive performance with faster inference times. The proposed surrogate framework enables rapid, data-driven estimation of global responses (peak displacement and drift) and local component demands (wall and column D/C ratios), providing a computationally efficient pathway for integrating optimization, sensitivity analysis, and early-stage PBWD decision-making.
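The tuning procedure named in this abstract, a grid search evaluated by k-fold cross-validation, can be sketched in a few lines. The example below is a minimal standard-library illustration that tunes the penalty of a one-feature ridge regression. The helper names, the contiguous fold layout, the candidate grid, and the synthetic data are all assumptions for the example and do not come from the paper.

```python
def fit_ridge_1d(xs, ys, alpha):
    """Closed-form ridge slope for one feature and no intercept."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)


def k_fold_indices(n, k):
    """Split range(n) into k contiguous validation folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def grid_search_cv(xs, ys, alphas, k=5):
    """Return the alpha with the lowest mean validation MSE, plus all scores."""
    folds = k_fold_indices(len(xs), k)
    scores = {}
    for alpha in alphas:
        fold_errors = []
        for val_idx in folds:
            held_out = set(val_idx)
            # Fit on everything except the held-out fold.
            train_x = [x for i, x in enumerate(xs) if i not in held_out]
            train_y = [y for i, y in enumerate(ys) if i not in held_out]
            slope = fit_ridge_1d(train_x, train_y, alpha)
            # Score on the held-out fold only.
            mse = sum((ys[i] - slope * xs[i]) ** 2 for i in val_idx) / len(val_idx)
            fold_errors.append(mse)
        scores[alpha] = sum(fold_errors) / len(fold_errors)
    best_alpha = min(scores, key=scores.get)
    return best_alpha, scores


# Synthetic, noise-free data with true slope 2 (illustrative only).
xs = [float(i) for i in range(1, 11)]
ys = [2.0 * x for x in xs]
best_alpha, cv_scores = grid_search_cv(xs, ys, alphas=[0.0, 1.0, 100.0], k=5)
```

On noise-free data the unpenalized model wins, as expected. With noisy data a nonzero penalty can score better, and comparing penalties on held-out folds rather than on the training data is what makes that visible.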
Citations: 0
MHADFormer: A cost-efficient multiscale hybrid transformer mixer model for automating Alzheimer’s disease diagnosis from MRI scans
IF 6.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-15 DOI: 10.1016/j.asoc.2026.114624
Francis Jesmar P. Montalbo
Accurate and early diagnosis of Alzheimer’s disease (AD) from magnetic resonance imaging (MRI) remains a critical challenge in clinical neuroscience. Deep learning and vision transformers have advanced automated classification, yet their high computational burden limits deployment in practice. Here, a hybrid architecture, Mixer Hybrid AD Transformer (MHADFormer), is introduced to reconcile accuracy with efficiency through four lightweight modules: the Cost-efficient Feature Extractor (CeFE), Enhanced Mobile Vision Transformer (EMViT), Fused Attention with Cost-efficient Separable Convolutions (FACeS), and Feature Aggregated Spectral Transformer (FAST), integrated with a tailored MLP-Mixer. MHADFormer achieved 99.69 % accuracy on the OASIS dataset and 99.96 % on ADNI, averaging 99.83 %, while requiring only 1.11 M parameters and 0.27 GFLOPs. In addition, the model achieved 81.28 % accuracy in distinguishing between progressive and stable mild cognitive impairment, surpassing state-of-the-art baselines. These results establish MHADFormer as a scalable, interpretable, and resource-aware solution, bridging the gap between deep learning performance and clinical applicability for MRI-based AD diagnosis.
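The small parameter budget reported here is typical of designs built on separable convolutions. As a rough illustration of why, the sketch below compares the weight count of a standard convolution with that of a depthwise-separable one. It assumes the usual depthwise-plus-pointwise factorization and ignores biases. The layer sizes are hypothetical and are not taken from MHADFormer.

```python
def standard_conv_params(c_in, c_out, kernel):
    """Weights in a standard k x k convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(c_in, c_out, kernel):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernel.
standard = standard_conv_params(64, 128, 3)    # 73728 weights
separable = separable_conv_params(64, 128, 3)  # 8768 weights
```

For this layer the separable form needs roughly one eighth of the weights, and the saving grows with the kernel size and the number of output channels.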
Citations: 0
Multi-level retrieval with representation alignment enhances cross-document evidence synthesis for scientific knowledge generation
IF 6.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-15 DOI: 10.1016/j.asoc.2026.114649
Shenghua Wang, Zhongqiang Wang, Chunhe Qu, Zhen Yin
Recent advancements in retrieval-augmented generation (RAG) have significantly improved performance in knowledge-intensive natural language processing tasks. However, applying RAG to scientific literature for knowledge synthesis remains challenging due to the difficulty in effectively retrieving and integrating multi-element evidence across documents and in modeling long-range dependencies. Moreover, the absence of dedicated evaluation resources for multi-element, cross-document scientific RAG further limits progress in this area. To address these challenges, this paper proposes a novel approach that integrates multi-level retrieval and contrastive learning to enhance RAG for scientific literature. Specifically, contrastive learning is employed to optimize embeddings for retrieving diverse element types, including text, tables, formulas, and figures, while a multi-level retrieval strategy combines coarse-grained document filtering with fine-grained element ranking to improve retrieval accuracy and efficiency. In addition, Chain-of-Thought (CoT) reasoning is incorporated to enhance the coherence of synthesized knowledge. To support research in this direction, we construct SciLitQA, a new dataset comprising over 50,000 high-quality Q&A pairs tailored for evaluating multi-element, cross-document knowledge synthesis. Extensive experiments on SciLitQA and public benchmarks demonstrate the effectiveness of our approach and establish SciLitQA as a valuable resource for advancing research in scientific RAG.
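The coarse-to-fine retrieval strategy described in this abstract can be sketched as two ranking passes: first keep the documents closest to the query, then rank the individual elements (text, tables, formulas, figures) inside those documents. The example below is a minimal illustration under stated assumptions. The dictionary layout, the function names, the cut-off values, and the two-dimensional toy embeddings are hypothetical, and cosine similarity stands in for whatever learned scoring the authors use.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def multi_level_retrieve(query, documents, top_docs=2, top_elements=2):
    """Coarse document filtering followed by fine-grained element ranking."""
    # Stage 1: keep only the documents whose embedding is closest to the query.
    kept = sorted(
        documents, key=lambda d: cosine(query, d["embedding"]), reverse=True
    )[:top_docs]
    # Stage 2: rank every element of the surviving documents against the query.
    scored = [
        (cosine(query, el["embedding"]), doc["id"], el["type"], el["id"])
        for doc in kept
        for el in doc["elements"]
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_elements]


# Toy corpus with hypothetical 2-D embeddings.
corpus = [
    {"id": "A", "embedding": [1.0, 0.0], "elements": [
        {"id": "a1", "type": "text", "embedding": [1.0, 0.1]},
        {"id": "a2", "type": "table", "embedding": [0.2, 1.0]},
    ]},
    {"id": "B", "embedding": [0.0, 1.0], "elements": [
        {"id": "b1", "type": "formula", "embedding": [0.0, 1.0]},
    ]},
    {"id": "C", "embedding": [0.9, 0.2], "elements": [
        {"id": "c1", "type": "figure", "embedding": [1.0, 0.0]},
    ]},
]
hits = multi_level_retrieve([1.0, 0.0], corpus)
```

Document B is discarded in the first pass, so its elements are never scored. That is where the efficiency gain of a multi-level scheme comes from, at the cost of missing relevant elements in documents that were filtered out.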
Citations: 0
A hybrid NSGA-II and delayed local search approach for home healthcare scheduling and routing optimization
IF 6.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-14 DOI: 10.1016/j.asoc.2026.114617
Radhia Zaghdoud, Issam Zidi, Olfa Ben Rhaiem, Salim El Khediri, Khaled Mesghouni
The home healthcare sector has emerged as a vital component of modern healthcare systems, addressing the needs of aging populations and individuals with disabilities. However, efficiently managing caregiver schedules and routing while balancing service time windows, skill compatibility, and operational costs remains a critical challenge. This study tackles a multi-objective Home Healthcare Scheduling and Routing Problem (HHCSRP) by optimizing three competing objectives: minimizing total travel time, ensuring equitable workload distribution among caregivers, and reducing waiting time between patient visits. We propose a novel two-stage hybrid algorithm that strategically integrates the Non-Dominated Sorting Genetic Algorithm II (NSGA-II) with a tailored local search method. In the first stage, NSGA-II generates a diverse initial population over multiple generations, while the second stage embeds local search at mid-generations to refine solutions without premature convergence. Computational experiments using adapted Solomon’s Vehicle Routing Problem with Time Windows (VRPTW) benchmarks demonstrate the effectiveness of the algorithm. Our approach achieves superior hypervolume (0.57–0.86) and Pareto front diversity (|FP| = 94.27 on average), outperforming existing methods in balancing efficiency, fairness, and scalability. The results highlight the robustness of the method in handling large-scale instances (100 patients), offering a scalable tool for real-world home healthcare logistics. This work advances multi-objective optimization in healthcare operations, providing actionable insights for administrators to harmonize patient satisfaction and operational efficiency.
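NSGA-II ranks candidate solutions by Pareto dominance rather than by a single score. For the three minimization objectives named in this abstract (travel time, workload imbalance, waiting time), that ranking can be sketched as below. This is a simple front-peeling illustration, not the faster bookkeeping used in NSGA-II itself and not the authors' hybrid algorithm. The objective values are invented for the example.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_sort(points):
    """Peel off successive Pareto fronts; returns lists of indices, best front first."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        # A point belongs to the current front if nothing left dominates it.
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical schedules scored as (travel time, workload imbalance, waiting time).
schedules = [
    (10, 5, 3),
    (12, 4, 3),
    (11, 6, 4),
    (13, 7, 5),
    (9, 8, 2),
]
fronts = non_dominated_sort(schedules)
```

The first front contains the schedules that no other schedule beats on all three objectives at once. The size of that front is the kind of diversity measure the abstract reports as |FP|.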
Citations: 0