
International Journal of Approximate Reasoning: Latest Articles

Cauchy-Schwarz bounded trade-off weighting for causal inference with small sample sizes
IF 3.2; Zone 3 (Computer Science); Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-10-29; DOI: 10.1016/j.ijar.2024.109311
Qin Ma, Shikui Tu, Lei Xu
Causal inference on small-sample data is difficult because estimators can be inefficient: their variance may be large. Some existing weighting methods adopt a bias-variance trade-off, but they require the trade-off parameters to be specified manually. To overcome this drawback, we propose a Cauchy-Schwarz Bounded Trade-off Weighting (CBTW) method, in which the trade-off parameter is theoretically derived to guarantee a small Mean Square Error (MSE) in estimation. We prove that optimizing the objective function of CBTW, which is the Cauchy-Schwarz upper bound of the MSE for causal effect estimators, contributes to minimizing the MSE. Moreover, since the upper bound consists of the variance and the squared ℓ2-norm of covariate differences, CBTW not only estimates causal effects efficiently but also keeps the covariates balanced. Experimental results on both simulated and real-world data show that CBTW outperforms most existing methods, especially in small-sample scenarios.
International Journal of Approximate Reasoning, Volume 176, Article 109311
Citations: 0
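The abstract describes an objective that combines a variance term with the squared ℓ2-norm of covariate differences. The following is a minimal sketch of that general idea only, not the CBTW method itself: it assumes a fixed, hand-picked trade-off `lam` (the paper derives its own parameter, which is not reproduced here), uses `lam * ||w||^2` as a stand-in for the variance term, and reweights control units toward the treated covariate mean. The function name `balancing_weights` and all data are hypothetical.

```python
import numpy as np

def balancing_weights(X_control, target_mean, lam):
    """Minimize lam*||w||^2 + ||X_control.T @ w - target_mean||^2 subject to sum(w) = 1.

    lam is a hypothetical, manually chosen trade-off; weights may be negative
    because no non-negativity constraint is imposed in this sketch.
    """
    n = X_control.shape[0]
    # Quadratic part of the objective: w^T (lam*I + X X^T) w - 2 (X m)^T w
    A = lam * np.eye(n) + X_control @ X_control.T
    b = X_control @ target_mean
    # KKT system for the equality constraint 1^T w = 1
    kkt = np.zeros((n + 1, n + 1))
    kkt[:n, :n] = 2 * A
    kkt[:n, n] = 1.0
    kkt[n, :n] = 1.0
    rhs = np.concatenate([2 * b, [1.0]])
    return np.linalg.solve(kkt, rhs)[:n]

rng = np.random.default_rng(0)
X_control = rng.normal(size=(8, 2))            # synthetic control covariates
X_treated = rng.normal(loc=0.5, size=(5, 2))   # synthetic treated covariates
target = X_treated.mean(axis=0)

def imbalance(v):
    # Squared l2-norm of the covariate mean difference under weights v
    return float(np.sum((X_control.T @ v - target) ** 2))

w = balancing_weights(X_control, target, lam=0.1)
uniform = np.full(8, 1 / 8)
print(w.sum(), imbalance(w), imbalance(uniform))
```

Because uniform weights minimize `||w||^2` under the sum-to-one constraint, the optimized weights can never have a larger covariate imbalance than uniform weights in this formulation.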
A 4-valued logic for double Stone algebras
IF 3.2; Zone 3 (Computer Science); Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-10-24; DOI: 10.1016/j.ijar.2024.109309
Arun Kumar, Neha Gaur, Bisham Dewan
This paper investigates the logical structure of the 4-element chain considered as a double Stone algebra. It has been shown that any element of a double Stone algebra can be identified with a monotone ordered triple of sets. As a consequence, we obtain the 4-valued semantics for the logic L_D of double Stone algebras. Furthermore, a rough set semantics for the logic L_D is provided by dividing the boundary region (uncertainty) into two disjoint subregions.
International Journal of Approximate Reasoning, Volume 176, Article 109309
Citations: 0
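To make the central object concrete, the sketch below builds the 4-element chain, computes its pseudocomplement and dual pseudocomplement from their standard lattice-theoretic definitions, and checks the Stone identity and its dual. The encoding of the chain as the integers 0 to 3 is an assumption for illustration; the logic L_D and its semantics from the paper are not reproduced.

```python
CHAIN = [0, 1, 2, 3]   # 0 < 1 < 2 < 3; integer encoding is illustrative
BOT, TOP = 0, 3
meet, join = min, max  # on a chain, meet and join are min and max

def pseudocomplement(x):
    # Largest y such that x meet y equals the bottom element.
    return max(y for y in CHAIN if meet(x, y) == BOT)

def dual_pseudocomplement(x):
    # Smallest y such that x join y equals the top element.
    return min(y for y in CHAIN if join(x, y) == TOP)

# Stone identity: x* join x** = top, for every x
stone = all(
    join(pseudocomplement(x), pseudocomplement(pseudocomplement(x))) == TOP
    for x in CHAIN
)
# Dual Stone identity: x+ meet x++ = bottom, for every x
dual_stone = all(
    meet(dual_pseudocomplement(x), dual_pseudocomplement(dual_pseudocomplement(x))) == BOT
    for x in CHAIN
)

print([pseudocomplement(x) for x in CHAIN])
print([dual_pseudocomplement(x) for x in CHAIN])
print(stone, dual_stone)
```

The two intermediate elements share the same pseudocomplement and the same dual pseudocomplement, so these two operations alone do not tell them apart.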
Convex expectations for countable-state uncertain processes with càdlàg sample paths
IF 3.2; Zone 3 (Computer Science); Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-10-18; DOI: 10.1016/j.ijar.2024.109308
Alexander Erreygers
This work investigates convex expectations, mainly in the setting of uncertain processes with countable state space. In the general setting it shows how, under the assumption of downward continuity, a convex expectation on a linear lattice of bounded functions can be extended to a convex expectation on the measurable extended real functions. This result is especially relevant in the setting of uncertain processes: there, an easy way to obtain a convex expectation on the linear lattice of finitary bounded functions is to combine an initial convex expectation with a convex transition semigroup. Crucially, this work presents a sufficient condition on this semigroup which guarantees that the induced convex expectation is downward continuous, so that it can be extended to the set of measurable extended real functions. To conclude, this work looks at existing results on convex transition semigroups from the point of view of the aforementioned sufficient condition, in particular to construct a sublinear Poisson process.
International Journal of Approximate Reasoning, Volume 175, Article 109308
Citations: 0
Approximate inference on optimized quantum Bayesian networks
IF 3.2; Zone 3 (Computer Science); Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-10-17; DOI: 10.1016/j.ijar.2024.109307
Walid Fathallah, Nahla Ben Amor, Philippe Leray
In recent years, interest in quantum machine learning has surged, with researchers actively developing methods that leverage quantum technology to solve highly complex problems across various domains. However, implementing gate-based quantum algorithms on noisy intermediate-scale quantum (NISQ) devices presents notable challenges due to limited quantum resources and inherent noise. In this paper, we propose an innovative approach for representing Bayesian networks on quantum circuits, specifically designed to address these challenges and to highlight the potential of combining optimized circuits with quantum hybrid algorithms for Bayesian network inference. Our aim is to minimize the quantum resources needed to implement a Quantum Bayesian network (QBN) and to implement a quantum approximate inference algorithm on a quantum computer. Through simulations and experiments on IBM Quantum computers, we show that our circuit representation significantly reduces the resource requirements without decreasing the performance of the model. These findings underscore how our approach can better enable practical applications of QBNs on currently available quantum hardware.
International Journal of Approximate Reasoning, Volume 175, Article 109307
Citations: 0
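As background for how a Bayesian network can be placed on a circuit at all, the sketch below simulates a common textbook-style encoding with plain NumPy: each binary variable becomes a qubit, a root probability is loaded with an RY rotation, and a conditional probability table becomes a rotation controlled on the parent. This is an assumption-laden illustration, not the optimized representation proposed in the paper; the two-node network A → B and its probabilities are made up.

```python
import numpy as np

def ry(theta):
    """Matrix of the single-qubit RY(theta) rotation."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def angle(p_one):
    """Angle such that RY(angle)|0> is measured as 1 with probability p_one."""
    return 2 * np.arcsin(np.sqrt(p_one))

# Hypothetical network A -> B with binary variables
p_a = 0.3                        # P(A = 1)
p_b_given_a = {0: 0.2, 1: 0.9}   # P(B = 1 | A = a)

zero = np.array([1.0, 0.0])
qa = ry(angle(p_a)) @ zero       # amplitudes of qubit A

# Controlled rotation on B: joint amplitudes indexed as state[a, b]
state = np.zeros((2, 2))
for a in (0, 1):
    state[a] = qa[a] * (ry(angle(p_b_given_a[a])) @ zero)

joint = state ** 2               # measurement probabilities P(A = a, B = b)
p_b1 = joint[:, 1].sum()         # marginal P(B = 1)
print(joint, p_b1)
```

Measuring both qubits samples from the joint distribution, so the marginal of B matches the classical sum over A of P(A) P(B | A).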
A dissection of the monotonicity property of binary operations from a dominance point of view
IF 3.2; Zone 3 (Computer Science); Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-10-11; DOI: 10.1016/j.ijar.2024.109304
Yuntian Wang, Lemnaouar Zedam, Bao Qing Hu, Bernard De Baets
In this paper, we expound weaker forms of increasingness of binary operations on a lattice, obtained by reducing the number of variables involved in the classical formulation of the increasingness property as seen from the viewpoint of dominance between binary operations. We investigate the relationships among these weaker forms. Furthermore, we demonstrate their role in characterizing the meet and join operations of a lattice and, in particular, of a chain. Finally, we provide ample generic examples.
International Journal of Approximate Reasoning, Volume 175, Article 109304
Citations: 0
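The link between increasingness and dominance can be checked by brute force on a small chain. The sketch below uses the usual four-variable dominance inequality with the meet (min) as the dominating operation and compares it against a direct monotonicity test; on a chain the two conditions coincide. The chain {0, 1, 2, 3} and the two sample operations are assumptions for illustration, and the weaker forms studied in the paper are not reproduced.

```python
from itertools import product

CHAIN = range(4)  # the chain 0 < 1 < 2 < 3, chosen for illustration

def meet_dominates(op):
    """True if min(op(x, y), op(u, v)) >= op(min(x, u), min(y, v)) for all x, y, u, v."""
    return all(
        min(op(x, y), op(u, v)) >= op(min(x, u), min(y, v))
        for x, y, u, v in product(CHAIN, repeat=4)
    )

def is_increasing(op):
    """True if op(x, y) <= op(u, v) whenever x <= u and y <= v."""
    return all(
        op(x, y) <= op(u, v)
        for x, y, u, v in product(CHAIN, repeat=4)
        if x <= u and y <= v
    )

def bounded_sum(x, y):
    return min(x + y, 3)   # increasing in both arguments

def abs_diff(x, y):
    return abs(x - y)      # not increasing

for op in (bounded_sum, abs_diff):
    print(op.__name__, is_increasing(op), meet_dominates(op))
```

The dominance inequality quantifies over four variables; fixing or identifying some of them is one way to obtain conditions weaker than full increasingness.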
Selected papers from the First International Joint Conference on Conceptual Knowledge Structures
IF 3.2; Zone 3 (Computer Science); Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-10-09; DOI: 10.1016/j.ijar.2024.109303
Inma P. Cabrera, Sébastien Ferré, Sergei Obiedkov
International Journal of Approximate Reasoning, Volume 175, Article 109303
Citations: 0
Iterative algorithms for solving one-sided partially observable stochastic shortest path games
IF 3.2; Zone 3 (Computer Science); Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-10-01; DOI: 10.1016/j.ijar.2024.109297
Petr Tomášek, Karel Horák, Branislav Bošanský
Real-world scenarios often involve dynamic interactions among competing agents, where decisions are made considering actions taken by others. These situations can be modeled as partially observable stochastic games (POSGs), with zero-sum variants capturing strictly competitive interactions (e.g., security scenarios). While such models address a broad range of problems, they commonly focus on infinite-horizon scenarios with discounted-sum objectives. Using the discounted-sum objective, however, can lead to suboptimal solutions in cases where the length of the interaction does not directly affect the gained rewards of the players.
We thus focus on games with an undiscounted objective and an indefinite horizon, where every realization of the game is guaranteed to terminate after some unspecified number of turns. To manage the computational complexity of solving POSGs in general, we restrict ourselves to games with one-sided partial observability, where only one player has imperfect information while their opponent is provided with full information about the current situation. We introduce two novel algorithms based on the heuristic search value iteration (HSVI) algorithm that iteratively solve sequences of easier-to-solve approximations of the game, using fundamentally different approaches for constructing the sequences: (1) in GoalHorizon, the game approximations are based on a limited number of turns in which players can change their actions; (2) in GoalDiscount, the game approximations are constructed using an increasing discount factor. We provide theoretical qualitative guarantees for both algorithms, and we also experimentally demonstrate that they are able to find near-optimal solutions on pursuit-evasion games and on a game modeling a privilege escalation problem from computer security.
International Journal of Approximate Reasoning, Volume 175, Article 109297
Citations: 0
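The idea of approaching an undiscounted problem through a sequence of approximations with an increasing discount factor can be illustrated on a far simpler model. The sketch below is not GoalDiscount and does not use HSVI: it assumes a single-agent, fully observable shortest-path chain with two made-up actions, and solves it by value iteration for a growing sequence of discount factors, warm-starting each solve from the previous solution. All names and numbers are hypothetical.

```python
import numpy as np

N_STATES = 4   # states 0..2 are non-terminal, state 3 is the goal
# Hypothetical actions: (step cost, probability of advancing one state)
ACTIONS = [(1.0, 0.5), (1.5, 1.0)]

def solve_discounted(gamma, v_init, tol=1e-10):
    """Value iteration for the minimal expected discounted cost-to-goal."""
    v = v_init.copy()
    while True:
        new = v.copy()
        for s in range(N_STATES - 1):
            new[s] = min(
                cost + gamma * (p * v[s + 1] + (1 - p) * v[s])
                for cost, p in ACTIONS
            )
        if np.max(np.abs(new - v)) < tol:
            return new
        v = new

v = np.zeros(N_STATES)
for gamma in (0.5, 0.9, 0.99, 1.0):
    v = solve_discounted(gamma, v)   # warm start from the previous approximation
    print(gamma, v)
```

With a discount factor of 1 the iteration still converges here because every policy reaches the goal with probability one; in this toy chain the deterministic action is optimal, giving a cost of 1.5 per remaining step.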
Uncertainty-based knowledge distillation for Bayesian deep neural network compression
IF 3.2; Zone 3 (Computer Science); Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-10-01; DOI: 10.1016/j.ijar.2024.109301
Mina Hemmatian, Ali Shahzadi, Saeed Mozaffari
Deep learning models have been widely employed across various fields. In real-world scenarios, especially safety-critical applications, quantifying uncertainty is as crucial as achieving high accuracy. To address this concern, Bayesian deep neural networks (BDNNs) emerged to estimate two different types of uncertainty: aleatoric and epistemic. Nevertheless, implementing a BDNN on resource-constrained devices poses challenges due to the substantial computational and storage costs imposed by approximate inference techniques. Thus, efficient compression methods should be utilized. We propose an uncertainty-based knowledge distillation method to compress BDNNs. Knowledge distillation is a model compression technique that transfers knowledge from a complex network, known as the teacher network, to a simpler one, referred to as the student network. Our method incorporates uncertainty into knowledge distillation to address situations where inappropriate teacher supervision undermines compression performance. We utilize the epistemic uncertainty of the teacher's predictions to tailor supervision for each sample individually, taking into account the teacher's limited knowledge. Additionally, we adjust the temperature parameter of the distillation process for each sample based on the aleatoric uncertainty of the teacher's predictions, ensuring that the student receives appropriate supervision even in the presence of ambiguous data. As a result, the proposed method enables the Bayesian student network to be trained under both appropriate supervision from the Bayesian teacher network and ground-truth labels. We evaluated our method on the CIFAR-10, CIFAR-100, and RAF-DB datasets, demonstrating notable improvements in accuracy over state-of-the-art knowledge distillation-based methods. Furthermore, the robustness of our approach was assessed by testing weakly trained teacher networks and by analyzing blurred and low-resolution data, which have high uncertainty. Experimental results show that the proposed method outperformed existing methods.
International Journal of Approximate Reasoning, Volume 175, Article 109301
Citations: 0
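The two per-sample adjustments described in the abstract (a temperature driven by aleatoric uncertainty and a supervision weight driven by epistemic uncertainty) can be sketched as a distillation loss. The mappings `1 + aleatoric` and `1 / (1 + epistemic)` below are hypothetical placeholders, not the formulas from the paper, and the ground-truth cross-entropy term that would normally be added is omitted for brevity.

```python
import numpy as np

def softmax(z, t):
    """Row-wise softmax of logits z with per-row temperature t of shape (n, 1)."""
    z = z / t
    z = z - z.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def uncertainty_weighted_kd(student_logits, teacher_logits, aleatoric, epistemic):
    # Hypothetical mappings: higher aleatoric uncertainty gives softer targets,
    # higher epistemic uncertainty gives less trust in the teacher.
    temperature = (1.0 + aleatoric)[:, None]   # shape (n, 1)
    weight = 1.0 / (1.0 + epistemic)           # shape (n,)
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = np.sum(p * (np.log(p) - np.log(q)), axis=1)   # KL(teacher || student) per sample
    # The usual T^2 factor keeps gradient magnitudes comparable across temperatures.
    return float(np.mean(weight * temperature[:, 0] ** 2 * kl))

teacher = np.array([[2.0, 0.5, -1.0], [0.1, 0.0, -0.1]])
student = np.zeros_like(teacher)
aleatoric = np.array([0.1, 1.5])   # made-up per-sample uncertainties
epistemic = np.array([0.2, 2.0])
loss = uncertainty_weighted_kd(student, teacher, aleatoric, epistemic)
print(loss)
```

In a Bayesian setting the two uncertainty vectors would typically be estimated from multiple stochastic forward passes of the teacher; here they are simply given as inputs.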
Distributed fusion-based algorithms for learning high-dimensional Bayesian Networks: Testing ring and star topologies
IF 3.2; Zone 3 (Computer Science); Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-09-27; DOI: 10.1016/j.ijar.2024.109302
Jorge D. Laborda, Pablo Torrijos, José M. Puerta, José A. Gámez
Learning Bayesian Networks (BNs) from high-dimensional data is a complex and time-consuming task. Although the literature contains approaches based on horizontal (instances) or vertical (variables) partitioning, none can guarantee the same theoretical properties as the Greedy Equivalence Search (GES) algorithm, except those based on the GES algorithm itself. This paper proposes a parallel distributed framework that uses GES as its local learning algorithm, obtaining results similar to those of GES and guaranteeing its theoretical properties while requiring less execution time. The framework splits the set of all possible edges into clusters and constrains each framework node to work only with the subset of edges it receives. The global learning process is an iterative algorithm that carries out rounds until a convergence criterion is met. We have designed a ring and a star topology to distribute node connections. Regardless of the topology, each node receives a BN as input; it then fuses it with its own BN model and uses the result as the starting point for a local learning process, limited to its own subset of edges. Once finished, the result is sent to another node as input. Experiments were carried out on a large collection of domains, including large BNs with more than 1000 variables. Our results demonstrate our proposal's effectiveness compared to GES and its fast version (fGES), generating high-quality BNs in less execution time.
从高维数据中学习贝叶斯网络(BN)是一项复杂而耗时的任务。虽然文献中有基于水平(实例)或垂直(变量)划分的方法,但除了基于 GES 算法本身的方法外,没有一种方法能保证与贪婪等价搜索(GES)算法相同的理论属性。本文提出了一种并行分布式框架,该框架使用 GES 作为本地学习算法,可获得与 GES 类似的结果,并保证其理论属性,但所需执行时间较少。该框架包括将所有可能的边集分割成群,并限制每个框架节点只能处理接收到的边子集。全局学习过程是一种迭代算法,在达到收敛标准之前会进行一轮又一轮的学习。我们设计了环形和星形拓扑结构来分配节点连接。无论采用哪种拓扑结构,每个节点都会接收一个 BN 作为输入;然后将其与自己的 BN 模型融合,并将结果作为局部学习过程的起点,但仅限于自己的边子集。一旦完成,结果就会作为输入发送到另一个节点。我们在大量领域进行了实验,包括多达 1000 多个变量的大型 BN。结果表明,与 GES 及其快速版本(fGES)相比,我们的建议非常有效,能在更短的执行时间内生成高质量的 BN。
International Journal of Approximate Reasoning, vol. 175, Article 109302. Published 2024-09-27.
Citations: 0
Belief rule learning and reasoning for classification based on fuzzy belief decision tree
IF 3.2 · Zone 3, Computer Science · Q2 Computer Science, Artificial Intelligence · Pub Date: 2024-09-26 · DOI: 10.1016/j.ijar.2024.109300
Lianmeng Jiao, Han Zhang, Xiaojiao Geng, Quan Pan
Belief rules, which extend the classical fuzzy IF-THEN rules with belief consequent parts, have been widely used for classifier design because they can build linguistic models interpretable to users and address various types of uncertainty. However, in the rule learning process, a large number of features generally results in a large belief rule base, which degrades both classification accuracy and model interpretability. Motivated by this challenge, this paper introduces the decision tree building technique, which performs feature selection and model construction jointly, to learn a compact and accurate belief rule base. To this end, a new fuzzy belief decision tree (FBDT) with fuzzy feature partitions and belief leaf nodes is designed: a fuzzy information gain ratio is first defined as the feature selection criterion for fuzzy node splitting, and belief distributions are then introduced at the leaf nodes to characterize the class uncertainty. Based on the initial rules extracted from the constructed FBDT, a joint optimization objective considering both classification accuracy and model interpretability is then designed to further reduce rule redundancy. Experimental results on real datasets show that the proposed FBDT-based classification method yields a much smaller rule base and better interpretability than other rule-based methods while maintaining competitive accuracy.
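To make the feature selection criterion concrete, the following sketch computes a fuzzy information gain ratio in one common textbook form, where crisp counts in C4.5's gain ratio are replaced by sums of membership degrees. The abstract does not give the paper's exact definition, so this formulation, the function names, and the assumption that each sample's memberships sum to 1 are illustrative only.

```python
import math


def entropy(weights):
    """Shannon entropy (base 2) of a list of non-negative weights."""
    total = sum(weights)
    return -sum(w / total * math.log2(w / total) for w in weights if w > 0)


def fuzzy_gain_ratio(memberships, labels):
    """memberships[i][j]: degree of sample i in fuzzy partition j; labels[i]: its class."""
    classes = sorted(set(labels))
    n_parts = len(memberships[0])
    # Entropy of the class distribution before splitting.
    parent = entropy([labels.count(c) for c in classes])
    # Fuzzy cardinality of each partition (sum of membership degrees).
    card = [sum(m[j] for m in memberships) for j in range(n_parts)]
    total = sum(card)
    # Cardinality-weighted class entropy inside each fuzzy partition.
    cond = 0.0
    for j in range(n_parts):
        if card[j] == 0:
            continue
        per_class = [
            sum(m[j] for m, y in zip(memberships, labels) if y == c)
            for c in classes
        ]
        cond += card[j] / total * entropy(per_class)
    gain = parent - cond
    split_info = entropy(card)  # penalizes splits into many small partitions
    return gain / split_info if split_info > 0 else 0.0
```

A feature whose fuzzy partitions separate the classes perfectly scores 1 under this formulation, and a feature whose memberships carry no class information scores 0; a tree builder would split on the feature with the highest ratio.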
International Journal of Approximate Reasoning, vol. 175, Article 109300. Published 2024-09-26.
Citations: 0