
Latest articles in International Journal of Approximate Reasoning

Fusing fuzzy rough sets and mean shift for anomaly detection
IF 3.0 | CAS Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-01 | Epub Date: 2025-07-18 | DOI: 10.1016/j.ijar.2025.109530
Mengyao Liao , Zhiyu Chen , Can Gao , Jie Zhou , Xiaodong Yue
Outlier detection is a critical but challenging task due to the complex distribution of practical data, and some Fuzzy Rough Sets (FRS)-based methods have been presented to identify outliers from these data. However, these methods still have limitations when facing the co-existence of different types of outliers. In this study, an improved FRS-based unsupervised anomaly detection method is proposed by integrating distance and density information. Specifically, to detect the local outliers, a fuzzy granule density is first defined by introducing a Gaussian kernel similarity to characterize the local density of samples. Then, optimistic and pessimistic fuzzy granule densities are further developed to evaluate the density variation in the local neighborhood. Moreover, a distance measure based on mean shift is introduced to detect global and group outliers. Finally, an outlier detection method that integrates the density and distance measures is designed to effectively identify diverse types of outliers. Extensive experiments on synthetic and public datasets, along with statistical significance analysis, demonstrate the superior performance of the proposed method, achieving an average improvement of at least 12.27% in terms of AUROC.
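The two ingredients the abstract combines, a kernel-based local density and a mean-shift displacement, can be sketched on toy data. This is an illustrative sketch of the general idea only, not the paper's fuzzy granule density or its exact distance measure; the bandwidth value and the toy dataset are assumptions.

```python
import numpy as np

def density_and_shift(X, sigma):
    """Toy scores: a Gaussian-kernel local density (low values suggest
    local outliers) and the displacement after one mean-shift step
    (large values suggest global/group outliers)."""
    # Pairwise squared Euclidean distances
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2.0 * sigma ** 2))        # Gaussian kernel similarity
    density = K.sum(axis=1)                     # kernel-based local density
    shifted = (K @ X) / density[:, None]        # one mean-shift step per sample
    shift_dist = np.linalg.norm(shifted - X, axis=1)
    return density, shift_dist

# Five clustered points plus one far-away point (index 5)
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [1.0, 1.0], [0.5, 0.5], [10.0, 10.0]])
density, shift = density_and_shift(X, sigma=5.0)
```

On this data the isolated point has both the lowest kernel density and the largest mean-shift displacement, which is the kind of complementary evidence the paper's fused measure exploits.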
International Journal of Approximate Reasoning, Vol. 187, Article 109530.
Citations: 0
Superhedging supermartingales
IF 3.0 | CAS Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-01 | Epub Date: 2025-09-03 | DOI: 10.1016/j.ijar.2025.109567
C. Bender , S.E. Ferrando , K. Gajewski , A.L. González
Supermartingales are here defined in a non-probabilistic setting and can be interpreted solely in terms of superhedging operations. The classical expectation operator is replaced by a pair of subadditive operators: one defines a class of null sets, and the other acts as an outer integral. These operators are motivated by a financial theory of no-arbitrage pricing. Such a setting extends the classical stochastic framework by replacing the path space of the process by a trajectory set, while also providing a financial/gambling interpretation based on the notion of superhedging. The paper proves analogues of the following classical results: Doob's supermartingale decomposition and Doob's pointwise convergence theorem for non-negative supermartingales. The approach shows how linearity of the expectation operator can be circumvented and how integrability properties in the proposed setting lead to the special case of (hedging) martingales while no integrability conditions are required for the general supermartingale case.
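For orientation, the classical discrete-time statement whose non-probabilistic, superhedging-based analogue the paper proves is Doob's supermartingale decomposition; in the paper the expectation operator behind it is replaced by the pair of subadditive operators described above.

```latex
Every supermartingale $(S_t)_{t \ge 0}$ admits a unique decomposition
\[
  S_t = S_0 + M_t - A_t ,
\]
where $(M_t)_{t \ge 0}$ is a martingale with $M_0 = 0$ and
$(A_t)_{t \ge 0}$ is a non-decreasing predictable process with $A_0 = 0$.
```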
International Journal of Approximate Reasoning, Vol. 187, Article 109567.
Citations: 0
Being Bayesian about learning Bayesian networks from hybrid data
IF 3.0 | CAS Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-01 | Epub Date: 2025-08-20 | DOI: 10.1016/j.ijar.2025.109549
Marco Grzegorczyk
We develop a new Bayesian model to infer the structure of Bayesian networks from hybrid data, that is, data containing a mix of continuous (Gaussian) and discrete (categorical) variables. In line with state-of-the-art hybrid Bayesian network models, we do not allow discrete variables to have continuous parents. However, our new model differs from existing approaches by incorporating discrete variables through multivariate linear regression rather than mixture modeling. In our model, the continuous variables follow a multivariate Gaussian distribution with a shared covariance matrix, while the mean vector varies across different configurations.
As with all Bayesian network models, we use directed acyclic graphs (DAGs) to represent conditional dependency relations among the continuous variables. For our Gaussian distribution, this requires the covariance matrix to be consistent with the structure of the DAG. Our key idea is to apply multivariate linear regression, using the discrete variables as potential covariates to adjust the mean vector of the multivariate Gaussian distribution. Each continuous variable is associated with its own regression model and discrete parent set. Since the values of the discrete variables vary across observations, the mean vector becomes observation-specific.
This enables mean-adjustment of the continuous variables for their discrete parents while simultaneously inferring a Gaussian Bayesian network among them. In simulation studies, we compare our new model against two state-of-the-art hybrid Bayesian network models and demonstrate that both existing models have conceptual shortcomings, positioning our new hybrid Bayesian network model as a strong alternative.
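The mean-adjustment idea, regressing a continuous variable on dummy-coded discrete parents while keeping a shared residual variance, can be illustrated with a single-node toy example. The data-generating values, sample size, and plain least-squares fit below are assumptions for illustration, not the paper's Bayesian inference scheme.

```python
import numpy as np

# One continuous node y with a binary discrete parent d: the mean of y
# depends on the configuration of d, the residual variance is shared.
rng = np.random.default_rng(0)
n = 1000
d = rng.integers(0, 2, size=n)           # discrete (categorical) parent
mu = np.where(d == 1, 3.0, -1.0)         # configuration-specific mean
y = mu + rng.normal(0.0, 0.5, size=n)    # shared variance across configurations

# Least-squares regression of y on an intercept and the dummy for d
Z = np.column_stack([np.ones(n), d.astype(float)])
beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
# beta[0] estimates the d=0 mean; beta[0] + beta[1] the d=1 mean
```

The regression recovers one mean vector per discrete configuration, which is exactly what makes the Gaussian mean observation-specific in the model described above.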
International Journal of Approximate Reasoning, Vol. 187, Article 109549.
Citations: 0
GTransformer: Multi-view functional granulation and self-attention for tabular data modeling
IF 3.0 | CAS Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-01 | Epub Date: 2025-08-08 | DOI: 10.1016/j.ijar.2025.109547
Liang Liao , Yumin Chen , Yingyue Chen , Yiting Lin
To bridge the performance gap between deep learning models and tree ensemble methods in tabular data tasks, we propose GTransformer, a novel deep architecture that innovatively integrates granular computing and self-attention mechanisms. Our approach introduces a scalable granulation function set, from which diverse functions are randomly sampled to construct multi-view feature granules. These granules are aggregated into granule vectors, forming a multi-view functional granulation layer that provides comprehensive representations of tabular features from multiple perspectives. Subsequently, a Transformer encoder driven by granule sequences is employed to model deep interactions among features, with predictions generated via a hierarchical multilayer perceptron (MLP) classification head. Experiments on 12 datasets show that GTransformer achieves an average AUC of 92.9%, which is comparable to the 92.3% performance of LightGBM. Compared with the current mainstream deep model TabNet, the average AUC gain is 2.74%, with a 14.5% improvement on the Sonar dataset. GTransformer demonstrates strong robustness in scenarios with noise and missing data, especially on the Credit and HTRU2 datasets, where the accuracy decline is 24.73% and 17.03% less than that of MLP-Head respectively, further verifying its applicability in complex real-world application scenarios.
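The multi-view granulation step, sampling functions from a granulation set and applying each one to the raw features to get one granule view per function, can be sketched as follows. The three functions in `granulation_set` are made-up placeholders; the paper's actual scalable function set is not specified here.

```python
import numpy as np

rng = np.random.default_rng(42)

# Placeholder granulation functions (illustrative assumptions)
granulation_set = [
    lambda X: np.tanh(X),                                # squashing view
    lambda X: (X - X.mean(0)) / (X.std(0) + 1e-8),       # standardized view
    lambda X: (X > np.median(X, axis=0)).astype(float),  # binarized view
]

def granulate(X, n_views):
    """Randomly sample granulation functions and apply each to the raw
    features, yielding one granule vector (view) per function."""
    idx = rng.choice(len(granulation_set), size=n_views, replace=False)
    # Shape (n_views, n_samples, n_features): a sequence of granule
    # vectors that a Transformer encoder could consume downstream.
    return np.stack([granulation_set[i](X) for i in idx])

X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
G = granulate(X, n_views=2)
```

Each sampled view is one "perspective" on the same table; stacking them produces the granule sequence fed to the self-attention encoder.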
International Journal of Approximate Reasoning, Vol. 187, Article 109547.
Citations: 0
Explainable granular fusion: Graph-embedded rectangular neighborhood rough sets for knowledge system convergence
IF 3.0 | CAS Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-01 | Epub Date: 2025-08-28 | DOI: 10.1016/j.ijar.2025.109561
Yigao Li, Weihua Xu
With the development of Rough Set Theory (RST), many improved theories based on RST have emerged. Some of these theories have been applied in the field of feature selection, significantly improving its efficiency. However, they have not yet been widely used in multi-source information domains. This paper proposes a multi-source information fusion method based on Granular-Rectangular Neighborhood Rough Set (GRNRS) and graph theory. First, an improved algorithm based on GRNRS is proposed to evaluate the contribution of each information source to a classification task under a specific attribute. In this process, we provided rigorous theoretical proofs for the concepts and mechanisms used in the improved GRNRS. Meanwhile, the Pearson Correlation Coefficient (PCC) is used to assess the linear relationship between information sources. Then, by integrating the results of the improved GRNRS algorithm and PCC, the adjacency matrix of a graph is constructed. Finally, the preference value of each information source under a specific attribute is calculated based on the adjacency matrix. Information fusion under a specific attribute is achieved by selecting the information source with the highest preference value. Extensive experiments are conducted to analyze the impact of the algorithm's parameters on its final performance. Meanwhile, our method is compared with seven other information fusion algorithms using three metrics: classification accuracy, Average Quality (AQ), and runtime. Friedman and Nemenyi tests are conducted on the comparison results under the classification accuracy and AQ metrics, demonstrating that there are significant differences among the algorithms. The results demonstrate that the proposed algorithm is both time-efficient and effective.
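The graph-construction step, turning pairwise Pearson correlations between information sources into an adjacency matrix and then a per-source preference value, can be sketched as below. The correlation threshold and the degree-based preference score are placeholder choices; the paper combines PCC with the improved GRNRS contributions, which are omitted here.

```python
import numpy as np

def pcc_adjacency(sources, threshold):
    """Connect two sources when the absolute Pearson correlation of
    their observations reaches the threshold, then score each source
    by its degree in the resulting graph."""
    R = np.corrcoef(sources)                  # pairwise Pearson correlation
    A = (np.abs(R) >= threshold).astype(float)
    np.fill_diagonal(A, 0.0)                  # no self-loops
    preference = A.sum(axis=1)                # degree as a crude preference value
    return A, preference

# Sources 0 and 1 are perfectly linearly related; source 2 is unrelated
sources = np.vstack([
    np.arange(10.0),
    2.0 * np.arange(10.0) + 1.0,
    np.tile([1.0, -1.0], 5),
])
A, preference = pcc_adjacency(sources, threshold=0.8)
```

Fusion under an attribute then amounts to picking the source with the highest preference value, as the abstract describes.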
International Journal of Approximate Reasoning, Vol. 187, Article 109561.
Citations: 0
Self-supervised multi-level generative adversarial network data imputation algorithm
IF 3.0 | CAS Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-01 | Epub Date: 2025-08-27 | DOI: 10.1016/j.ijar.2025.109553
Yi Xu , Shujuan Fang , Xuhui Xing
Data missing has always been a challenging problem in machine learning. The Generative Adversarial Imputation Networks (GAIN) have been shown to outperform many existing solutions. However, in GAIN, because missing values lack ground truth as supervision, it is unable to construct reconstruction loss for missing values and can only judge the reasonableness of imputed values based on reconstruction loss of non-missing values and adversarial loss. From the perspective of granular computing, data has levels, and data at different levels of granularity encapsulates different knowledge. Therefore, based on granular computing, this paper proposes a self-supervised multi-level generative adversarial network data imputation algorithm (MGAIN). Firstly, multiple levels of data are constructed using nested feature set sequences. Then, GAIN is used to impute missing values at the coarsest granularity level, and the imputation results of missing values at the coarse granularity level are used as supervision for imputing missing values at the fine granularity level, constructing reconstruction loss for missing values at the fine granularity level. Finally, based on reconstruction loss of missing values, reconstruction loss of non-missing values, and adversarial loss, data at the finer granularity level is imputed. MGAIN imputes missing values level by level from the coarse granularity level to the fine granularity level to obtain more accurate imputation results. Experimental results validate the effectiveness of the proposed method.
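The coarse-to-fine supervision idea can be shown without the adversarial machinery: impute at a coarse level first, then treat those values as pseudo-ground-truth when refining at a finer level. The column-mean and row-mean imputers below are simplistic stand-ins for MGAIN's generators and reconstruction losses, used only to make the level structure concrete.

```python
import numpy as np

def coarse_impute(X, mask):
    # Coarse level: fill each missing entry with its column mean
    col_means = (X * mask).sum(axis=0) / np.maximum(mask.sum(axis=0), 1)
    return np.where(mask == 1, X, col_means)

def fine_refine(X, mask, coarse):
    # Fine level: the coarse imputation acts as the supervision target;
    # here it is simply blended with a row-wise estimate instead of being
    # matched through a generator's reconstruction loss.
    row_means = (X * mask).sum(axis=1, keepdims=True) / \
        np.maximum(mask.sum(axis=1, keepdims=True), 1)
    return np.where(mask == 1, X, 0.5 * coarse + 0.5 * row_means)

X = np.array([[1.0, 2.0], [3.0, 0.0], [5.0, 6.0]])     # 0.0 marks the missing cell
mask = np.array([[1.0, 1.0], [1.0, 0.0], [1.0, 1.0]])  # 1 = observed, 0 = missing
coarse = coarse_impute(X, mask)
fine = fine_refine(X, mask, coarse)
```

Observed entries pass through unchanged at both levels; only the missing cell is first coarsely filled and then refined against that fill.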
International Journal of Approximate Reasoning, Vol. 187, Article 109553.
Citations: 0
On the optimality of coin-betting for mean estimation
IF 3.0 | CAS Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-01 | Epub Date: 2025-09-01 | DOI: 10.1016/j.ijar.2025.109550
Eugenio Clerico
We consider the problem of testing the mean of a bounded real random variable. We introduce a notion of optimal classes for e-variables and e-processes, and establish the optimality of the coin-betting formulation among e-variable-based algorithmic frameworks for testing and estimating the (conditional) mean. As a consequence, we provide a direct and explicit characterisation of all valid e-variables and e-processes for this testing problem. In the language of classical statistical decision theory, we fully describe the set of all admissible e-variables and e-processes, and identify the corresponding minimal complete class.
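The coin-betting formulation the abstract refers to can be sketched concretely: a gambler tests H0: E[X] = m for X in [0, 1] by betting a fraction of current wealth on each observation. A fixed bet fraction is a simplification for illustration; the strategies the paper analyses adapt the fraction over time.

```python
import numpy as np

def coin_betting_wealth(xs, m, lam):
    """Wealth of a coin-better testing H0: E[X] = m with X in [0, 1].
    Betting a fixed fraction lam of current wealth on (X - m) gives
    K_t = prod_{s<=t} (1 + lam * (X_s - m)), which is non-negative and
    a martingale under H0 (an e-process); a large K_t is evidence
    against H0."""
    assert -1.0 / (1.0 - m) <= lam <= 1.0 / m   # keeps wealth non-negative
    return np.cumprod(1.0 + lam * (np.asarray(xs) - m))

# Observations far above the hypothesised mean make the wealth grow
wealth = coin_betting_wealth([0.9] * 5, m=0.5, lam=1.0)
```

The constraint on `lam` is exactly what guarantees each factor 1 + lam * (x - m) stays non-negative for every x in [0, 1].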
International Journal of Approximate Reasoning, Vol. 187, Article 109550.
Citations: 0
Three-way conflict analysis: Issue reduct based on incomplete fuzzy value information
IF 3.0 | CAS Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-01 | Epub Date: 2025-09-10 | DOI: 10.1016/j.ijar.2025.109568
Hai-Long Yang , Sheng Gao , Zhi-Lian Guo
In the three-way conflict analysis (TWCA), certain core issues lead to the emergence, development, and resolution of conflicts. Issue reduct enables us to concentrate on key issues and more accurately identify the root causes of conflicts. Existing research primarily addresses issue reduct based on complete three-valued situation tables (TSTs), which have certain limitations. This paper discusses the issue reduct in TWCA based on incomplete fuzzy-valued situation tables (IFSTs). First, to deal with incomplete information, we introduce the Social Trust Network (STN) and the K-Nearest Neighbor (KNN) method, employing an iterative weighting method to fill in missing values. Second, by utilizing the matrix representation of relations among agents, we transform the relation matrix into constraint conditions and propose a recursive backtracking algorithm with pruning strategies to calculate conflict, neutrality, alliance, and global reducts. Finally, we use the development plan of the Gansu Provincial Government as a case study to illustrate the model's applicability and advantages through parameter and comparative analysis.
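The KNN part of the missing-value step can be sketched in isolation: each missing entry is filled from the rows most similar to the incomplete row on its observed columns. This omits the social trust network and the iterative weighting the paper combines with KNN; the distance and the toy rating table are illustrative assumptions.

```python
import numpy as np

def knn_fill(X, k):
    """Fill each NaN entry with the mean of that column over the k rows
    closest to the incomplete row on its observed columns."""
    filled = X.copy()
    for i, j in zip(*np.where(np.isnan(X))):
        obs = ~np.isnan(X[i])                 # columns observed for row i
        cand = [r for r in range(len(X))
                if r != i and not np.isnan(X[r, j])]
        dists = [np.nanmean((X[r, obs] - X[i, obs]) ** 2) for r in cand]
        nearest = np.argsort(dists)[:k]       # k most similar rows
        filled[i, j] = np.mean([X[cand[t], j] for t in nearest])
    return filled

X = np.array([[1.0, 2.0, 3.0],
              [1.0, 2.0, np.nan],   # agent 1's missing rating on issue 2
              [10.0, 10.0, 10.0],
              [1.1, 2.1, 3.1]])
F = knn_fill(X, k=2)
```

The missing rating is filled from the two nearby agents (rows 0 and 3), not from the dissimilar row 2, which is the behaviour the trust-weighted variant refines further.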
International Journal of Approximate Reasoning, Vol. 187, Article 109568.
Citations: 0
Relative pre-reducts for computing the relative reducts of large data sets
IF 3.0 | CAS Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-01 | Epub Date: 2025-08-07 | DOI: 10.1016/j.ijar.2025.109544
Hajime Okawa , Yasuo Kudo , Tetsuya Murai
In this paper, we introduce the concept of relative pre-reducts to derive the relative reducts from a large dataset. The relative reduct is considered a consistency-based attribute reduction method that is commonly utilized to extract concise subsets of condition attributes. Nonetheless, calculating all relative reducts necessitates substantial time and memory to build a discernibility matrix. In this research, we demonstrate that all relative pre-reducts can be computed using a simplified matrix referred to as the partial discernibility matrix, which can be readily converted into relative reducts. We also suggest employing a data partitioning approach to generate the discernibility matrix. This method alleviates the issue of an increased number of results for each partition. The outcomes from this technique yield the relative pre-reducts proposed in this study. Since our enhancements to the computation of relative reducts are independent of other advancements, they can be implemented in conjunction with existing methods. Experimental findings indicate that utilizing relative pre-reducts for computing relative reducts is efficient for large datasets.
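The discernibility matrix that this line of work builds on can be sketched as follows. This is the textbook full-matrix construction plus a brute-force minimal hitting set, practical only for toy tables; the paper's contribution is precisely avoiding the full matrix via relative pre-reducts and a partial discernibility matrix. Function names and the brute-force search are illustrative, not the authors' algorithm.

```python
from itertools import combinations

def discernibility_matrix(table, decision):
    """For every pair of objects with different decision values, record
    the set of condition attributes on which the two objects differ."""
    n = len(table)
    entries = []
    for i, j in combinations(range(n), 2):
        if decision[i] != decision[j]:
            diff = {a for a in range(len(table[i])) if table[i][a] != table[j][a]}
            if diff:
                entries.append(diff)
    return entries

def relative_reduct(table, decision):
    """A relative reduct is a minimal attribute set that intersects every
    matrix entry; here we simply try subsets in increasing size."""
    entries = discernibility_matrix(table, decision)
    m = len(table[0])
    for size in range(1, m + 1):
        for cand in combinations(range(m), size):
            s = set(cand)
            if all(s & e for e in entries):
                return s
    return set(range(m))
```

For the table `[(0,0,1), (0,1,1), (1,0,0), (1,1,0)]` with decisions `[0,0,1,1]`, every cross-decision pair is already distinguished by attribute 0, so `{0}` is a relative reduct. The quadratic number of entries in this construction is exactly the cost the partial discernibility matrix and data partitioning are designed to avoid.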
International Journal of Approximate Reasoning, Volume 187, Article 109544 (2025).
Citations: 0
Special issue on the Twelfth International Conference on Probabilistic Graphical Models (PGM 2024)
IF 3.0 | CAS Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-01 | Epub Date: 2025-09-05 | DOI: 10.1016/j.ijar.2025.109571
Silja Renooij, Johan Kwisthout, Janneke H. Bolt
International Journal of Approximate Reasoning, Volume 187, Article 109571 (2025).
Citations: 0