Adaptive fuzzy neighborhood decision tree

IF 7.2 | Zone 1 (Computer Science) | JCR Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE) | Applied Soft Computing | Pub Date: 2024-11-05 | DOI: 10.1016/j.asoc.2024.112435
Xinyu Cui, Changzhong Wang, Shuang An, Yuhua Qian
Applied Soft Computing, vol. 167, Article 112435
https://www.sciencedirect.com/science/article/pii/S1568494624012092
Citations: 0

Abstract

Decision tree algorithms have gained widespread acceptance in machine learning, with the central challenge lying in devising an optimal splitting strategy for node sample subspaces. In the context of continuous data, conventional approaches typically involve fuzzifying data or adopting a dichotomous scheme akin to the CART tree. Nevertheless, fuzzifying continuous features often entails information loss, whereas the dichotomous approach can generate an excessive number of classification rules, potentially leading to overfitting. To address these limitations, this study introduces an adaptive growth decision tree framework, termed the fuzzy neighborhood decision tree (FNDT). Initially, we establish a fuzzy neighborhood decision model by leveraging the concept of fuzzy inclusion degree. Furthermore, we delve into the topological structure of misclassified samples under the proposed decision model, providing a theoretical foundation for the construction of FNDT. Subsequently, we utilize conditional information entropy to sift through original features, prioritizing those that offer the maximum information gain for decision tree nodes. By leveraging the conditional decision partitions derived from the fuzzy neighborhood decision model, we achieve an adaptive splitting method for optimal features, culminating in an adaptive growth decision tree algorithm that relies solely on the inherent structure of real-valued data. Experimental evaluations reveal that, compared with advanced decision tree algorithms, FNDT exhibits a simple tree structure, stronger generalization capabilities, and superior performance in classifying continuous data.
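The abstract names three ingredients (a fuzzy neighborhood, a fuzzy inclusion degree, and conditional information entropy for feature selection) but gives no formulas. The sketch below illustrates how such pieces could fit together, using common textbook definitions rather than the authors' own. Everything specific here is an assumption: the similarity `1 - |x_i - x_j|` on features scaled to [0, 1], the `radius` cutoff, the inclusion degree as a ratio of summed memberships, and the hypothetical function names. It is not the FNDT algorithm from the paper.

```python
import math
from collections import Counter


def fuzzy_neighborhood(values, i, radius):
    """Membership of every sample in the fuzzy neighborhood of sample i.

    Assumed form: similarity 1 - |x_i - x_j| when it is at least
    1 - radius, otherwise 0 (feature values scaled to [0, 1]).
    """
    row = []
    for v in values:
        sim = 1.0 - abs(values[i] - v)
        row.append(sim if sim >= 1.0 - radius else 0.0)
    return row


def inclusion_degree(membership, labels, target):
    """Assumed inclusion degree of a fuzzy set in the crisp class `target`:
    summed membership inside the class divided by total membership."""
    total = sum(membership)
    inside = sum(m for m, y in zip(membership, labels) if y == target)
    return inside / total if total > 0 else 0.0


def neighborhood_decision(values, labels, radius):
    """Give each sample the class in which its neighborhood is most included."""
    classes = sorted(set(labels))
    decisions = []
    for i in range(len(values)):
        mem = fuzzy_neighborhood(values, i, radius)
        decisions.append(
            max(classes, key=lambda c: inclusion_degree(mem, labels, c))
        )
    return decisions


def conditional_entropy(blocks, labels):
    """H(labels | blocks) in bits, for a crisp partition given as block ids."""
    n = len(labels)
    groups = {}
    for b, y in zip(blocks, labels):
        groups.setdefault(b, []).append(y)
    h = 0.0
    for ys in groups.values():
        for count in Counter(ys).values():
            p = count / len(ys)
            h -= (len(ys) / n) * p * math.log2(p)
    return h


def best_feature(columns, labels, radius):
    """Rank features by the label uncertainty left after their decision partition."""
    scores = {
        name: conditional_entropy(neighborhood_decision(col, labels, radius), labels)
        for name, col in columns.items()
    }
    return min(scores, key=scores.get), scores


# Toy data (made up for illustration): f1 separates the classes, f2 does not.
columns = {
    "f1": [0.05, 0.10, 0.15, 0.80, 0.85, 0.90],
    "f2": [0.10, 0.90, 0.20, 0.80, 0.15, 0.85],
}
labels = ["a", "a", "a", "b", "b", "b"]
name, scores = best_feature(columns, labels, radius=0.2)
print(name, scores)
```

On this toy data the partition induced by `f1` matches the labels exactly, so its conditional entropy is zero and it is selected, while `f2` mixes the classes in each block. The number of child nodes comes from the partition itself (one per decided class) rather than from a fixed binary threshold, which is the general idea the abstract describes as adaptive splitting.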
Source journal
Applied Soft Computing
Category: Engineering & Technology, Computer Science: Interdisciplinary Applications
CiteScore: 15.80
Self-citation rate: 6.90%
Articles published: 874
Review time: 10.9 months
About the journal: Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. The focus is to publish the highest quality research in the application and convergence of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets and other similar techniques to address real-world complexities. Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. Therefore, the web site will continuously be updated with new articles and the publication time will be short.
Latest articles from this journal
A multi-strategy fruit fly optimization algorithm for the distributed permutation flowshop scheduling problem with sequence-dependent setup times
A sparse diverse-branch large kernel convolutional neural network for human activity recognition using wearables
A reinforcement learning hyper-heuristic algorithm for the distributed flowshops scheduling problem under consideration of emergency order insertion
Differential evolution with multi-strategies for UAV trajectory planning and point cloud registration
Shapelet selection for time series classification