Rough sets and data analysis
Zdzislaw Pawlak
Soft Computing in Intelligent Systems and Information Processing. Proceedings of the 1996 Asian Fuzzy Systems Symposium
Published: 1996-12-11
DOI: 10.1109/AFSS.1996.583540 (https://doi.org/10.1109/AFSS.1996.583540)
Citations: 48

Abstract

In this talk we present the basic concepts of rough set theory, a new approach to data analysis. The theory has attracted the attention of many researchers and practitioners all over the world, who have contributed substantially to its development and applications. Rough set theory overlaps with many other theories, especially fuzzy set theory, evidence theory, Boolean reasoning methods, and discriminant analysis; nevertheless, it can be viewed in its own right as an independent discipline that complements rather than competes with them.

Rough set theory is based on classification. Consider, for example, a group of patients suffering from a certain disease. Each patient has an associated data file containing information such as body temperature, blood pressure, name, age, and address. All patients showing the same symptoms are indiscernible (similar) in view of the available information and can be grouped into blocks, which can be understood as elementary granules of knowledge about patients (or types of patients). These granules are called elementary sets or concepts, and can be regarded as the elementary building blocks of knowledge about patients. Elementary concepts can be combined into compound concepts, i.e. concepts that are uniquely defined in terms of elementary concepts. Any union of elementary sets is called a crisp set; any other set is referred to as rough (vague, imprecise).

With every set X we can associate two crisp sets, called the lower and the upper approximation of X. The lower approximation of X is the union of all elementary sets that are included in X, whereas the upper approximation of X is the union of all elementary sets that have a nonempty intersection with X. In other words, the lower approximation of X is the set of all elements that surely belong to X, whereas the upper approximation of X is the set of all elements that possibly belong to X. The difference between the upper and the lower approximation of X is its boundary region. A set is rough if it has a nonempty boundary region; otherwise the set is crisp. Elements of the boundary region cannot be classified, on the basis of the available knowledge, as belonging either to the set or to its complement. Approximations of sets are the basic operations of rough set theory.
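
The definitions above can be illustrated with a minimal Python sketch. It is not taken from the talk: the patient table, the attribute names (`temperature`, `blood_pressure`), the target set `has_disease`, and the helper functions `elementary_sets` and `approximations` are all hypothetical, chosen only to show how elementary sets, the two approximations, and the boundary region relate to one another.

```python
from collections import defaultdict


def elementary_sets(table, attributes):
    """Group objects that have identical values on the chosen attributes.

    Each returned block is one elementary set: its members are
    indiscernible with respect to `attributes`.
    """
    blocks = defaultdict(set)
    for obj, row in table.items():
        key = tuple(row[a] for a in attributes)
        blocks[key].add(obj)
    return list(blocks.values())


def approximations(blocks, target):
    """Return (lower, upper) approximations of `target`.

    lower: union of blocks fully included in target.
    upper: union of blocks with a nonempty intersection with target.
    """
    lower, upper = set(), set()
    for block in blocks:
        if block <= target:
            lower |= block
        if block & target:
            upper |= block
    return lower, upper


# Hypothetical data: five patients described by two symptoms.
patients = {
    "p1": {"temperature": "high", "blood_pressure": "normal"},
    "p2": {"temperature": "high", "blood_pressure": "normal"},
    "p3": {"temperature": "normal", "blood_pressure": "high"},
    "p4": {"temperature": "normal", "blood_pressure": "normal"},
    "p5": {"temperature": "high", "blood_pressure": "high"},
}

# Hypothetical set X: patients known to have the disease.
has_disease = {"p1", "p3", "p5"}

blocks = elementary_sets(patients, ["temperature", "blood_pressure"])
lower, upper = approximations(blocks, has_disease)
boundary = upper - lower

print("lower:", sorted(lower))        # surely in X
print("upper:", sorted(upper))        # possibly in X
print("boundary:", sorted(boundary))  # cannot be decided
print("rough" if boundary else "crisp")
```

In this toy table p1 and p2 show the same symptoms, but only p1 is in X, so their block falls into the boundary region and X is rough with respect to these two attributes. Adding an attribute that separates p1 from p2 would empty the boundary and make X crisp.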