An information theoretic similarity-based learning method for databases
Changhwan Lee
Proceedings of the Tenth Conference on Artificial Intelligence for Applications, March 1994
DOI: 10.1109/CAIA.1994.323686
Similarity-based learning has been widely and successfully used in some domains. Despite these successes, most similarity measures in the current literature are defined only on a limited set of feature types. As a result, they cannot be applied directly to the database environment, where a wide variety of data types exists. In this paper, we propose a new method of similarity-based learning for databases using information theory. The method improves on current similarity measures in several ways. Similarity is defined on every attribute type in the database, and each attribute is assigned a weight depending on its importance with respect to the target attribute. In addition, our nearest neighbor algorithm gives different weights to the selected instances. Our system is implemented and tested on some typical machine learning databases. Our experiments show that the classification accuracy of our system is, in general, superior to that of other learning methods.
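The abstract does not give the formulas, so the following is only a minimal sketch of the general idea, under stated assumptions: attribute weights are taken to be normalized information gain with respect to the target attribute, similarity is a weighted overlap on categorical attributes only, and each selected neighbor votes with a weight equal to its similarity. All function names here (`information_gain`, `attribute_weights`, `similarity`, `classify`) are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in target entropy after splitting on a categorical attribute."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def attribute_weights(rows, labels):
    """Assumed weighting scheme: information gain normalized to sum to 1."""
    n_attrs = len(rows[0])
    gains = [information_gain([r[j] for r in rows], labels) for j in range(n_attrs)]
    total = sum(gains)
    if total == 0:
        # No attribute is informative: fall back to uniform weights.
        return [1.0 / n_attrs] * n_attrs
    return [g / total for g in gains]


def similarity(a, b, weights):
    """Weighted overlap: sum the weights of the attributes on which a and b agree."""
    return sum(w for w, x, y in zip(weights, a, b) if x == y)


def classify(query, rows, labels, weights, k=3):
    """Pick the k most similar instances; each votes with its similarity as weight."""
    scored = sorted(
        ((similarity(query, r, weights), y) for r, y in zip(rows, labels)),
        key=lambda t: t[0],
        reverse=True,
    )[:k]
    votes = defaultdict(float)
    for s, y in scored:
        votes[y] += s
    return max(votes, key=votes.get)


# Toy data (made up for illustration): the first attribute determines the label.
rows = [("sunny", "hot"), ("sunny", "mild"), ("rain", "hot"), ("rain", "mild")]
labels = ["no", "no", "yes", "yes"]
weights = attribute_weights(rows, labels)
prediction = classify(("rain", "hot"), rows, labels, weights, k=3)
```

In this toy data the first attribute receives all of the weight, because it alone determines the label, so the query is matched on that attribute. Numeric and other attribute types, which the abstract says the method covers, would need their own similarity functions and are left out of this sketch.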