Exposure of illegal Web sites using conceptual fuzzy sets-based information filtering system

Authors: A. Shinmura, K. Taniguchi, K. Kawahara, T. Takagi
Published in: 2002 Annual Meeting of the North American Fuzzy Information Processing Society Proceedings. NAFIPS-FLINT 2002 (Cat. No. 02TH8622)
Publication date: 2002-08-07
DOI: 10.1109/NAFIPS.2002.1018079 (https://doi.org/10.1109/NAFIPS.2002.1018079)
A large number of illegal Web sites on the Internet specialize in distributing commercial software and music. This paper proposes a method that distinguishes illegal Web sites from legal ones not only by using TF-IDF (term frequency-inverse document frequency) values but also by recognizing the purpose or meaning of the sites. To do so, we describe what is considered an illegal site and judge whether a target Web site matches that description. Conceptual fuzzy sets (CFSs) are used to describe the concept of an illegal Web site. First, we discuss the usefulness of CFSs in overcoming those problems and propose realizing CFSs with RBF (radial basis function)-like networks. In a CFS, the meaning of a concept is represented by the distribution of the activation values of the other nodes. Because this distribution changes depending on which labels are activated under the given conditions, the activations express a context-dependent meaning. Next, we propose the architecture of a filtering system. Finally, we compare the proposed method with a TF-IDF method combined with a support vector machine. The E-measures, used as an overall evaluation, indicate that the proposed system performs better than the TF-IDF method with the support vector machine.
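For readers unfamiliar with the two standard quantities named in the abstract, the following is a minimal Python sketch of TF-IDF term weighting and of van Rijsbergen's E-measure. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which TF-IDF variant or which E-measure weighting was used, so the sketch assumes tf = count / document length, idf = ln(N / df), and beta = 1. The function names `tf_idf` and `e_measure` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant (not taken from the paper):
    tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


def e_measure(predicted, actual, beta=1.0):
    """van Rijsbergen's E = 1 - F_beta over binary labels (True = illegal)."""
    true_pos = sum(1 for p, a in zip(predicted, actual) if p and a)
    if true_pos == 0:
        return 1.0  # no correct positive predictions: worst possible score
    precision = true_pos / sum(1 for p in predicted if p)
    recall = true_pos / sum(1 for a in actual if a)
    b2 = beta * beta
    return 1.0 - (1 + b2) * precision * recall / (b2 * precision + recall)


# Toy, made-up token lists standing in for Web page text.
docs = [
    ["free", "download", "crack"],
    ["music", "download"],
    ["news", "music"],
]
weights = tf_idf(docs)

# Toy classifier output versus ground truth for four pages.
score = e_measure([True, True, False, False], [True, False, True, False])
```

Under this definition, a term that occurs in every document receives weight 0, and a lower E value means better combined precision and recall. Whether the paper reports E in exactly this form is an assumption here.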