The task of outlier identification is to find small groups of data objects that are exceptional when compared with the large remaining body of data. Identifying outliers can lead to the discovery of truly unexpected knowledge in areas such as electronic commerce, credit card fraud, voting irregularity analysis, data cleansing, network intrusion, severe weather prediction, and many more. This paper deals with identifying outliers and obtaining efficient clusters in fuzzy clustering. It proposes a new density-based definition of an outlier and an algorithm, 'DFCM', which works in two phases. In the first phase, it identifies outliers and separates them from the original data set; in the second phase, it creates clusters from the noiseless data. DFCM modifies the FCM fuzzy clustering technique to create clusters, but it can also be implemented with any other fuzzy clustering technique. Numerical examples and tests show that the proposed algorithm gives better results than FCM.
{"title":"DFCM: Density Based Approach to Identify Outliers and to Get Efficient Clusters in Fuzzy Clustering","authors":"Prabhjot Kaur","doi":"10.1109/WIIAT.2008.58","DOIUrl":"https://doi.org/10.1109/WIIAT.2008.58","url":null,"abstract":"The task of outlier identification is to find small groups of data objects that are exceptional when compared with rest large amount of data. The identification of outliers can lead to the discovery of truly unexpected knowledge in areas such as electronic commerce, credit card frauds, voting irregularity analysis, data cleansing, network intrusion, severe weather prediction & many more. This paper deals with the identification of outliers and to get efficient clusters in fuzzy clustering. In this paper a new density based definition of outlier and an algorithm dasiaDFCMpsila is proposed; which works in two phases. In first phase, it identifies outliers and separate them from original data-set and in the second phase, it creates clusters from noiseless data. DFCM modifies FCM fuzzy clustering technique to create clusters. But it can also be implemented with any other fuzzy clustering technique. Numerical examples and tests show that proposed algorithm gives better result when compared with FCM.","PeriodicalId":393772,"journal":{"name":"2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122354443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
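The two-phase idea in the DFCM abstract above can be sketched in Python as follows. This is only an illustration under stated assumptions: the abstract does not give the paper's density-based outlier definition, so the neighbour-count rule (`radius`, `min_neighbors`) is a hypothetical stand-in, the second phase is plain fuzzy c-means rather than the paper's modified FCM, and the farthest-point initialisation is a convenience chosen to keep the example deterministic.

```python
import math

def split_outliers(points, radius, min_neighbors):
    """Phase 1 (assumed rule): a point with fewer than `min_neighbors`
    other points within `radius` is treated as an outlier."""
    inliers, outliers = [], []
    for i, p in enumerate(points):
        neighbors = sum(
            1 for j, q in enumerate(points)
            if j != i and math.dist(p, q) <= radius
        )
        (inliers if neighbors >= min_neighbors else outliers).append(p)
    return inliers, outliers

def memberships(points, centers, m):
    """Standard FCM membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1))."""
    exponent = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        # Clip distances away from zero so a point sitting on a center is safe.
        d = [max(math.dist(p, v), 1e-12) for v in centers]
        rows.append([1.0 / sum((di / dk) ** exponent for dk in d) for di in d])
    return rows

def fcm(points, c, m=2.0, iterations=100):
    """Phase 2: plain fuzzy c-means on the points kept by phase 1."""
    # Farthest-point initialisation (an assumption, used for determinism).
    centers = [points[0]]
    while len(centers) < c:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, v) for v in centers))
        )
    dim = len(points[0])
    for _ in range(iterations):
        u = memberships(points, centers, m)
        centers = []
        for i in range(c):
            w = [row[i] ** m for row in u]  # fuzzified weights for cluster i
            total = sum(w)
            centers.append(tuple(
                sum(wn * p[k] for wn, p in zip(w, points)) / total
                for k in range(dim)
            ))
    return centers, memberships(points, centers, m)
```

A typical call would first run `split_outliers(points, radius, min_neighbors)` and then pass only the returned inliers to `fcm`, so that isolated points cannot drag the cluster centers.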
We model the sum and product riddle in public announcement logic, which is interpreted on an epistemic Kripke model. The model is represented symbolically as a finite-state program with n agents. We develop a model checking method for the riddle using the BDD-based symbolic model checking algorithm for the logic of knowledge that we developed in [7]. The method is implemented by extending the model checker MCTK [7], and the solution of the riddle is then verified successfully.
{"title":"Solving Sum and Product Riddle via BDD-Based Model Checking","authors":"Xiangyu Luo, Kaile Su, A. Sattar, Yan Chen","doi":"10.1109/WIIAT.2008.277","DOIUrl":"https://doi.org/10.1109/WIIAT.2008.277","url":null,"abstract":"We model the sum and product riddle in public announcement logic, which is interpreted on an epistemic Kripke model. The model is symbolically represented as a finite state program with n agents. A model checking method to the riddle is developed by using the BDD-based symbolic model checking algorithm for logic of knowledge we developed in [7]. The method is implemented by extending the model checker MCTK [7] and then the solution of the riddle is verified successfully.","PeriodicalId":393772,"journal":{"name":"2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology","volume":"162 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132495058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
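To make the public-announcement reading of the sum and product riddle concrete, here is a small explicit-state sketch in Python. It is not the BDD-based method or the MCTK extension described in the abstract above; it simply enumerates possible worlds and shrinks the set after each announcement. The riddle parameters are assumed to be the standard Freudenthal formulation (integers with 2 <= x < y and x + y <= 100, agent S knowing the sum and agent P the product), which the abstract itself does not spell out.

```python
from collections import Counter, defaultdict

def solve_sum_product(max_sum=100):
    # Possible worlds: all pairs (x, y) allowed by the assumed riddle statement.
    worlds = [(x, y)
              for x in range(2, max_sum)
              for y in range(x + 1, max_sum)
              if x + y <= max_sum]

    # P "does not know" in a world if its product has several factorisations.
    prod_count = Counter(x * y for x, y in worlds)
    p_unknown = {w for w in worlds if prod_count[w[0] * w[1]] > 1}

    # Announcement 1 (S): "I know that P does not know the numbers."
    # S cannot distinguish worlds with the same sum, so every one of them
    # must be a world in which P does not know.
    by_sum = defaultdict(list)
    for w in worlds:
        by_sum[w[0] + w[1]].append(w)
    w1 = [w for w in worlds
          if all(v in p_unknown for v in by_sum[w[0] + w[1]])]

    # Announcement 2 (P): "Now I know the numbers."
    # The product must be unique among the worlds that survived step 1.
    prod_count_1 = Counter(x * y for x, y in w1)
    w2 = [w for w in w1 if prod_count_1[w[0] * w[1]] == 1]

    # Announcement 3 (S): "Now I know them too."
    # The sum must be unique among the worlds that survived step 2.
    sum_count_2 = Counter(x + y for x, y in w2)
    return [w for w in w2 if sum_count_2[w[0] + w[1]] == 1]
```

Under these assumptions `solve_sum_product()` returns `[(4, 13)]`. Each list comprehension plays the role of a model update by a public announcement: worlds in which the announced formula is false are discarded.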
We investigate the following data mining problem in document retrieval: from a large data set of documents, we need to find the documents that relate to human interest in as few iterations of human testing or checking as possible. In each iteration, a comparatively small batch of documents is evaluated for relevance to the human interest. We apply active learning techniques based on Support Vector Machines to evaluate successive batches, a process called relevance feedback. In experiments, our proposed approach has been very useful for document retrieval with relevance feedback. In this paper, we adopt several Vector Space Models in our proposed method and compare its performance across them.
{"title":"Comparison of Performance for SVM Based Relevance Feedback Document Retrieval in Several Vector Space Models","authors":"T. Onoda, H. Murata, S. Yamada","doi":"10.1109/WIIAT.2008.101","DOIUrl":"https://doi.org/10.1109/WIIAT.2008.101","url":null,"abstract":"We investigate the following data mining problems from the document retrieval: From a large data set of documents, we need to find documents that relate to human interest as few iterations of human testing or checking as possible. In each iteration a comparatively small batch of documents is evaluated for relating to the human interest. We apply active learning techniques based on Support Vector Machine for evaluating successive batches, which is called relevance feedback. Our proposed approach has been very useful for document retrieval with relevance feedback experimentally. In this paper, we adopt several Vector Space Models into our proposed method, and then show the comparison results of the performance of our method in several Vector Space Models.","PeriodicalId":393772,"journal":{"name":"2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology","volume":"229 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130917471","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In recent years, many people have come to access Internet auctions easily in e-commerce trading. At the same time, network structures like the WWW have become huge and are analyzed on a grand scale. In Internet auctions, users face the problem of really knowing the credit and trustworthiness of participants, and the simple rating mechanism widely used in Internet auctions fails to represent this accurately. This paper proposes participant ranking methods based on relationships in Internet auctions. Our algorithm, called "Auction Network Trust (ANT)", employs HITS techniques and Internet auction data. At this stage, we have successfully implemented a crawler for Internet auction sites and compared our algorithm with the reputation values of Internet auctions using several approaches such as user rankings. Furthermore, our work includes a network analysis system for a larger trading network that predicts which buyers and sellers are active and demonstrate better behavior. Our experiments reveal many behaviors in Internet auctions and show that ANT produces different scores from HITS on the WWW.
{"title":"An Approach to Ranking Participants Based on Relationship Network in E-commerce","authors":"Masao Kobayashi, Takayuki Ito","doi":"10.1109/WIIAT.2008.134","DOIUrl":"https://doi.org/10.1109/WIIAT.2008.134","url":null,"abstract":"In recently years, many people easily access Internet auctions in e-commerce trading. At the same time, network structures like the WWW have become huge and are analyzed on a grand scale. In Internet auctions, users face the problem of really knowing the credit and trustworthiness of participants, and the simple rating mechanism widely used in Internet auctions fails to represent this accurately. This paper proposes participant ranking methods based on relationships in Internet auctions. Our algorithm called \"Auction Network Trust (ANT)\" employs HITS's techniques and Internet auction data. At this stage, we successfully implemented a crawler for Internet auction sites and compared our algorithm to a reputation value of Internet auctions with several approaches such as user rankings. Furthermore, our work possesses a network analyzing system on a larger trading network that predicts which buyers and sellers are active and demonstrate better behaviors. Our experiments show many behaviors in the Internet auctions and that ANT presents different scores from HITS on the WWW.","PeriodicalId":393772,"journal":{"name":"2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129038175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
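Since the ANT abstract above builds on HITS, a minimal sketch of the plain HITS iteration may help. This is standard HITS on a directed edge list, not the ANT algorithm, whose details the abstract does not give. Reading an edge as "buyer purchased from seller" is an illustrative assumption, as is the function name `hits`.

```python
import math

def hits(edges, iterations=50):
    """Plain HITS: mutually reinforcing hub and authority scores.

    `edges` is a list of (source, target) pairs.
    """
    nodes = sorted({n for edge in edges for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: sum of hub scores of nodes pointing at you.
        auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            auth[dst] += hub[src]
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {n: v / norm for n, v in auth.items()}

        # Hub update: sum of authority scores of nodes you point at.
        hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            hub[src] += auth[dst]
        norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {n: v / norm for n, v in hub.items()}
    return hub, auth
```

For a toy trading network such as `[("b1", "s1"), ("b2", "s1"), ("b3", "s1"), ("b3", "s2")]`, seller `s1` receives a higher authority score than `s2`, and buyer `b3`, who trades with both sellers, receives the highest hub score.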
Previous work in hierarchical categorization focuses on the hierarchical perceptron (Hieron) algorithm. The hierarchical perceptron works on the principles of the perceptron; that is, each class label in the hierarchy has an associated weight vector. To account for the hierarchy, we begin at the root of the tree and sum all weights along the path to the target label. We make a prediction by choosing the label that yields the maximum inner product of the feature vector with its path-summed weights. Learning is done by adjusting the weights along the path from the predicted node to the correct node according to a specific loss function that adheres to the large margin principle. There are several problems with applying this approach to a multiple class problem. In many cases we could end up punishing weights that gave a correct prediction, because the algorithm can only take a single case at a time. In this paper we present an extended hierarchical perceptron algorithm capable of solving the multiple categorization problem (MultiHieron). We introduce a new aggregate loss function for multiple label learning, and we make weight updates simultaneously instead of serially. Significant improvement over the basic Hieron algorithm is then demonstrated on the aviation safety reporting system (ASRS) flight anomaly database and the OntoNews corpus using both flat and hierarchical categorization metrics.
{"title":"Multi-concept Document Classification Using a Perceptron-Like Algorithm","authors":"Clay Woolam, L. Khan","doi":"10.1109/WIIAT.2008.410","DOIUrl":"https://doi.org/10.1109/WIIAT.2008.410","url":null,"abstract":"Previous work in hierarchical categorization focuses on the hierarchical perceptron (Hieron) algorithm. Hierarchical perceptron works on the principles of the perceptron,that is each class label in the hierarchy has an associated weight vector. To account for the hierarchy, we begin at the root of the tree and sum all weights to the target label.We make a prediction by considering the label that yields the maximum inner product of its feature set with its path-summed weights. Learning is done by adjusting the weights along the path from the predicted node to the correct node by a specific loss function that adheres to the large margin principal. There are several problems with applying this approach to a multiple class problem. In many cases we could end up punishing weights that gave a correct prediction, because the algorithm can only take a single case at a time. In this paper we present an extended hierarchical perceptron algorithm capable of solving the multiple categorization problem (MultiHieron). We introduce new aggregate loss function for multiple label learning. We make weight updates simultaneously instead of serially. Then, significant improvement over the basic Hieron algorithm is demonstrated on the aviation safety reporting system (ASRS) flight anomaly database and OntoNews corpus using both flat and hierarchical categorization metrics.","PeriodicalId":393772,"journal":{"name":"2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology","volume":"158 11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128838161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
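The path-summed prediction rule of the single-label hierarchical perceptron, as summarised in the abstract above, can be sketched in Python as follows. This is a simplified illustration, not MultiHieron and not the exact Hieron update: the fixed `step` size stands in for the loss-derived step of the large margin formulation, and the names `parent`, `weights`, `predict`, and `update` are hypothetical.

```python
def path_to_root(label, parent):
    """Labels on the path from the root down to `label`.
    `parent` maps each label to its parent; the root maps to None."""
    path = []
    while label is not None:
        path.append(label)
        label = parent[label]
    return path[::-1]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def score(label, x, weights, parent):
    # Inner product of x with the weights summed along the root-to-label path.
    return sum(dot(weights[node], x) for node in path_to_root(label, parent))

def predict(x, weights, parent):
    # Choose the label whose path-summed weights give the largest score.
    return max(weights, key=lambda label: score(label, x, weights, parent))

def update(x, true_label, weights, parent, step=1.0):
    """On a mistake, adjust only the nodes where the two paths differ
    (fixed step size: a simplifying assumption)."""
    predicted = predict(x, weights, parent)
    if predicted != true_label:
        true_path = set(path_to_root(true_label, parent))
        pred_path = set(path_to_root(predicted, parent))
        for node in true_path - pred_path:   # promote the correct branch
            weights[node] = [w + step * xi for w, xi in zip(weights[node], x)]
        for node in pred_path - true_path:   # demote the wrongly predicted branch
            weights[node] = [w - step * xi for w, xi in zip(weights[node], x)]
    return predicted
```

Because the update touches one example and one label at a time, a document that carries several labels would trigger a separate correction for each, which is the situation the abstract describes as punishing weights that gave a correct prediction.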
Sensor-network technology is indispensable for constructing ubiquitous network infrastructures. Although information about adjacent relations between sensors is also very important for sensor networks, obtaining this information automatically without manual assistance is extremely difficult. Consequently, we propose a new methodology for constructing adjacent relations in sensor networks using an ant-colony optimization algorithm. This methodology can be used to automatically extract adjacent relations without using prepared sensor-location information or RFIDs to identify individual humans. We implemented a prototype system, and verified its basic effectiveness through simulations and an experiment using real data.
{"title":"Pheromone Approach to the Adaptive Discovery of Sensor-Network Topology","authors":"H. Tamaki, Ken-ichi Fukui, M. Numao, S. Kurihara","doi":"10.1109/WIIAT.2008.143","DOIUrl":"https://doi.org/10.1109/WIIAT.2008.143","url":null,"abstract":"Sensor-network technology is indispensable for constructing ubiquitous network infrastructures. Although information about adjacent relations between sensors is also very important for sensor networks, obtaining this information automatically without manual assistance is extremely difficult. Consequently, we propose a new methodology for constructing adjacent relations in sensor networks using an ant-colony optimization algorithm. This methodology can be used to automatically extract adjacent relations without using prepared sensor-location information or RFIDs to identify individual humans. We implemented a prototype system, and verified its basic effectiveness through simulations and an experiment using real data.","PeriodicalId":393772,"journal":{"name":"2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125507919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose an ontology-based approach for inferences linking trust information in two different situations. That reasoning process can augment the typically sparse trust information, by inferring the missing information from other situational conditions, and can better support situation-aware trust management. Our work is more comprehensive in comparison with other models and considers various aspects of the relationship between situation-awareness and trust management.
{"title":"Cross-Situation Trust Reasoning","authors":"Mozhgan Tavakolifard, S. J. Knapskog, P. Herrmann","doi":"10.1109/WIIAT.2008.41","DOIUrl":"https://doi.org/10.1109/WIIAT.2008.41","url":null,"abstract":"We propose an ontology-based approach for inferences linking trust information in two different situations. That reasoning process can augment the typically sparse trust information, by inferring the missing information from other situational conditions, and can better support situation-aware trust management. Our work is more comprehensive in comparison with other models and considers various aspects of the relationship between situation-awareness and trust management.","PeriodicalId":393772,"journal":{"name":"2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125534565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Query expansion is one of the most complex tasks in information retrieval. Several new queries related to a user's query can be generated by expansion. The problem arises in choosing the queries that are most useful for the search process. Here it is assumed that the most useful expanded queries are those whose meaning is similar to that of the original query but which share few words with it; that is, their meanings are similar but their wording differs. Following this idea, several experiments have been carried out to assess a fuzzy measure that is able to select the most useful expanded queries, i.e., a fuzzy filtering process for query expansion.
{"title":"Filtering Short Queries by Means of Fuzzy Semantic-Lexical Relations for Meta-searchers Using WordNet","authors":"J. Serrano-Guerrero, F. P. Romero, J. A. Olivas","doi":"10.1109/WIIAT.2008.112","DOIUrl":"https://doi.org/10.1109/WIIAT.2008.112","url":null,"abstract":"Query expansion is one of the most complex tasks in information retrieval. Several new queries can be expanded related to a user one. The problem arises in choosing the queries that are more useful for search process. Here it is supposed that the most useful expanded queries are those queries which have similar meanings with regard to the original query but the number of words that they (original and expanded query) share is low. So, their meanings are similar but grammatically they are different. So, following this idea, several experiments have been carried out to assess a fuzzy measure that is able to select which are the most useful expanded queries, i.e., a fuzzy filtering process for query expansion.","PeriodicalId":393772,"journal":{"name":"2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126496572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
G. Voutsadakis, G. Slutzki, Vasant G Honavar, Jie Bao
We introduce F-ALCI, a federated version of the description logic ALCI. An F-ALCI ontology, like its package-based counterpart ALCIP-, consists of multiple ALCI ontologies that can import concepts or roles defined in other modules. Unlike ALCIP-, which supports only contextualized negation, F-ALCI supports contextualization of each of the logical connectives, a feature that allows more flexible reuse of knowledge from independently developed ontologies. We provide a new semantics for F-ALCI based on image domain relations and establish the conditions that need to be imposed on domain relations to ensure properties, such as preservation of unsatisfiability and monotonicity of inference, that are desirable in distributed web applications. We also establish the decidability of F-ALCI.
{"title":"Federated ALCI: Preliminary Report","authors":"G. Voutsadakis, G. Slutzki, Vasant G Honavar, Jie Bao","doi":"10.1109/WIIAT.2008.296","DOIUrl":"https://doi.org/10.1109/WIIAT.2008.296","url":null,"abstract":"We introduce F-ALCI, a federated version of the description logic ALCI. An F-ALCI ontology, like its package-based counterpart ALCIP-, consists of multiple ALCI ontologies that can import concepts or roles defined in other modules. Unlike ALCIP- which supports only contextualized negation, F-ALCI, supports contextualization of each of the logical connectives, a feature that allows more flexible reuse of knowledge from independently developed ontologies. We provide a new semantics for F-ALCI based on image domain relations and establish the conditions that need to be imposed on domain relations to ensure properties, such as preservation of unsatisfiability and monotonicity of inference, that are desirable in distributed web applications. We also establish the decidability of F-ALCI.","PeriodicalId":393772,"journal":{"name":"2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125640798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Market surveillance plays an important role in maintaining market integrity, transparency and fairness. Existing trading pattern analysis focuses only on interday data, which discloses explicit and high-level market dynamics. Meanwhile, existing market surveillance systems face the challenges of misuse, mis-disclosure and misdealing of information, announcements and orders in one market or across multiple markets. Therefore, there is a crucial need to develop workable methods for smart surveillance. To deal with such issues, we propose an innovative methodology: microstructure activity pattern analysis. Based on this methodology, a case study in identifying exceptional microstructure activity patterns is carried out. Experiments on real-life stock data show that microstructure activity pattern analysis opens a new and effective means of understanding and analysing market dynamics. The resulting findings, such as exceptional microstructure activity patterns, can greatly enhance the learning, detection, adaptation and decision-making capability of market surveillance.
{"title":"Mining Exceptional Activity Patterns in Microstructure Data","authors":"Yuming Ou, Longbing Cao, C. Luo, Li Liu","doi":"10.1109/WIIAT.2008.160","DOIUrl":"https://doi.org/10.1109/WIIAT.2008.160","url":null,"abstract":"Market Surveillance plays an important role in maintaining market integrity, transparency and fairness. The existing trading pattern analysis only focuses on interday data which discloses explicit and high-level market dynamics. In the mean time, the existing market surveillance systems are facing challenges of misuse, mis-disclosure and misdealing of information, announcement and order in one market or crossing multiple markets. Therefore, there is a crucial need to develop workable methods for smart surveillance. To deal with such issues, we propose an innovative methodology - microstructure activity pattern analysis. Based on this methodology, a case study in identifying exceptional microstructure activity patterns is carried out. The experiments on real-life stock data show that microstructure activity pattern analysis opens a new and effective means for crucially understanding and analysing market dynamics. The resulting findings such as exceptional microstructure activity patterns can greatly enhance the learning, detection, adaption and decision-making capability of market surveillance.","PeriodicalId":393772,"journal":{"name":"2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology","volume":"190 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123007797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}