The P2P data integration system aims to combine the advantages of P2P technologies and data integration to overcome the shortcomings of centralized data integration systems. Kademlia, a widely used and efficient network protocol for P2P file-sharing systems, has a very clear logical structure; with its unique node-identification scheme and XOR distance metric, it can locate the node closest to a given key in O(log n) lookup steps. In this paper, we put forward a method of applying Kademlia to P2P data integration and propose a new P2P data integration model, Dual-Kad, which combines a Kademlia network over the Peer layer with one over the Super-Peer layer. Dual-Kad can process queries based on semantic logic, a limitation of the original Kademlia; it also shortens the query routing path and caches query results, thereby speeding up query routing as a whole. We describe the detailed structure of Dual-Kad and its query routing algorithms. Case studies presented in this paper show that our query routing strategies are effective.
{"title":"Dual-Kad: Kademlia-Based Query Processing Strategies for P2P Data Integration","authors":"Zongquan Wang, Guoqing Dong, Jie Zhu","doi":"10.1109/WISA.2012.31","DOIUrl":"https://doi.org/10.1109/WISA.2012.31","url":null,"abstract":"The P2P data integration system aims to combine the advantages of P2P technologies and data integration to overcome centralized data integration systems' shortcomings. Kademlia, as a widely used and efficient network protocol for P2P files sharing system, has a very clear logical structure, and with its unique identifying pattern of nodes and XOR metric for distance, it can provide O(logn) lookup to locate the node closest to a given key. In this paper, we put forward a method of applying Kademlia to the P2P data integration system, and propose a new P2P data integration model, Dual-Kad, combing the Kademlia network over the Peer layer with that over the Super-Peer layer. Dual-Kad can process queries based on semantic logic which is a limitation of the original Kademlia, and shorten the query routing path, cache the query results, and as a result, speed the whole query routing. We describe the detailed structures of Dual-Kad and its query routing algorithms. Our query routing strategies are proved effective in our case studies mentioned in this paper.","PeriodicalId":313228,"journal":{"name":"2012 Ninth Web Information Systems and Applications Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130977183","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Determining matching schemas enables queries over heterogeneous data spaces to be formulated and facilitates data integration. Most current schema matching techniques focus on mining mappings using only the elements' own information. This paper proposes introducing semantic and functional dependencies into the matching process to obtain multilayer schema matching results. It calculates semantic similarity with the help of WordNet and generates candidate mapping sets. By introducing functional dependencies to formalize structural information, it obtains structural similarities between element pairs. A probabilistic factor is then considered to select mapping pairs. Experimental evaluation on real data verifies the superiority of our method.
{"title":"A Multilayer Method of Schema Matching Based on Semantic and Functional Dependencies","authors":"Chen Zhao, Derong Shen, Yue Kou, Tiezheng Nie, Ge Yu","doi":"10.1109/WISA.2012.9","DOIUrl":"https://doi.org/10.1109/WISA.2012.9","url":null,"abstract":"Determing matching schemas enables queries on heterogeneous data space to be formulated and facilitates data integration. Current schema matching techniques most focus on mining mappings using elements' own information. This paper proposes to introduce semantic and functional dependencies into matching process to achieve multilayer schema matching results. It calculates semantic similarity with the help of Word Net and generates candidate mapping sets. By introducing functional dependency to formulize structural information, it can get structural similarities between element pairs. A probabilistic factor is considered to select mapping pairs. Through experimental evaluation on real data, the superiority of our method is verified.","PeriodicalId":313228,"journal":{"name":"2012 Ninth Web Information Systems and Applications Conference","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130121435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the growth of the Internet, many heterogeneous relational databases have been built in distributed environments. Data exchange between these databases now attracts more attention from researchers and engineers than ever. As a well-formed markup language, XML is suitable for storing and transferring information. We therefore investigate XML-based data exchange in this paper. We analyze mapping techniques between XML schemas and relational databases. Then, an effective method for data exchange is described in detail. Finally, we design and implement a data exchange system using Java and the DOM interface. It works well in a real commercial web application.
{"title":"Investigations on XML-based Data Exchange between Heterogeneous Databases","authors":"Mingli Wu, Yebai Li","doi":"10.1109/WISA.2012.44","DOIUrl":"https://doi.org/10.1109/WISA.2012.44","url":null,"abstract":"With the growing of the Internet, lots of heterogeneous relational databases are built in distributed environment. Data exchange between these databases absorbs more attention of researchers and engineers nowadays than ever. As a well-formed makeup language, XML is suitable to store and transfer information. Therefore we investigate the data exchange method via XML in this paper. We analyze mapping techniques between XML schema and relational database. Then, an effective method for data exchange is described in detail. Finally we design and implement a data exchange system by Java and DOM interface technology. It works well in a real commercial web application.","PeriodicalId":313228,"journal":{"name":"2012 Ninth Web Information Systems and Applications Conference","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125553466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Requirements are the formal expression of users' needs, and requirements elicitation is the activity that focuses on collecting them. Traditional acquisition methods such as interviews, observation, and prototyping are unsuited to service-oriented software development, which features distributed stakeholders, collective intelligence, and behavioral emergence. In this paper, a collaborative requirements elicitation approach based on social intelligence for networked software is put forward, and the requirements-semantics concept is defined as the formal requirements description generated by collective participation. Furthermore, semantic wiki technology is chosen as the requirements authoring platform to accommodate the distributed and collaborative setting. Facing the wide-area distributed Internet, the approach combines Web 2.0 and the Semantic Web to revise the experts' requirements-semantics model through social classification. At the same time, the requirements model is instantiated through semantic tagging and validation. Apart from the traditional documentary specification, requirements-semantics artifacts are exported from the elicitation process to the subsequent software production process, i.e., service aggregation and service resource customization. An experiment and a prototype demonstrate the feasibility and effectiveness of the proposed approach.
{"title":"Distributed and Collaborative Requirements Elicitation Based on Social Intelligence","authors":"Bin Wen, Ziqiang Luo, Peng Liang","doi":"10.1109/WISA.2012.14","DOIUrl":"https://doi.org/10.1109/WISA.2012.14","url":null,"abstract":"Requirements is the formal expression of user's needs. Also, requirements elicitation is the process of activity focusing on requirements collection. Traditional acquisition methods, such as interview, observation and prototype, are unsuited for the service-oriented software development featuring in the distributed stakeholders, collective intelligence and behavioral emergence. In this paper, a collaborative requirements elicitation approach based on social intelligence for networked software is put forward, and requirements-semantics concept is defined as the formal requirements description generated by collective participation. Furthermore, semantic wikis technology is chosen as requirements authoring platform to adapt the distributed and collaborative features. Faced to the wide-area distributed Internet, it combines with the Web 2.0 and the semantic web to revise the experts requirements-semantics model through the social classification. At the same time, instantiation of requirements model is finished with semantic tagging and validation. Apart from the traditional documentary specification, requirements-semantics artifacts will be exported from the elicitation process to the subsequent software production process, i.e. services aggregation and services resource customization. Experiment and prototype have proved the feasibility and effectiveness of the proposed approach.","PeriodicalId":313228,"journal":{"name":"2012 Ninth Web Information Systems and Applications Conference","volume":"126 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124209574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
It is important to establish a mathematical model of epidemic spread to help control the epidemic situation and minimize its impact. In this paper, the Richards model is used to fit the spread, and particle swarm optimization (PSO) is employed to estimate the model's parameters. Concave-function decreasing and linearly decreasing strategies are adopted to update the particles' velocity inertia weights, and a new objective function based on normalized cross-correlation is built. The experimental results indicate that PSO is a valid method for parameter estimation of the Richards model.
{"title":"Applied Research of PSO in Parameter Estimation of Richards Model","authors":"Ting-fa Wu, Jun-Bin You, Meijuan Yan, Hao-jun Sun","doi":"10.1109/WISA.2012.29","DOIUrl":"https://doi.org/10.1109/WISA.2012.29","url":null,"abstract":"It's significant to establish a mathematical model for the spread of epidemic, to help control the epidemic situation and minimize their impacts. In this paper, Richards model is proposed to fit the spread and PSO is employed to estimate the parameters of Richards model. Meanwhile concave function decreasing strategy and linear decreasing strategy are adopted to update the particle's velocity inertia weights respectively, and a new object function in the sense of normalized cross-correlation is built. The experiment result indicates that, PSO is a valid method for the parameter estimation of Richards model.","PeriodicalId":313228,"journal":{"name":"2012 Ninth Web Information Systems and Applications Conference","volume":"302 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124316997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
To manage massive remote-sensing image files, we build a file catalog system for data applications based on a part-whole ontology of spatial relations. The method is as follows: we analyze the attribute items of the image metadata and calculate their weights for the application, build the catalog's concept-level relations, calculate the similarity between image attribute items and catalog nodes to build the catalog system, and store each file in the corresponding catalog directory. We design and implement the catalog system, and the experiments show that data integration based on the subdivision of a part-whole ontology is effective for efficient, integrated management of image data.
{"title":"Build the Image File Catalog System Based on the Subdivision of Part-Whole Ontology","authors":"Jifeng Cui, Yong Zhang, Chunxiao Xing","doi":"10.1109/WISA.2012.46","DOIUrl":"https://doi.org/10.1109/WISA.2012.46","url":null,"abstract":"For the massive remote image file's management, we build the file catalog system of data application based on the part whole ontology of spatial relation. The method is that we analyzed the attributed item of image metadata and calculated the weight for application, then build the catalog concept level relation, calculated the similitude degree of the image attributed item and catalogue node to build the catalog system, stored the file into the corresponding directory of catalog. We design and realize the catalog system, the experiment show that the method of data integration based on the subdivision of part whole ontology is effective for image data's high efficient integrative management.","PeriodicalId":313228,"journal":{"name":"2012 Ninth Web Information Systems and Applications Conference","volume":"04 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127273289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The smart grid is an important application of the Internet of Things (IoT). Monitoring data in a large-scale smart grid are massive, real-time, and dynamic, collected by many sensors, Intelligent Electronic Devices (IEDs), and other equipment. Because of this, traditional centralized storage proposals are not applicable to data storage in a large-scale smart grid. Therefore, we propose a data-centric storage approach supporting monitoring systems in large-scale smart grids: the Hierarchical Extended Storage Mechanism for Massive Dynamic Data (HES). HES stores monitoring data in different areas according to data type. It can add storage nodes dynamically through a coding method with an extended hash function, avoiding data loss from incidents and frequent events. Monitoring data are dispersed across the nodes of the same layer by a multi-threshold scheme in HES, which avoids load skew. The simulation results show that HES satisfies the needs of massive dynamic data storage and achieves load balance and a longer life cycle for the monitoring network.
{"title":"A Data-Centric Storage Approach for Efficient Query of Large-Scale Smart Grid","authors":"Yan Wang, Qingxu Deng, W. Liu, Baoyan Song","doi":"10.1109/WISA.2012.27","DOIUrl":"https://doi.org/10.1109/WISA.2012.27","url":null,"abstract":"Smart Grid is an important application in Internet Of Things (IOT). Monitoring data in large-scale smart grid are massive, real-time and dynamic which collected by a lot of sensors, Intelligent Electronic Devices (IED) and etc.. All on account of that, traditional centralized storage proposals aren't applicable to data storage in large-scale smart grid. Therefore, we propose a data-centric storage approach in support of monitoring system in large-scale smart grid: Hierarchical Extended Storage Mechanism for Massive Dynamic Data (HES). HES stores monitoring data in different area according to data types. It can add storage nodes dynamically by coding method with extended hash function for avoiding data loss of incidents and frequent events. Monitoring data are stored dispersedly in the nodes of the same player by the multi-threshold levels means in HES, which avoids load skew. The simulation results show that HES satisfies the needs of massive dynamic data storage, and achieves load balance and a longer life cycle of monitoring network.","PeriodicalId":313228,"journal":{"name":"2012 Ninth Web Information Systems and Applications Conference","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124090281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the tremendous growth of the Web, it has become a huge challenge for single-process crawlers to locate resources precise and relevant to a given topic in an appropriate amount of time, so parallel crawlers are increasingly important. However, due to the parallelism, one difficult problem is how to distribute URLs among crawlers so that the parallel system works in coordination and the fetched Web pages are of high quality. In this paper, a novel URL assignment model for parallel crawlers is described; it is based on a multi-objective decision-making method and synthesizes multiple factors such as load balance and overlap. Extensive experiments test and validate our techniques.
{"title":"A Novel URL Assignment Model Based on Multi-objective Decision Making Method","authors":"Qiuyan Huang, Qingzhong Li, Zhongmin Yan","doi":"10.1109/WISA.2012.19","DOIUrl":"https://doi.org/10.1109/WISA.2012.19","url":null,"abstract":"With the tremendous growth of the Web, it has become a huge challenge for the single-process crawlers to locate the resources that are precise and relevant to some topics in an appropriate amount of time, so it is increasingly important to use the parallel crawler. However, due to the parallelism of crawlers, one headache problem we have to face is how to distribute the URLs to crawlers to make the parallel system work coordinately and thereby make sure that the Web pages fetched are of high quality. In this paper, a novel URL assignment model for the parallel crawler is described, which is based on multi-objective decision making method and considers multiple factors synthetically such as load balance, overlap and so on. Extensive experiments test and validate our techniques.","PeriodicalId":313228,"journal":{"name":"2012 Ninth Web Information Systems and Applications Conference","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124170388","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the development of collaborative applications, the parallel interactions among processes have become more complicated and frequent. However, modeling the interactions among many collaborative processes is a complicated and error-prone procedure. In this paper, a novel Petri-net-based model called PIPN is first proposed; PIPN is suitable for defining and analyzing the parallel interactions among collaborative processes. Second, seven parallel interactive modes are summarized according to three views of parallel interaction: unidirectional or bidirectional, single-point or multi-point, and synchronous or asynchronous. The formal definitions and control-flow graphs of these modes are then given. Finally, an example, a micro-blog, is modeled to verify the reasonableness and feasibility of this work.
{"title":"Modeling of Parallel Interactive Modes among Collaborative Processes Based on High Level Petri Nets","authors":"Qianqian Xia, Jiantao Zhou, C. Sun","doi":"10.1109/WISA.2012.49","DOIUrl":"https://doi.org/10.1109/WISA.2012.49","url":null,"abstract":"As the development of collaborative applications, the parallel interactions among processes are more and more complicated and frequent. However, modeling of the interactions among many collaborative processes is a complicated and error-prone procedure. In this paper, firstly, a novel model based on Petri net, called PIPN, was proposed. PIPN is suitable to define and analyze the parallel interactions among collaborative processes. Secondly, seven parallel interactive modes were summarized according to three views of parallel interactions, which are unidirectional or bidirectional, single-point or multi-point, and synchronous or asynchronous. Then the formal definitions and control flow graphs of these modes were given. Finally, an example, called micro blog, was modeled to verify the reasonableness and feasibility of the work in this paper.","PeriodicalId":313228,"journal":{"name":"2012 Ninth Web Information Systems and Applications Conference","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133599424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The blogosphere has become a hot research field in recent years. Because existing detection algorithms suffer from inefficient feature selection and weak feature correlation, we propose a splog (spam blog) detection algorithm based on a feature relation tree. We construct the tree according to the correlation among features, retaining strongly relevant features and removing weak ones; we then prune redundant and irrelevant features through a secondary feature selection step and retain the best feature subset. Experimental results on the LIBSVM platform show that the algorithm based on the feature relation tree achieves higher precision and coverage than traditional methods. The precision of the algorithm in simulated training remains around 90%, indicating good generalization ability.
{"title":"Detection Splog Algorithm Based on Features Relation Tree","authors":"Yong-gong Ren, Xue Yang, Ming-fei Yin","doi":"10.1109/WISA.2012.39","DOIUrl":"https://doi.org/10.1109/WISA.2012.39","url":null,"abstract":"Blogosphere has become a hot research field in recent years. As the existing detection algorithm has problems of inefficient feature selection and weak correlation, we propose an algorithm of splog detection based on features relation tree. We could construct the tree according to the correlation of the features, reserving the strong relevance features and removing the weak ones, then prune the redundant and irrelevance features by using the secondary features selection method and retain the best feature subset. The experimental results conducted in the Libsvm platform show that the algorithm based on the features of relation tree has higher precision and covering rate compared to the traditional ones. The precision of the algorithm on simulated training remains at about 90%, which has better generalization ability.","PeriodicalId":313228,"journal":{"name":"2012 Ninth Web Information Systems and Applications Conference","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133609807","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}