The hybrid row-column storage model [1][4], a common database approach for both OLTP and OLAP, has attracted a lot of attention in the past few years. Previous work on the hybrid row-column approach mainly focuses on physical storage. In this paper, we propose the idea of Layout-Conscious Optimization (LCO): techniques that fully exploit the possibilities and take advantage of the hybrid row-column data layout in all layers of a DBMS, e.g., physical storage, query processing, and network transfer. We believe LCO offers new opportunities to improve DBMS performance. To demonstrate the power of LCO, we present the design of a row-column hybrid network transfer protocol for DBMSs, which reduces data transfer by 75% while incurring little extra CPU cost.
{"title":"Layout-Conscious Optimization: Beyond Hybrid Row-Column Storage Model","authors":"H. Tian, Chunxiao Xing","doi":"10.1109/WISA.2012.48","DOIUrl":"https://doi.org/10.1109/WISA.2012.48","PeriodicalId":313228,"journal":{"name":"2012 Ninth Web Information Systems and Applications Conference","volume":"4 1","pages":"0"},"publicationDate":"2012-11-16","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"131156377"}
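The abstract above does not describe the wire format of the row-column hybrid transfer protocol, so the following is only a rough sketch of why a layout-aware encoding can shrink a result set. It assumes a column-major payload with run-length encoding, serialized as JSON; the function names `encode_rows`, `encode_columns`, and `decode_columns` are hypothetical and are not taken from the paper.

```python
import json
from itertools import groupby


def encode_rows(rows):
    """Row-major baseline: every row repeats every column value."""
    return json.dumps([list(r) for r in rows]).encode("utf-8")


def encode_columns(rows):
    """Column-major payload; each column is a list of [value, run_length] pairs."""
    columns = list(zip(*rows))
    rle = [
        [[value, len(list(run))] for value, run in groupby(col)]
        for col in columns
    ]
    return json.dumps(rle).encode("utf-8")


def decode_columns(payload):
    """Expand the run-length encoded columns back into a list of row tuples."""
    rle = json.loads(payload.decode("utf-8"))
    columns = [[v for v, n in col for _ in range(n)] for col in rle]
    return [tuple(r) for r in zip(*columns)]


# Illustrative result set: a unique id plus two low-cardinality columns.
rows = [(i, "CN", "active") for i in range(1000)]
print(len(encode_rows(rows)), len(encode_columns(rows)))
```

The saving in this toy case comes entirely from the two constant columns collapsing into a single run each; it says nothing about the 75% figure reported in the abstract, which depends on the actual protocol and workload.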
Xiaolin Zhang, Guang-Yue Cui, Li-Xin Liu, Weiliang Huo
The data model is one of the core problems in the field of XML data management, but research on managing uncertain data that supports multi-dimensional continuous random variables has so far been limited. An extended data model supporting multi-dimensional continuous random variables is proposed based on the existing model, so that continuous uncertain XML is no longer confined to expressing a single dimension. In addition, queries on joint probability and conditional probability over the model are defined. The query strategy can also choose an appropriate characteristic-value calculation method according to the type of continuous distribution, which greatly improves query processing efficiency.
{"title":"An Extended Continuous Uncertain XML Data Model Research","authors":"Xiaolin Zhang, Guang-Yue Cui, Li-Xin Liu, Weiliang Huo","doi":"10.1109/WISA.2012.38","DOIUrl":"https://doi.org/10.1109/WISA.2012.38","PeriodicalId":313228,"journal":{"name":"2012 Ninth Web Information Systems and Applications Conference","volume":"24 1","pages":"0"},"publicationDate":"2012-11-16","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"124006899"}
With the rapid development of E-commerce, people can easily get information from networks and customers have more choices, but this also brings problems. The vast amount of information increases the burden on customers: they have to browse more unrelated information and spend more time before purchasing. To solve this problem and guide customers' purchases in E-commerce, an automatic promotion system is needed to help customers. In this research, we discuss the traditional collaborative filtering algorithms and propose a new item clustering-based collaborative filtering approach (ICSCFA). The approach first clusters items by support to reduce the nearest-neighbour space, and then predicts ratings. Experiments show that the new approach improves the quality of clustering and effectively relieves the problem of the extremely sparse customer rating matrix, improving the prediction accuracy of the recommendation system.
{"title":"A New Item Clustering-Based Collaborative Filtering Approach","authors":"Hao-jun Sun, Tao Wu, Meijuan Yan, Yunxia Wu","doi":"10.1109/WISA.2012.30","DOIUrl":"https://doi.org/10.1109/WISA.2012.30","PeriodicalId":313228,"journal":{"name":"2012 Ninth Web Information Systems and Applications Conference","volume":"1 1","pages":"0"},"publicationDate":"2012-11-16","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"129022055"}
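The two steps named in the abstract above (restrict the neighbour space with item clusters, then predict a rating) can be sketched roughly as follows. This is a generic item-based collaborative filtering sketch, not ICSCFA itself: the clusters are assumed to be given in a hypothetical `cluster_of` mapping (the paper derives them from item support, which is not reproduced here), and cosine similarity is an assumed choice of similarity measure.

```python
from math import sqrt


def cosine(ratings, i, j):
    """Cosine similarity between items i and j over users who rated both."""
    common = [u for u in ratings if i in ratings[u] and j in ratings[u]]
    if not common:
        return 0.0
    dot = sum(ratings[u][i] * ratings[u][j] for u in common)
    ni = sqrt(sum(ratings[u][i] ** 2 for u in common))
    nj = sqrt(sum(ratings[u][j] ** 2 for u in common))
    if ni == 0 or nj == 0:
        return 0.0
    return dot / (ni * nj)


def predict(ratings, cluster_of, user, item):
    """Similarity-weighted average of the user's ratings, restricted to
    items in the same cluster as the target item. Returns None when the
    user has rated no other item in that cluster."""
    num = den = 0.0
    for other, rating in ratings[user].items():
        if other == item or cluster_of.get(other) != cluster_of.get(item):
            continue  # neighbours outside the cluster are never examined
        sim = cosine(ratings, item, other)
        num += sim * rating
        den += abs(sim)
    return num / den if den else None


# Illustrative data: user -> {item: rating}; cluster ids are assumed.
ratings = {
    "u1": {"A": 5, "B": 4, "X": 1},
    "u2": {"A": 4, "B": 5, "X": 2},
    "u3": {"B": 4, "X": 5},
}
cluster_of = {"A": 0, "B": 0, "X": 1}
print(predict(ratings, cluster_of, "u3", "A"))
```

Restricting the loop to one cluster is what shrinks the neighbour search; the trade-off is that a user with no ratings inside the cluster gets no prediction at all.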
Microblogs, a mixture of new media and social networks, are a hotbed of lurking users. Capturing profiles for a lurking user is meaningful in customized applications because a lurking user receives all messages but sends few messages in microblogs. However, it is difficult to capture such profiles because a lurking user lacks user-generated content. In this paper, we propose an approach that generates a lurking user's profiles from its followees' activities. In addition, we present a unified social context graph model to represent these activities, and the RWR algorithm is used to generate the profiles of the lurking user in this graph model. Extensive experiments show that our approach can effectively determine profiles for lurking users.
{"title":"Generating Profiles for a Lurking User by its Followees' Social Context in Microblogs","authors":"Zhao Zhang, Bin Zhao, Weining Qian, Aoying Zhou","doi":"10.1109/WISA.2012.37","DOIUrl":"https://doi.org/10.1109/WISA.2012.37","PeriodicalId":313228,"journal":{"name":"2012 Ninth Web Information Systems and Applications Conference","volume":"32 1","pages":"0"},"publicationDate":"2012-11-16","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"131368635"}
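Assuming RWR in the abstract above denotes random walk with restart (its usual expansion), the ranking step can be sketched as a power iteration that starts at the lurking user and repeatedly returns there with a fixed probability. The graph below is a made-up stand-in for the paper's social context graph, and the restart probability of 0.15 and the handling of dangling nodes are assumptions.

```python
def random_walk_with_restart(adjacency, seed, restart=0.15, tol=1e-10, max_iter=1000):
    """Return {node: score}, the stationary visiting probabilities of a walk
    that follows outgoing edges uniformly and jumps back to `seed` with
    probability `restart` at every step."""
    nodes = sorted(set(adjacency) | {n for nbrs in adjacency.values() for n in nbrs})
    scores = {n: 1.0 if n == seed else 0.0 for n in nodes}
    for _ in range(max_iter):
        nxt = {n: 0.0 for n in nodes}
        for node in nodes:
            nbrs = adjacency.get(node, [])
            if nbrs:
                share = (1.0 - restart) * scores[node] / len(nbrs)
                for nb in nbrs:
                    nxt[nb] += share
            else:
                # Dangling node (assumption): send its mass back to the seed.
                nxt[seed] += (1.0 - restart) * scores[node]
        nxt[seed] += restart  # total probability mass is 1, so this is the restart share
        delta = sum(abs(nxt[n] - scores[n]) for n in nodes)
        scores = nxt
        if delta < tol:
            break
    return scores


# Illustrative graph: the lurker follows two users, who are linked to topics.
graph = {
    "lurker": ["alice", "bob"],
    "alice": ["sports"],
    "bob": ["sports", "music"],
}
scores = random_walk_with_restart(graph, "lurker")
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```

Topic nodes that are reachable through more followees accumulate more probability mass, which is the intuition behind using the scores as a profile for a user who posts nothing.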
Recently, social tagging systems have become more and more popular in many Web 2.0 applications. In such systems, users are allowed to annotate a particular resource with a freely chosen set of tags. These user-generated tags can represent users' interests more concisely and in a way closer to human understanding. Interests change over time. Thus, how to describe users' interests and their interest transfer paths has become a big challenge for personalized recommendation systems. In this paper, we propose a variable-length time interval division algorithm and a user interest model based on time intervals. Then, in order to trace users' interest transfer paths over a specific time period, we propose an interest transfer model. After that, we apply a classical community partition algorithm to separate users into communities. Finally, we present a novel method to measure user similarity based on the interest transfer model and provide personalized tag recommendations according to similar users' interests in their next time intervals. Experimental results demonstrate that our approach achieves higher precision and recall than classical user-based collaborative filtering methods.
{"title":"An Approach for Personalized Tag Recommendation Based on Interest Transfer Model","authors":"Yue Liu, Nan Yang, Gang Yang","doi":"10.1109/WISA.2012.43","DOIUrl":"https://doi.org/10.1109/WISA.2012.43","PeriodicalId":313228,"journal":{"name":"2012 Ninth Web Information Systems and Applications Conference","volume":"142 1","pages":"0"},"publicationDate":"2012-11-16","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"124691319"}
Yao Zhang, Daling Wang, Shi Feng, Yifei Zhang, Fangling Leng
Traditional Web crawlers start from one or more URLs of initial Web pages, continuously extract new URLs, and then access the data of those pages. AJAX, one of the core technologies of Web 2.0, greatly enhances the response efficiency of Web applications, brings a good user experience, and has therefore been widely used. However, because AJAX techniques break the architecture of traditional Web pages, which is based on static pages, traditional Web crawlers cannot meet the challenges of dynamic partial refresh and asynchronous loading. In this paper, we propose an efficient approach for crawling the information in dynamic pages by analyzing script language, and we use a path repository and judge the page refreshing state to improve the accuracy and efficiency of the algorithm. Experimental evaluation shows the efficiency and effectiveness of our approach.
{"title":"An Approach for Crawling Dynamic WebPages Based on Script Language Analysis","authors":"Yao Zhang, Daling Wang, Shi Feng, Yifei Zhang, Fangling Leng","doi":"10.1109/WISA.2012.34","DOIUrl":"https://doi.org/10.1109/WISA.2012.34","PeriodicalId":313228,"journal":{"name":"2012 Ninth Web Information Systems and Applications Conference","volume":"22 1","pages":"0"},"publicationDate":"2012-11-16","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"131929613"}
With the wide application of data mining, many applications need to use past and current data to predict the future state of the data. In view of this, we propose a new method, AMFP-Stream, for predicting frequent patterns over data streams efficiently and effectively. The AMFP-Stream algorithm predicts item sets that have a high potential to become frequent in subsequent time windows to meet users' needs. First, the algorithm converts the data to a 0-1 matrix. Then it updates the associated matrix by tailoring the matrix and applying bit operations, from which frequent item sets can also be mined. Finally, it uses the current data to predict frequent item sets that may appear in the next windows. Experimental results show that the AMFP-Stream algorithm can predict frequent item sets under different experimental conditions, so the algorithm is feasible.
{"title":"An Algorithm for Predicting Frequent Patterns over Data Streams Based on Associated Matrix","authors":"Yong-gong Ren, Zhiqiang Hu, Jian Wang","doi":"10.1109/WISA.2012.40","DOIUrl":"https://doi.org/10.1109/WISA.2012.40","PeriodicalId":313228,"journal":{"name":"2012 Ninth Web Information Systems and Applications Conference","volume":"5 1","pages":"0"},"publicationDate":"2012-11-16","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"130455559"}
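The first two steps in the abstract above (a 0-1 matrix and bit operations) can be illustrated with a small sketch in which each item's matrix column is stored as an integer bitmask over the transactions of a window. This is a generic bitmap support count, not the AMFP-Stream associated matrix or its prediction step; the names `build_bit_columns` and `support` are hypothetical.

```python
def build_bit_columns(transactions):
    """Return {item: int}; bit t of the integer is 1 when transaction t
    contains the item (one column of the 0-1 matrix per item)."""
    columns = {}
    for t, items in enumerate(transactions):
        for item in items:
            columns[item] = columns.get(item, 0) | (1 << t)
    return columns


def support(columns, itemset):
    """Number of transactions containing every item of `itemset`,
    computed by AND-ing the item columns and counting the set bits."""
    items = list(itemset)
    if not items:
        return 0
    mask = columns.get(items[0], 0)
    for item in items[1:]:
        mask &= columns.get(item, 0)
    return bin(mask).count("1")


# Illustrative window of four transactions.
window = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
columns = build_bit_columns(window)
print(support(columns, {"a", "b"}))
```

An item set is frequent in the window when its support reaches a chosen threshold; the bitwise AND replaces a scan over the transactions for every candidate.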
Evaluation of Deep Web data sources must be based on the data in the Web databases. A major difficulty is how to select the most representative keywords as query words in order to obtain a large amount of uniformly distributed data. This paper proposes a Deep Web database sampling method based on high-correlation keywords, which uses a graph-based keyword-connected network to obtain query words. The method can obtain a high-quality random sample of data from a Deep Web data source more efficiently.
{"title":"A Deep Web Database Sampling Method Based on High Correlation Keywords","authors":"Yongqing Zheng, Yufang Bian, Xin Du, Hongchen Wu","doi":"10.1109/WISA.2012.25","DOIUrl":"https://doi.org/10.1109/WISA.2012.25","PeriodicalId":313228,"journal":{"name":"2012 Ninth Web Information Systems and Applications Conference","volume":"13 1","pages":"0"},"publicationDate":"2012-11-16","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"115468305"}
RFID technologies are applied extensively in Cyber-Physical Systems (CPS). An RFID system collects, filters, and integrates the large volume of events gathered continuously by readers in order to process composite event detection requests from applications. When the system processes many composite events, sharing detection work is important for their execution and for the performance of the system. In this paper, we propose a composite event detection approach based on similar sub-events for RFID event streams. To achieve this, we introduce the concept of a small event by analyzing different composite events and the relationships between operators, give the rules and properties of composite event rewriting, and present an approach to small event sharing together with an implementation strategy for sharing similar sub-events. Finally, we demonstrate the effectiveness of our approach through a detailed performance analysis of our algorithm implementation and a comparison with a typical detection algorithm.
{"title":"A Composite Events Detecting Approach Based on Similar Sub-events","authors":"Baoyan Song, Huizhen Lou, Yan Wang","doi":"10.1109/WISA.2012.36","DOIUrl":"https://doi.org/10.1109/WISA.2012.36","PeriodicalId":313228,"journal":{"name":"2012 Ninth Web Information Systems and Applications Conference","volume":"18 1","pages":"0"},"publicationDate":"2012-11-16","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"131107201"}
Caching frequently accessed and infrequently updated data items can improve the performance of mobile data management. In this paper, a cache replacement policy called Update-based Minimal Access Cost Replacement (UMACR) is proposed to remedy the defects of existing cache replacement policies such as GDSF and OUR. UMACR takes into account a variety of factors, such as the sizes of data items and data access and update information. To facilitate the replacement policy, two enhanced cache access policies, Update-Server-based Poll-Each-Read (USBPER) and Update-Client-based Call-Back (U2CB), are introduced to guarantee the consistency of data items. USBPER and U2CB remedy the defects of existing cache access policies such as UPER and UCB by considering updates that happen at both the server and the clients. Conflict detection and handling are also implemented at the server. Finally, we conducted extensive simulation experiments on mobile data management, and the results demonstrate that the proposed policies are effective, laying a solid foundation for further research.
{"title":"Research of Cache Mechanism in Mobile Data Management","authors":"Yang Jun, Lu Shan, Xu Lizhen","doi":"10.1109/WISA.2012.24","DOIUrl":"https://doi.org/10.1109/WISA.2012.24","PeriodicalId":313228,"journal":{"name":"2012 Ninth Web Information Systems and Applications Conference","volume":"50 1","pages":"0"},"publicationDate":"2012-11-16","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"114840954"}
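For context on the baseline named in the abstract above, here is a minimal sketch of GDSF (Greedy-Dual-Size-Frequency) replacement, in which each cached item gets the priority `clock + frequency * cost / size` and the item with the lowest priority is evicted. This is the baseline only, not UMACR: the update information that UMACR adds is not modelled, and the class name `GDSFCache` and its `access` method are hypothetical.

```python
class GDSFCache:
    """Size-aware cache that evicts the entry with the lowest GDSF priority."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.clock = 0.0   # inflation value, raised on every eviction
        self.entries = {}  # key -> [size, cost, frequency, priority]

    def _priority(self, size, cost, freq):
        return self.clock + freq * cost / size

    def access(self, key, size, cost=1.0):
        """Return True on a hit; on a miss, insert the item (if it can fit
        at all) after evicting low-priority entries, and return False."""
        if key in self.entries:
            entry = self.entries[key]
            entry[2] += 1
            entry[3] = self._priority(entry[0], entry[1], entry[2])
            return True
        if size > self.capacity:
            return False  # item is larger than the whole cache
        while self.used + size > self.capacity:
            victim = min(self.entries, key=lambda k: self.entries[k][3])
            self.clock = self.entries[victim][3]  # age the remaining entries
            self.used -= self.entries[victim][0]
            del self.entries[victim]
        self.entries[key] = [size, cost, 1, self._priority(size, cost, 1)]
        self.used += size
        return False


cache = GDSFCache(capacity=10)
cache.access("big", size=8)
cache.access("small", size=2)
cache.access("small", size=2)   # hit: raises the priority of "small"
cache.access("new", size=4)     # forces an eviction
print(sorted(cache.entries))
```

Dividing by size favours keeping many small items, and multiplying by frequency protects popular ones; neither term reacts to how often an item is updated, which is the gap the abstract says UMACR addresses.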