Pub Date: 2010-10-01 | DOI: 10.4108/ICST.COLLABORATECOM.2010.20
Yuehua Wang, Ling Liu, C. Pu, Gong Zhang
Multicast is a common platform for supporting group communication applications such as IPTV, multimedia content delivery, and location-based advertisements. Distributed hash table (DHT) based overlay networks such as Chord and CAN present a popular distributed computing architecture for multicast applications. However, existing research efforts have been mostly dedicated to efficient message delivery techniques that alleviate the influence of network dynamics on geo-distance-based routing, such as reducing the delivery path length or optimizing routing paths by exploiting network locality. In this paper, we argue that the geo-distance-based routing protocols used in existing overlay networks are inefficient for multicast applications in terms of both resource use and adaptation to the environment. We devise a utility-driven routing scheme that improves routing efficiency through three unique features. First, our utility function is defined as a careful combination of hop count and routing path latency. Second, we use CAN-like routing as an example and extend it by utilizing shortcuts to reduce the routing path length and by introducing a utility function that combines path latency with a geo-distance-based metric to determine a near-optimal route for each routing request. Third, and most importantly, our utility function uses a tunable influence parameter that allows nodes to adaptively make the most promising routing decision according to their specific network state and circumstances, such as overlay connectivity and next-hop latency. Our experimental evaluation shows that the utility-driven routing scheme is highly scalable and efficient compared to existing geo-distance-based routing protocols, and demonstrates that combining shortcuts and path latency with geo-distance can effectively enhance multicast delivery efficiency for large-scale group communication applications.
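The next-hop decision described in the abstract can be sketched as follows. The linear utility form, the `alpha` weight, and all identifiers are illustrative assumptions, not the paper's exact function; the paper's function combines hop counts and routing path latency through a tunable influence parameter.

```python
# Hypothetical sketch of a utility-driven next-hop decision: a tunable
# weight alpha trades geographic progress against measured link latency.

def utility(distance_to_dest, latency_ms, alpha):
    """Higher is better: prefer neighbors closer to the destination,
    penalized by the latency of reaching them."""
    return -(alpha * distance_to_dest + (1.0 - alpha) * latency_ms)

def choose_next_hop(neighbors, alpha=0.5):
    """neighbors: list of (node_id, distance_to_dest, latency_ms) tuples."""
    return max(neighbors, key=lambda n: utility(n[1], n[2], alpha))[0]

neighbors = [("A", 120.0, 40.0), ("B", 80.0, 90.0), ("C", 100.0, 30.0)]
print(choose_next_hop(neighbors, alpha=1.0))  # B: pure geo-distance routing
print(choose_next_hop(neighbors, alpha=0.0))  # C: pure latency routing
```

Letting each node tune `alpha` from its observed overlay connectivity and next-hop latency is what makes the decision adaptive to local conditions.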
Title: An utility-driven routing scheme for scaling multicast applications. Published in: 6th International Conference on Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom 2010).
Pub Date: 2010-10-01 | DOI: 10.4108/ICST.COLLABORATECOM.2010.30
Jun Li, Shuang Yang, Xin Wang
Distributed storage systems provide reliable storage service by storing data, with a certain amount of redundancy, across a substantial number of storage nodes. To compensate for the data loss incurred by node failures, the lost data must be regenerated. Tree-structured regeneration, in which storage nodes may relay the network traffic, has shown its potential to improve the efficiency of the regeneration process in networks with symmetric links. In this paper, we consider tree-structured regeneration in networks with asymmetric links and analyze the expected time spent during regeneration. Moreover, we further reduce the regeneration time by constructing multiple parallel regeneration trees. We propose two optimal algorithms with polynomial time complexity that construct multiple edge-disjoint and multiple edge-sharing parallel regeneration trees, respectively. We evaluate our algorithms via simulation using real data measured on PlanetLab. The simulation results show that multiple parallel regeneration trees can reduce the regeneration time by 75% while keeping file availability above 98%.
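A minimal sketch of the single-tree intuition behind relay-based regeneration: attach each provider over the widest available path so the tree's bottleneck bandwidth is maximized. This Prim-style bottleneck heuristic and all names are our own illustrative stand-ins; the paper's actual contributions are the multiple edge-disjoint and edge-sharing tree constructions, which this sketch does not implement.

```python
# Illustrative sketch: build one regeneration tree rooted at the newcomer,
# greedily attaching the node reachable over the widest (highest-bandwidth)
# edge, so relaying can beat a slow direct link.
import heapq

def widest_regeneration_tree(bandwidth, root):
    """bandwidth: dict mapping (u, v) -> bandwidth of the asymmetric link
    u uses to send toward v. Returns {node: parent} rooted at `root`."""
    nodes = {u for u, _ in bandwidth} | {v for _, v in bandwidth}
    parent, seen = {}, {root}
    # Max-heap on bandwidth (negated, since heapq is a min-heap).
    heap = [(-bw, u, v) for (u, v), bw in bandwidth.items() if v == root]
    heapq.heapify(heap)
    while heap and len(seen) < len(nodes):
        neg_bw, u, v = heapq.heappop(heap)
        if u in seen:
            continue
        seen.add(u)
        parent[u] = v  # u relays its repair data toward v
        for (x, y), bw in bandwidth.items():
            if y == u and x not in seen:
                heapq.heappush(heap, (-bw, x, y))
    return parent

# Newcomer "N"; B's direct link to N (3) is slower than relaying via A (8).
links = {("A", "N"): 10, ("B", "N"): 3, ("B", "A"): 8,
         ("C", "B"): 5, ("C", "N"): 2}
print(widest_regeneration_tree(links, "N"))  # B attaches via A, C via B
```

The example shows why relaying helps under asymmetric links: B joins through A at bandwidth 8 instead of using its direct bandwidth-3 link.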
Title: Building parallel regeneration trees in distributed storage systems with asymmetric links.
Pub Date: 2010-10-01 | DOI: 10.4108/ICST.COLLABORATECOM.2010.5
Yuanchen He, Zhenyu Zhong, S. Krasser, Yuchun Tang
Millions of new domains are registered every day, and many of them are malicious. Keeping track of malicious domains through Web content analysis alone is challenging due to the sheer number of domains. One interesting pattern in legitimate domain names is that many of them consist of English words or look like meaningful English, while many malicious domain names are randomly generated and do not include meaningful words. We show that it is possible to transform this intuitive observation into statistically informative features using second-order Markov models. Four transition matrices are built: from known legitimate domain names, from known malicious domain names, from English words in a dictionary, and from a uniform distribution. The probabilities from these Markov models, as well as other features extracted from DNS data, are used to build a Random Forest classifier. The experimental results demonstrate that our system can quickly catch malicious domains with a low false positive rate.
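The second-order Markov idea can be sketched as a trigram character model: train on known names, then score a new domain by its average transition log-probability, so gibberish names score lower than meaningful-looking ones. The tiny training corpus, the smoothing constant, and the padding convention are illustrative assumptions; the paper builds four such transition matrices and feeds the resulting probabilities (with other DNS features) to a Random Forest.

```python
# Sketch of a second-order (trigram) character Markov model for
# scoring domain names by how "English-like" they look.
from collections import defaultdict
from math import log

def train(names):
    counts = defaultdict(lambda: defaultdict(int))  # (2-char state) -> next char
    totals = defaultdict(int)
    for name in names:
        padded = "^^" + name + "$"  # start/end markers
        for i in range(len(padded) - 2):
            state, nxt = padded[i:i + 2], padded[i + 2]
            counts[state][nxt] += 1
            totals[state] += 1
    return counts, totals

def avg_log_prob(name, model, alphabet=38):
    """Average transition log-probability with Laplace smoothing
    (alphabet ~ 26 letters + 10 digits + hyphen + end marker)."""
    counts, totals = model
    padded = "^^" + name + "$"
    lp, n = 0.0, 0
    for i in range(len(padded) - 2):
        state, nxt = padded[i:i + 2], padded[i + 2]
        lp += log((counts[state][nxt] + 1) / (totals[state] + alphabet))
        n += 1
    return lp / n

model = train(["google", "facebook", "amazon", "wikipedia", "youtube"])
# A near-English name scores higher than random gibberish:
print(avg_log_prob("goole", model) > avg_log_prob("xkqzvw", model))  # True
```

In the paper's setting, each domain would be scored under all four matrices, and the four probabilities become classifier features rather than being thresholded directly.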
Title: Mining DNS for malicious domain registrations.
Pub Date: 2010-03-30 | DOI: 10.4108/ICST.COLLABORATECOM.2010.4
Aditi Gupta, Salmin Sultana, Michael S. Kirkpatrick, E. Bertino
As the use of peer-to-peer (P2P) services for distributed file sharing has grown, the need for fine-grained access control (FGAC) has emerged. Existing access control frameworks use an all-or-nothing approach that is inadequate for sensitive content shared by multiple users. In this paper, we propose an FGAC mechanism based on selective encryption techniques. Using this approach, the owner of a file specifies access control policies over various byte ranges in the file. The separate byte ranges are then encrypted and signed with different keys. Users of the file receive the encryption keys only for the ranges they are authorized to read, and the signing keys only for the ranges they are authorized to write. We also propose an optional enhancement of the scheme in which a file owner can hide the location of the file. Our approach includes a key distribution scheme based on a public key infrastructure (PKI) and access control vectors. We also discuss how policy changes and file modifications are handled in our scheme. We have integrated our FGAC mechanism with the Chord structured P2P network; we discuss relevant issues concerning the implementation and integration with Chord and present performance results for our prototype implementation.
Title: A selective encryption approach to fine-grained access control for P2P file sharing.
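The per-range idea can be illustrated with a toy construction: each byte range gets its own encryption and signing keys, so a user holding only one range's keys can read and verify only that range. This is a deliberately simplified stand-in, not the paper's scheme: the SHA-256 counter keystream replaces a real cipher, the symmetric HMAC replaces proper signatures, and PKI-based key distribution is omitted entirely.

```python
# Toy sketch of selective per-byte-range encryption and signing.
# NOT the paper's construction; for illustration only.
import hashlib, hmac, secrets

def keystream(key, length):
    """Derive a pseudorandom keystream by hashing key || counter."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def protect(data, ranges):
    """ranges: list of (start, end). Encrypt and MAC each range under
    its own fresh keys; return (ciphertext, per-range keys, per-range tags)."""
    ct = bytearray(data)
    keys, sigs = [], []
    for start, end in ranges:
        enc_key, sig_key = secrets.token_bytes(16), secrets.token_bytes(16)
        ks = keystream(enc_key, end - start)
        ct[start:end] = bytes(a ^ b for a, b in zip(data[start:end], ks))
        sigs.append(hmac.new(sig_key, bytes(ct[start:end]), hashlib.sha256).digest())
        keys.append((enc_key, sig_key))
    return bytes(ct), keys, sigs

def read_range(ct, ranges, keys, idx):
    """Decrypt a single range; requires only that range's encryption key."""
    start, end = ranges[idx]
    ks = keystream(keys[idx][0], end - start)
    return bytes(a ^ b for a, b in zip(ct[start:end], ks))

data = b"headersecret"
ranges = [(0, 6), (6, 12)]
ct, keys, sigs = protect(data, ranges)
print(read_range(ct, ranges, keys, 1))  # b'secret'
```

Handing a user only `keys[1]` grants read/write on the second range while the first stays opaque, which is the access-granularity property the scheme is after.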
Pub Date: 2008-04-07 | DOI: 10.4108/ICST.COLLABORATECOM.2010.2
F. Kashani, C. Shahabi
In this paper, we propose an efficient sample-based approach for answering fixed-precision approximate continuous aggregate queries in peer-to-peer databases. First, we define practical semantics for formulating fixed-precision approximate continuous aggregate queries. Second, we propose “Digest”, a two-tier system for correct and efficient query answering by sampling. At the top tier, we develop a query evaluation engine that uses samples collected from the peer-to-peer database to continually estimate the running result of the approximate continuous aggregate query with guaranteed precision. For efficient query evaluation, we propose an extrapolation algorithm that predicts the evolution of the running result and adapts the frequency of the continual sampling occasions accordingly to avoid redundant samples. We also introduce a repeated sampling algorithm that draws on the correlation between samples at successive sampling occasions and exploits linear regression to minimize the number of samples drawn at each occasion. At the bottom tier, we introduce a distributed sampling algorithm for random sampling (uniform and nonuniform) from peer-to-peer databases with arbitrary network topology and tuple distribution. Our sampling algorithm is based on the Metropolis Markov Chain Monte Carlo method, which guarantees randomness of the sample with arbitrarily small variation distance from the desired distribution while remaining comparable to optimal sampling in sampling cost/time. We evaluate the efficiency of Digest via simulation using real data.
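The bottom-tier idea can be sketched with the standard Metropolis random walk for uniform sampling on a graph: from the current peer, propose a random neighbor and accept the move with probability min(1, deg(current)/deg(neighbor)), which makes the walk's stationary distribution uniform over peers regardless of topology. The toy graph and walk length are illustrative assumptions; Digest's actual algorithm also supports nonuniform target distributions and a distributed setting.

```python
# Sketch of Metropolis-Hastings random-walk sampling for (near-)uniform
# peer selection on an arbitrary graph topology.
import random

def metropolis_sample(adj, start, steps, rng):
    """One walk of `steps` moves over adjacency dict `adj`; returns the
    final node. Degree-ratio acceptance yields a uniform stationary
    distribution."""
    node = start
    for _ in range(steps):
        candidate = rng.choice(adj[node])
        if rng.random() < min(1.0, len(adj[node]) / len(adj[candidate])):
            node = candidate
    return node

# Star-plus-ring topology: a naive random walk would oversample hub "h".
adj = {"h": ["a", "b", "c", "d"],
       "a": ["h", "b"], "b": ["a", "h", "c"],
       "c": ["b", "h", "d"], "d": ["c", "h"]}
rng = random.Random(42)
counts = {}
for _ in range(5000):
    s = metropolis_sample(adj, "h", 50, rng)
    counts[s] = counts.get(s, 0) + 1
print(counts)  # roughly 1000 per node: near-uniform despite the hub
```

The degree-ratio correction is what removes the usual random-walk bias toward high-degree peers, matching the abstract's claim of arbitrarily small variation distance from the desired distribution.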
Title: Fixed-precision approximate continuous aggregate queries in peer-to-peer databases.