Despite the broad adoption of cloud computing, some applications and services still cannot benefit from this popular computing paradigm because of inherent limitations of the cloud, such as unacceptable latency and the lack of mobility support and location awareness. As a result, fog computing has emerged as a promising infrastructure that provides elastic resources at the edge of the network. In this paper, we discuss current definitions of fog computing and similar concepts, and propose a more comprehensive definition. We also analyze the goals of and challenges in building a fog computing platform, and present a platform design along with several exemplar applications. Finally, we implement and evaluate a prototype fog computing platform.
Shanhe Yi, Zijiang Hao, Zhengrui Qin, Qun A. Li. "Fog Computing: Platform and Applications." In 2015 Third IEEE Workshop on Hot Topics in Web Systems and Technologies (HotWeb), November 12, 2015. doi:10.1109/HotWeb.2015.22
Caching is an effective optimization in large-scale web search engines: it reduces the I/O burden on the underlying storage systems by exploiting cache locality. Result caches and posting list caches are widely used, but they perform poorly on long queries, and existing intersection cache policies are inefficient and offer little flexibility across applications. In this paper, we analyze the characteristics of query term intersections in typical search engines and present a novel three-level cache architecture, called TLMCA, which combines the intersection cache, result cache, and posting list cache in memory. In TLMCA, we introduce an intersection cache data selection policy based on Top-N frequent itemset mining, and design an intersection cache data replacement policy based on incremental frequent itemset mining. The experimental results demonstrate that the proposed intersection cache selection and replacement policies used in TLMCA can improve retrieval performance by up to 27% compared to a two-level cache.
Wanwan Zhou, Ruixuan Li, Xinhua Dong, Zhiyong Xu, Weijun Xiao. "An Intersection Cache Based on Frequent Itemset Mining in Large Scale Search Engines." In 2015 Third IEEE Workshop on Hot Topics in Web Systems and Technologies (HotWeb), November 12, 2015. doi:10.1109/HotWeb.2015.17
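As a rough illustration of the selection idea, here is a minimal Python sketch (with a made-up query log) that mines the most frequent term pairs, i.e. 2-itemsets, from queries and nominates them for the intersection cache; the paper's actual policy mines Top-N frequent itemsets more generally:

```python
from collections import Counter
from itertools import combinations

def select_intersection_cache(query_log, n):
    """Pick the top-n term pairs whose posting-list intersections are worth
    caching, ranked by how often the pair co-occurs across queries."""
    pair_counts = Counter()
    for query in query_log:
        terms = sorted(set(query.split()))
        # Every term pair in a query is a candidate 2-itemset.
        pair_counts.update(combinations(terms, 2))
    return [pair for pair, _ in pair_counts.most_common(n)]

log = [
    "cheap flight paris",
    "cheap flight rome",
    "cheap hotel paris",
    "flight paris museum",
]
print(select_intersection_cache(log, 2))
```

An incremental variant, as in the paper's replacement policy, would update `pair_counts` as new queries stream in rather than recomputing from scratch.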
When leveraging the crowd to perform complex tasks, it is imperative to identify the most effective worker for a particular job. Demographic profiles provided by workers, skill self-assessments by workers, and past performance as captured by employers all represent viable data points available within labor markets. Employers often question the validity of a worker's self-assessment of skills and expertise level when selecting workers in the context of other information. More specifically, employers would like to answer the question, "Is worker confidence a predictor of quality?" In this paper, we discuss the state of the art in recommending crowd workers based on assessment information. A major contribution of our work is an architecture, platform, and push/pull process for categorizing and recommending workers based on available self-assessment information. We present a study exploring the validity of skills input by workers in light of their actual performance and other metrics captured by employers. A further contribution of this approach is the extrapolation from a body of workers to describe the nature of the community more broadly. Through experimentation within the language-processing domain, we demonstrate a new capability of deriving trends that might help future employers select appropriate workers.
Julian Jarrett, Larissa Ferreira da Silva, Laerte Mello, Sadallo Andere, Gustavo Cruz, M. Brian Blake. "Self-Generating a Labor Force for Crowdsourcing: Is Worker Confidence a Predictor of Quality?" In 2015 Third IEEE Workshop on Hot Topics in Web Systems and Technologies (HotWeb), November 12, 2015. doi:10.1109/HotWeb.2015.9
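The question "is confidence a predictor of quality?" can be posed statistically as a correlation between self-assessed skill and employer-measured quality. A minimal sketch with hypothetical numbers (not the paper's data):

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation; values near +1 mean self-assessed
    confidence tracks measured quality closely."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical data: self-assessed skill (1-5) vs. employer-rated quality (0-1).
confidence = [5, 4, 3, 2, 5, 1]
quality = [0.9, 0.8, 0.6, 0.5, 0.7, 0.3]
print(round(pearson(confidence, quality), 2))
```

A recommender built on such data would only trust self-assessments for which this correlation holds up across the worker community.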
Responsiveness is increasingly important for web servers to interact seamlessly with end users and enhance the user experience. In this paper, we study how different server architectures -- asynchronous and thread-based -- affect the responsiveness of web servers under high-concurrency workloads. Through extensive measurements with a standard web server benchmarking tool (ApacheBench), we show that web servers with an asynchronous architecture achieve much better tail latency than thread-based ones, owing to their robustness under high-concurrency workloads. Our fine-grained timeline analysis shows that a thread-based server is fragile under high concurrency because its effective queue size is limited (e.g., by the thread pool size); once the queue overflows, TCP retransmissions cause some requests to experience very long response times. On the other hand, if we configure a thread-based server with a thread pool large enough to avoid queue overflow, its maximum achievable throughput can be significantly lower than that of the asynchronous version because of multi-threading overhead. Our initial results suggest that asynchronous architectures should be considered when building highly responsive and robust web applications that involve hundreds of servers in cloud data centers.
Qingwen Fan, Qingyang Wang. "Performance Comparison of Web Servers with Different Architectures: A Case Study Using High Concurrency Workload." In 2015 Third IEEE Workshop on Hot Topics in Web Systems and Technologies (HotWeb), November 12, 2015. doi:10.1109/HotWeb.2015.11
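The robustness argument can be seen in miniature: in an asynchronous server each connection is a lightweight coroutine rather than a pooled OS thread, so a burst of concurrent connections does not overflow a fixed-size pool. A small asyncio sketch (not the servers benchmarked in the paper) serving 50 concurrent requests:

```python
import asyncio

async def handle(reader, writer):
    # Each connection is a cheap coroutine, not a dedicated OS thread,
    # so a large backlog of slow clients cannot exhaust a thread pool.
    await reader.readline()
    writer.write(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
    await writer.drain()
    writer.close()
    await writer.wait_closed()

async def demo():
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    async def request():
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        writer.write(b"GET / HTTP/1.1\r\n")
        await writer.drain()
        body = await reader.read()  # read until the server closes
        writer.close()
        await writer.wait_closed()
        return body.endswith(b"ok")

    # Fire 50 concurrent requests; each is just another coroutine.
    results = await asyncio.gather(*[request() for _ in range(50)])
    server.close()
    await server.wait_closed()
    return all(results)

print(asyncio.run(demo()))
```

A thread-based equivalent would need 50 live threads (or queue the excess), which is exactly the fragility the timeline analysis points at.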
Ultraviolet (UV) radiation has a great impact on human health. Today, the public mostly obtains information about UV radiance from weather forecasts, but forecasts only provide rough, averaged predictions for a large region. Since the CMOS sensors in mobile phone cameras are very sensitive to UV, mobile phones can be ideal instruments for measuring UV radiance. This paper introduces a method that measures UV radiance using only mobile phone cameras. In addition, by utilizing fog computing, readings can be gathered and corrected locally through a fog server to provide relatively accurate UV measurements. We first discuss the advantages of using mobile phones for UV measurement, then present the theoretical foundations in detail, and describe a procedure that can be implemented on mobile phones. Based on this procedure, we developed an Android app called UV Meter and verified it under different weather conditions. The results show that the procedure is valid and can easily be implemented on mobile phones for everyday UV measurement.
Bo Mei, Wei Cheng, Xiuzhen Cheng. "Fog Computing Based Ultraviolet Radiation Measurement via Smartphones." In 2015 Third IEEE Workshop on Hot Topics in Web Systems and Technologies (HotWeb), November 12, 2015. doi:10.1109/HotWeb.2015.16
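The pipeline the abstract describes -- per-phone sensor readings corrected at a fog server -- can be sketched as follows. The dark level and gain here are invented placeholders, not the paper's calibration curve:

```python
def estimate_uv_index(pixel_means, dark_level=2.0, gain=0.05):
    """Hypothetical linear calibration: mean CMOS response minus a dark
    level, scaled to a UV-index-like number. The actual paper derives its
    calibration from theory and field verification, not these constants."""
    responses = [max(p - dark_level, 0.0) * gain for p in pixel_means]
    # Fog-server-style correction: average readings from nearby phones to
    # smooth out per-device noise before reporting a local UV index.
    return sum(responses) / len(responses)

# Mean pixel values reported by three nearby phones (made-up numbers).
readings = [82.0, 78.5, 85.0]
print(round(estimate_uv_index(readings), 2))
```

The point of the fog layer is that this averaging happens near the phones, so each user gets a locally corrected value rather than a region-wide forecast.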
During the past decade, the Web has become increasingly popular and thus more important for the delivery of content and services over the Internet. At the same time, the number of requested objects, their sizes, and the delivery mechanisms of popular websites have become more complex. This has various implications, including an impact on page loading time that directly affects the experience of visiting users. It is therefore important to capture and characterize the complexity of popular web pages. An earlier study by Butkiewicz et al. characterized the complexity of 1700 popular pages in 2011. In this study, we adopt the methodology proposed by Butkiewicz et al., develop the required tools, and conduct a detailed measurement study to re-assess the complexity of 2000 popular web pages and report the trends in their complexity characteristics over the past four years. Our results show that the number of requested objects and contacted servers per website has increased significantly, and that a growing share of contacted servers belong to third parties. Despite these changes, page loading time remains largely unchanged and is primarily affected by the same key parameters. Overall, our results shed useful light on trends in website complexity and motivate a range of issues to be explored.
Ran Tian, R. Rejaie. "Re-Examining the Complexity of Popular Websites." In 2015 Third IEEE Workshop on Hot Topics in Web Systems and Technologies (HotWeb), November 12, 2015. doi:10.1109/HotWeb.2015.23
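The two headline metrics -- objects requested and third-party servers contacted -- can be computed directly from a page's request URLs. A sketch using a naive last-two-labels domain heuristic (a real measurement study would use the public-suffix list):

```python
from urllib.parse import urlparse

def page_complexity(page_origin, object_urls):
    """Summarize a page by: number of requested objects, distinct contacted
    servers, and how many servers are third-party (hostname does not share
    the page's registered domain; naive two-label heuristic)."""
    def reg_domain(host):
        return ".".join(host.split(".")[-2:])
    origin = reg_domain(urlparse(page_origin).hostname)
    hosts = {urlparse(u).hostname for u in object_urls}
    third_party = {h for h in hosts if reg_domain(h) != origin}
    return {"objects": len(object_urls),
            "servers": len(hosts),
            "third_party_servers": len(third_party)}

urls = [
    "https://www.example.com/index.html",
    "https://static.example.com/app.js",
    "https://cdn.adnetwork.net/ad.js",
    "https://metrics.tracker.io/beacon.gif",
]
print(page_complexity("https://www.example.com/", urls))
```

Running such a summary over a crawl of popular pages, year over year, is what surfaces the third-party growth trend the study reports.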
We describe a set of techniques to infer structured descriptions of web APIs from usage examples. Using trained classifiers, we identify fixed and variable segments in paths, and tag parameters according to their types. We implemented our techniques and evaluated their precision on 10 APIs for which we obtained: 1) descriptions, manually written by the API maintainers, and 2) server logs of the API usage. Our experiments show that our system is able to reconstruct the structure of both simple and complex web API descriptions, outperforming an existing tool with similar goals. Finally, we assess the impact of noise in the input data on the results of our method.
Philippe Suter, Erik Wittern. "Inferring Web API Descriptions from Usage Data." In 2015 Third IEEE Workshop on Hot Topics in Web Systems and Technologies (HotWeb), November 12, 2015. doi:10.1109/HotWeb.2015.19
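The core inference step -- deciding which path segments are fixed and which are variable parameters -- can be approximated with a frequency heuristic, standing in for the trained classifiers the paper actually uses:

```python
from collections import defaultdict

def infer_template(paths, threshold=0.5):
    """Align path segments by position; a position with a high ratio of
    distinct observed values is treated as a variable parameter, otherwise
    as a fixed segment. Assumes same-length paths for simplicity."""
    by_pos = defaultdict(set)
    for p in paths:
        for i, seg in enumerate(p.strip("/").split("/")):
            by_pos[i].add(seg)
    template = []
    for i in sorted(by_pos):
        values = by_pos[i]
        if len(values) / len(paths) > threshold:
            template.append("{param%d}" % i)  # many distinct values: variable
        else:
            template.append(values.pop())     # one dominant value: fixed
    return "/" + "/".join(template)

logged = ["/users/17/orders", "/users/42/orders", "/users/99/orders"]
print(infer_template(logged))
```

The paper's classifiers additionally tag each inferred parameter with a type (numeric id, hash, date, and so on), which a heuristic like this cannot do reliably.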
Malicious webpages are today one of the most prevalent threats in the Internet security landscape. To understand this problem, there have been several efforts to analyze, classify, and label malicious webpages, ranging from simple static techniques to more elaborate dynamic ones. Building on such efforts, this paper summarizes our work on the design and evaluation of a system that applies machine learning techniques to network metadata to identify malicious webpages and classify them into broader classes of vulnerabilities. The system uses easy-to-interpret features, uniquely acquired dynamic network artifacts, and known labels for webpages rendered in a sandboxed environment. We report on the successes (and failures) of our system, and suggest open directions for practical malicious web content classification.
Aziz Mohaisen. "Towards Automatic and Lightweight Detection and Classification of Malicious Web Contents." In 2015 Third IEEE Workshop on Hot Topics in Web Systems and Technologies (HotWeb), November 12, 2015. doi:10.1109/HotWeb.2015.20
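A sketch of the metadata-feature approach, with invented feature names and hand-set weights standing in for the trained model the paper evaluates:

```python
def extract_features(meta):
    """Easy-to-interpret network-metadata features of the kind advocated in
    the paper; the names and thresholds here are illustrative, not its set."""
    return {
        "many_redirects": meta["redirects"] > 3,
        "ip_url": meta["host_is_ip"],
        "young_domain": meta["domain_age_days"] < 30,
        "many_hosts": meta["distinct_hosts"] > 10,
    }

def score(meta, weights=None):
    # A trained classifier would learn these weights; fixed values are a
    # stand-in so the scoring step is concrete.
    weights = weights or {"many_redirects": 2, "ip_url": 3,
                          "young_domain": 2, "many_hosts": 1}
    feats = extract_features(meta)
    return sum(w for name, w in weights.items() if feats[name])

benign = {"redirects": 1, "host_is_ip": False,
          "domain_age_days": 2000, "distinct_hosts": 4}
shady = {"redirects": 6, "host_is_ip": True,
         "domain_age_days": 5, "distinct_hosts": 14}
print(score(benign), score(shady))
```

Because every feature is a human-readable predicate, an analyst can see *why* a page scored high, which is the interpretability advantage the abstract emphasizes.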
Because demand for mobile data will outgrow the capacity of wireless networks for a long time, mobile applications should employ aggressive data caching and preloading to reduce the end users' data cost. This paper examines the algorithms that individual mobile applications can use for caching and preloading decisions, and proposes mechanisms for mobile operating systems to support such application-controlled caching and preloading. The paper ends with a call for data and tools to enable future research in the area.
P. Cao. "Opportunities and Challenges for Caching and Prefetching on Mobile Devices." In 2015 Third IEEE Workshop on Hot Topics in Web Systems and Technologies (HotWeb), November 12, 2015. doi:10.1109/HotWeb.2015.18
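Application-controlled caching with explicit preloading, as the paper advocates, might look like the following minimal sketch: LRU eviction plus a preload hook an app could invoke opportunistically (e.g. on Wi-Fi) to avoid cellular data cost later:

```python
from collections import OrderedDict

class AppCache:
    """Minimal application-controlled cache: LRU eviction plus an explicit
    preload hook. Illustrative only; not an OS-level API from the paper."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()

    def get(self, key, fetch):
        if key in self.store:
            self.store.move_to_end(key)   # refresh recency on a hit
            return self.store[key]
        value = fetch(key)                # miss: go to the network
        self.put(key, value)
        return value

    def put(self, key, value):
        self.store[key] = value
        self.store.move_to_end(key)
        while len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict least recently used

    def preload(self, keys, fetch):
        # Called by the app when data is cheap (e.g. on Wi-Fi).
        for key in keys:
            if key not in self.store:
                self.put(key, fetch(key))

fetches = []
def fetch(url):
    fetches.append(url)           # count network trips
    return "body-of-" + url

cache = AppCache(capacity=2)
cache.preload(["a", "b"], fetch)  # two network fetches, done cheaply
cache.get("a", fetch)             # later access: served from cache
print(len(fetches))
```

The paper's proposal is for the mobile OS to expose hooks like `preload` so each application can drive these decisions with its own knowledge of future accesses.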
We describe and present a prototype of a distributed computational infrastructure and associated high-level programming language that allow multiple parties to leverage their own computational resources capable of supporting MapReduce [1] operations in combination with multi-party computation (MPC). Our architecture allows a programmer to author and compile a protocol using a uniform collection of standard constructs, even when that protocol involves computations that take place locally within each participant's MapReduce cluster as well as across all the participants using an MPC protocol. The high-level programming language provided to the user is accompanied by static analysis algorithms that allow the programmer to reason about the efficiency of the protocol before compiling and running it. We present two example applications demonstrating how such an infrastructure can be employed.
Nikolaj Volgushev, A. Lapets, Azer Bestavros. "Programming Support for an Integrated Multi-Party Computation and MapReduce Infrastructure." In 2015 Third IEEE Workshop on Hot Topics in Web Systems and Technologies (HotWeb), November 12, 2015. doi:10.1109/HOTWEB.2015.21
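The combination the prototype supports -- local MapReduce-style aggregation inside each participant's cluster, followed by a cross-party computation under MPC -- can be illustrated with additive secret sharing. This plain-Python sketch is not the paper's language or compiler, just the underlying pattern:

```python
import random

P = 2**61 - 1  # prime modulus for additive shares

def share(value, n):
    """Split value into n additive shares mod P; any n-1 shares together
    reveal nothing about value."""
    shares = [random.randrange(P) for _ in range(n - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

def local_mapreduce(records):
    # Stands in for a job on one participant's own cluster: an ordinary,
    # non-secret local reduce (here, a sum).
    return sum(records)

# Three parties each aggregate locally, then combine across parties under MPC.
parties = [[3, 4], [10], [1, 1, 1]]
local_totals = [local_mapreduce(r) for r in parties]
all_shares = [share(v, len(parties)) for v in local_totals]
# Each party sums the shares it holds (one column), then the partial sums
# are combined; only the final total is ever revealed.
partials = [sum(col) % P for col in zip(*all_shares)]
total = sum(partials) % P
print(total)
```

The paper's contribution is letting a programmer write both phases (the local reduce and the cross-party sum) in one language, with static analysis flagging which parts are expensive MPC and which run cheaply inside a cluster.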