"P2P shared-caching model: using P2P to improve client-server application performance." Luong Quy Tho, Ha Quoc Trung. DOI: 10.1145/2542050.2542090

The client-server application model has several drawbacks, such as server bottlenecks and weak scalability. The peer-to-peer (P2P) model resolves these problems by distributing tasks across the nodes participating in the system, but P2P application development and protocol design are much more difficult than in the client-server model. This paper proposes an approach that takes advantage of both models: the scalability of the P2P model and the simplicity of the client-server model. It presents a hybrid P2P and client-server model that achieves both goals through a caching mechanism, which allows cached content to be used not only by a single client but by all clients in the system. The proposed model has been applied to implement a Web application.
"The dawn of quantum communication." P. Verma. DOI: 10.1145/2542050.2542053

Dramatic paradigm shifts over the past few centuries have led to a rich landscape of options in human and machine communication. Communication today is deeply intertwined with our personal and social lives, in addition to being a vital part of business and government operations, both overt and covert. This talk addresses the evolving role that the fundamental laws of quantum physics are likely to play in giving communication yet another dimension in its richness. Quantum communication will not only support communication in the form we know it, but will also ensure that it is unconditionally secure as it transits any medium. Given that information is the currency of a modern society, its security is paramount for the well-being of an individual, a society, a nation, or the globe as a whole. The talk discusses the short history of quantum communication and draws upon the theoretical and experimental work that the author and his colleagues have conducted over the past few years in order to chart the likely course of future events in the emerging age of secure communication.
"Applying time series analysis and neighbourhood voting in a decentralised approach for fault detection and classification in WSNs." T. Nguyen, Doina Bucur, Marco Aiello, K. Tei. DOI: 10.1145/2542050.2542080

In pervasive computing environments, wireless sensor networks play an important infrastructure role, collecting reliable and accurate context information so that applications can provide services to users on demand. In such environments, sensors should be self-adaptive, taking correct decisions based on sensed data in real time and in a decentralised manner; however, sensed data is often faulty. We therefore design a decentralised scheme for fault detection and classification in sensor data in which each sensor node performs localised fault detection. A combination of neighbourhood voting and time series analysis techniques is used to detect faults, and we study the comparative accuracy of both the union and the intersection of the two techniques. Detected faults are then classified into known fault categories. An initial evaluation on SensorScope, an outdoor temperature dataset, confirms that our solution is able to detect faulty readings and classify them into four fault types, namely 1) random, 2) malfunction, 3) bias, and 4) drift, with accuracy of up to 95%. The results also show that, on the experimental dataset, the time series analysis technique performs comparably well in most cases, while in other cases support from the neighbourhood voting technique and histogram analysis helps our hybrid solution successfully detect faults of all types.
"Document clustering using Dirichlet process mixture model of von Mises-Fisher distributions." N. K. Anh, Tam The Nguyen, Ngo Van Linh. DOI: 10.1145/2542050.2542079

Document clustering has become an increasingly important technique for unsupervised document organization, automatic topic extraction, and fast information retrieval or filtering. This paper proposes a Dirichlet process mixture (DPM) model approach to clustering directional data based on the von Mises-Fisher (vMF) distribution, which arises naturally for data distributed on the unit hypersphere. We have developed a mean-field variational inference algorithm for the DPM model of vMFs and applied it to clustering text documents. Using this model, the number of clusters is determined automatically by the clustering process rather than estimated in advance. We conducted extensive experiments evaluating the proposed approach on a large number of high-dimensional text datasets. Empirical results on the NMI (Normalized Mutual Information) and Purity evaluation measures demonstrate that our approach outperforms four state-of-the-art clustering algorithms.
"VNLP: an open source framework for Vietnamese natural language processing." N. Le, Bich Ngoc Do, Vien Nguyen, Thi Dam Nguyen. DOI: 10.1145/2542050.2542062

Natural language processing (NLP) for Vietnamese has been researched for more than a decade, but the field still lacks an open-source NLP pipeline. As a result, researchers have to spend a lot of time on various fundamental tasks before working on the task of interest. This situation also holds back text-processing technology in Vietnam, because an application costs much more money and time to reach a deliverable state. This work is an attempt to solve the issue. By incorporating available open-source software packages and implementing new ones, we have created an open-source, production-ready solution for Vietnamese text processing. Through three experiments, we demonstrate its effectiveness and efficiency. The software has helped us develop our solution for Vietnamese sentiment analysis and online reputation management, and we hope that it will also facilitate research in Vietnamese NLP.
"Optimized data management for e-learning in the clouds towards Cloodle." M. Adriani, Y. W. Choong, Ba-Hung Ngo, Laurent d'Orazio, D. Laurent, N. Spyratos. DOI: 10.1145/2542050.2542089

Cloud computing provides access to "infinite" storage and computing resources, offering promising perspectives for many applications, particularly e-learning. However, this new paradigm requires rethinking database management principles to allow deployment on scalable, easy-to-access infrastructures under a pay-as-you-go model in which failures are not exceptions but the norm. The GOD project aims to provide an optimized data management system for e-learning in the cloud by rethinking traditional database management techniques and extending them to the specificities of this paradigm.
"On approaching 2D-FPCA technique to improve image representation in frequency domain." T. Le, Hung Phuoc Truong, H. T. Do, Duc Minh Vo. DOI: 10.1145/2542050.2542061

A novel approach based on extracting structure information in the frequency domain is proposed for the image representation problem. A new subspace method based on two-dimensional Fractional Principal Component Analysis (2D-FPCA) in the frequency domain is applied to images to extract texture information. To extract structure information, the system uses the bilateral form of the 2D-FPCA technique, called B2D-FPCA. To this end, (1) we first introduce the theory of 2D-FPCA based on the definitions of fractional variance and the fractional covariance matrix; (2) we then present its bilateral improvement, Bilateral 2D-FPCA; and (3) we describe the 2D-DCT as a robust preprocessing step. The approach is applied to the facial expression representation problem to demonstrate the stability and robustness of the proposed framework. For demonstration, facial expression datasets (JAFFE, the Pain expression subset, and Cohn-Kanade) are used to compare the proposed framework with other approaches.
"Combining maturity with agility: lessons learnt from a case study." N. Tuan, H. Thang. DOI: 10.1145/2542050.2542072

Although high maturity and agility emerged as different ways to address and overcome issues in software development (including maximizing resources and minimizing risks), there has been mixed understanding about the possibility of their co-existence within an organization. Outside the dogmatic debate over their co-existence, however, voices have recently been raised recognizing that both approaches have their merits. This paper presents the results of a case study on the practices that a purely agile organization has put in place in order to profit from the opportunities that higher maturity can offer with respect to value creation for clients. Our conclusion is that both high maturity and agility contribute to customer satisfaction, high quality, and waste reduction, and that complying with standards does not necessarily restrict 'being agile'. The implication for practice is that companies and their clients can benefit from a development approach that embraces both maturity and agility. To achieve this goal, guidelines are needed that direct organizations towards adopting practices linked to higher maturity as well as to agility.
"State-space modeling based on principal component analysis and oxygenated-deoxygenated correlation to improve near-infrared spectroscopy signals." N. Thang, Nguyen Huynh Minh Tam, Tran Le Giang, Vo Nhut Tuan, Lan Anh Trinh, Hoang-Hai Tran, V. Toi. DOI: 10.1145/2542050.2542094

Near-infrared spectroscopy (NIRS) is becoming an effective technique for noninvasive functional brain imaging. Methods that improve the quality of measured NIRS signals therefore play an important role in making NIRS broadly accepted in practical applications. Previous approaches have used state-space modeling to recover NIRS signals from basic component signals and so eliminate the artifacts present in NIRS measurements. However, they require an onset vector marking the starting position of the stimulus, which is not always available in practice. In this work, we provide a new way to find the basic components for efficient implementation of the state-space model. We apply principal component analysis to estimate an eigenvector basis that compactly represents the whole signal, and we use the oxygenated-deoxygenated correlation to find another set of basic components that further enhances signal quality. State-space modeling based on a Kalman filter is used to reconstruct the NIRS signals from these basic components. We tested the proposed algorithm on real data and found significant improvements in the contrast-to-noise ratio (CNR) of the NIRS signals after filtering with our approach.
"Efficient query evaluation on distributed graphs with Hadoop environment." Le-Duc Tung, Quyet Nguyen-Van, Zhenjiang Hu. DOI: 10.1145/2542050.2542086

Graphs have emerged as a powerful data structure for describing various kinds of data. Query evaluation on distributed graphs is costly because of the complexity of links among sites. Dan Suciu proposed algorithms for query evaluation on semistructured data (a rooted, edge-labeled graph) that are provably efficient in terms of communication steps and the data transferred during evaluation. One disadvantage, however, is that communication data are collected at a single site, which creates a bottleneck when evaluating real-life data. In this paper, we propose two algorithms that improve on Dan Suciu's: a one-pass algorithm that significantly reduces the amount of redundant data in the evaluation, and an iter_acc algorithm that resolves the bottleneck. We then design an efficient implementation of our algorithms as a single MapReduce job in a Hadoop environment, exploiting features of the Hadoop file system. Experiments on a cloud system show that the one-pass algorithm detects and removes 50% of the redundant data in the evaluation process on the YouTube and DBLP datasets, and that the iter_acc algorithm runs without the bottleneck even when we double the size of the input data.