Detecting Clones, Copying and Reuse on the Web

2012 IEEE 28th International Conference on Data Engineering
Pub Date: 2012-04-01
DOI: 10.1109/ICDE.2012.146

X. Dong, D. Srivastava

Citations: 3

Abstract

In recent years, the Web has made a vast amount of useful information available. However, the same web technologies that let sources share their information have also made it easy for sources to copy from each other, often publishing without proper attribution. Understanding the copying relationships between sources has many benefits, including helping data providers protect their rights, improving various aspects of data integration, and facilitating in-depth analysis of information flow. The importance of copy detection has led to a substantial body of research in many disciplines of Computer Science, divided by the type of information considered: text, images, videos, software code, and structured data. This seminar explores the similarities and differences between the copy detection techniques proposed for these different types of information. We also examine the computational challenges of large-scale copy detection, indicating how copies could be detected efficiently, and identify a range of open problems for the community.

