Web-KR '13: Latest Papers


Windowing mechanisms for web scale stream reasoning

Web-KR '13

Pub Date : 2013-11-01 DOI: 10.1145/2512405.2512409

Snehasish Banerjee, D. Mukherjee

Web-scale stream reasoning is based on continuous queries and reasoning over a snapshot of dynamic knowledge combined with background knowledge. Existing stream reasoners usually use either time-based or count-based windows, following data stream principles; however, these do not fit all stream reasoning scenarios. In this paper, different types of windowing mechanisms are described, along with example scenarios in which each is most suitable for reasoning on a stream of facts. A new windowing technique, the Adaptive Window, is also proposed. Lastly, some important questions related to windowing techniques for web-scale stream reasoning are posed.
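
As a rough illustration of the two window types named in the abstract above, the following Python sketch contrasts a count-based and a time-based sliding window over a stream of timestamped facts. The `(timestamp, fact)` tuple representation, the function names, and the assumption that facts arrive in timestamp order are all hypothetical choices made for this example; the paper's Adaptive Window is not sketched because the abstract does not describe how it works.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List, Tuple

# Hypothetical representation: (timestamp in seconds, fact label).
Fact = Tuple[float, str]


def count_window(stream: Iterable[Fact], size: int) -> Iterator[List[Fact]]:
    """After each arrival, yield a snapshot of the most recent `size` facts."""
    window: Deque[Fact] = deque(maxlen=size)  # oldest fact is dropped automatically
    for fact in stream:
        window.append(fact)
        yield list(window)


def time_window(stream: Iterable[Fact], width: float) -> Iterator[List[Fact]]:
    """After each arrival, yield the facts newer than (newest timestamp - width).

    Assumes facts arrive in non-decreasing timestamp order.
    """
    window: Deque[Fact] = deque()
    for fact in stream:
        window.append(fact)
        cutoff = fact[0] - width
        # Evict facts that have fallen out of the time range.
        while window and window[0][0] <= cutoff:
            window.popleft()
        yield list(window)


stream = [(0.0, "a"), (1.0, "b"), (5.0, "c"), (5.5, "d")]
count_snapshots = list(count_window(stream, 2))
time_snapshots = list(time_window(stream, 2.0))
```

With a bursty stream like this one, the count-based window always holds two facts regardless of how stale they are, while the time-based window shrinks to a single fact after the gap between `b` and `c`. This is one way to see why neither type fits every scenario.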


Citations: 6

Leveraging related entities for knowledge base acceleration

Web-KR '13

Pub Date : 2013-11-01 DOI: 10.1145/2512405.2512407

Xitong Liu, Hui Fang

Knowledge bases such as Wikipedia have been shown to be effective at improving performance in many information tasks. Clearly, this effectiveness depends on the quality of the knowledge bases. A high-quality knowledge base should have up-to-date, complete information. However, constructing a high-quality knowledge base is not an easy task because it requires significant manual effort to collect relevant documents, extract valuable information, and update the knowledge base accordingly. In this paper, we aim to automate this labor-intensive process. Specifically, we focus on how to automatically collect documents relevant to an entity from the sheer volume of Web data. To solve the problem, we propose to construct the profile of the entity by leveraging a set of its related entities, and then discuss how to use training data to weight the related entities. Experiments over the TREC 2012 KBA collection show that the proposed method can outperform state-of-the-art methods.


Citations: 12

Automated faceted reporting for web analytics

Web-KR '13

Pub Date : 2013-11-01 DOI: 10.1145/2512405.2512406

Deepak Pai, Balaraman Ravindran, S. Rajagopalan, Ramesh Srinivasaraghavan

Traditionally, web analytics has focused on the analysis and reporting of business metrics of interest to marketers, such as page views and revenue, broken down by various dimensions of session characteristics that can be obtained from user requests. We introduce the notion of faceted reporting in the context of web analytics, where aggregated business metrics are reported grouped by a facet, a dimension along which a document could be represented. For example, in the case of e-Commerce sites, facets are typically product attributes such as price, color, and manufacturer. For a typical website one could think of thousands of facets, but not all of them are equally important for the marketer in all reporting scenarios. In this work, we propose a business-metric-driven scheme for automatic selection of facets for various reporting scenarios. Facet selection is done by optimizing an objective function involving business metrics, and we present evaluation results based on multiple objective functions. We observe that marketers' intuitive selection of useful facets is inaccurate. On the other hand, the automated methods proposed in this paper can highlight insights from the data.
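
To make the idea of metric-driven facet selection concrete, here is a minimal Python sketch. It aggregates a business metric per facet value and then picks the facet that maximizes an objective. The sample rows, the function names, and the choice of objective (population variance of the per-value totals) are assumptions made purely for illustration; the abstract does not state which objective functions the authors actually use.

```python
from collections import defaultdict
from statistics import pvariance
from typing import Dict, List, Sequence

# Hypothetical session records: each row carries facet values and one metric.
rows: List[dict] = [
    {"color": "red", "manufacturer": "A", "revenue": 10.0},
    {"color": "red", "manufacturer": "B", "revenue": 30.0},
    {"color": "blue", "manufacturer": "A", "revenue": 10.0},
    {"color": "blue", "manufacturer": "B", "revenue": 30.0},
]


def report_by_facet(data: Sequence[dict], facet: str, metric: str) -> Dict[str, float]:
    """Aggregate `metric` grouped by the values of `facet` (a faceted report)."""
    totals: Dict[str, float] = defaultdict(float)
    for row in data:
        totals[row[facet]] += row[metric]
    return dict(totals)


def select_facet(data: Sequence[dict], facets: Sequence[str], metric: str) -> str:
    """Pick the facet whose report maximizes an illustrative objective.

    The objective here (variance of per-value totals) is an assumption:
    a facet that separates the metric more strongly scores higher.
    """
    def objective(facet: str) -> float:
        return pvariance(list(report_by_facet(data, facet, metric).values()))

    return max(facets, key=objective)


color_report = report_by_facet(rows, "color", "revenue")
best_facet = select_facet(rows, ["color", "manufacturer"], "revenue")
```

In this toy data, revenue is identical across colors but differs across manufacturers, so the variance objective favors `manufacturer`. Swapping in a different objective function changes which facet is selected, which is consistent with the abstract's evaluation over multiple objective functions.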



Citations: 7

The deep web: woven to catch the middle ground

Web-KR '13

Pub Date : 2013-11-01 DOI: 10.1145/2512405.2512408

Wensheng Wu

The massive and diverse data sources on the Deep Web present a serious data integration challenge. Existing virtual integration approaches suffer from slow query response, while surfacing approaches demand hefty storage space and incur huge costs in maintaining data freshness. We propose a novel hybrid integration approach that strikes a balance between the virtual and surfacing approaches. The key idea is to capture user needs in query templates and focus the integration efforts on the templates. However, realizing this approach requires innovations in template-driven query planning, query parsing, and template discovery. We elaborate on these challenges and propose our solutions.


Citations: 4
