Effectively and efficiently detect web page duplication
Pub Date: 2009-12-18, DOI: 10.1109/ICDIM.2009.5356801
Zhongming Han, Qian Mo, Hongzhi Liu, Jianzhi Sun
There are many redundant web pages on the Internet. In this paper, we present a novel multilayer framework for detecting duplicated web pages based on tag statistics and text similarity comparison. We propose two algorithms for detecting similar text paragraphs and implement our framework. The experimental results show that our approach achieves high performance, which means that duplicated web pages can be detected efficiently by tag statistics and text comparison alone.
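The paper's two detection algorithms are not reproduced here. As a rough illustration of the general idea described in the abstract (a cheap tag-statistics filter followed by paragraph-level text similarity), here is a minimal sketch; the tag-counting rule, shingle size, and thresholds are illustrative assumptions, not the authors' parameters.

```python
from collections import Counter
from html.parser import HTMLParser

class TagCounter(HTMLParser):
    """Collect a frequency count of HTML tags as a cheap structural signature."""
    def __init__(self):
        super().__init__()
        self.tags = Counter()
    def handle_starttag(self, tag, attrs):
        self.tags[tag] += 1

def tag_signature(html: str) -> Counter:
    parser = TagCounter()
    parser.feed(html)
    return parser.tags

def cosine(a: Counter, b: Counter) -> float:
    keys = set(a) | set(b)
    dot = sum(a[k] * b[k] for k in keys)
    norm = (sum(v * v for v in a.values()) ** 0.5) * (sum(v * v for v in b.values()) ** 0.5)
    return dot / norm if norm else 0.0

def paragraph_similarity(p1: str, p2: str, k: int = 5) -> float:
    """Jaccard similarity over k-word shingles of two text paragraphs."""
    def shingles(text):
        words = text.lower().split()
        return {tuple(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}
    s1, s2 = shingles(p1), shingles(p2)
    return len(s1 & s2) / len(s1 | s2) if (s1 | s2) else 0.0

def looks_duplicated(html1, html2, paras1, paras2,
                     tag_threshold=0.9, text_threshold=0.8):
    """Two-layer check: tag-statistics filter first, then paragraph comparison."""
    if cosine(tag_signature(html1), tag_signature(html2)) < tag_threshold:
        return False
    matches = sum(1 for p in paras1
                  if any(paragraph_similarity(p, q) >= text_threshold for q in paras2))
    return matches >= 0.8 * max(1, len(paras1))
```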
{"title":"Effectively and efficiently detect web page duplication","authors":"Zhongming Han, Qian Mo, Hongzhi Liu, Jianzhi Sun","doi":"10.1109/ICDIM.2009.5356801","DOIUrl":"https://doi.org/10.1109/ICDIM.2009.5356801","url":null,"abstract":"There are a lot of redundant web pages on Internet. Based on tag statistic and text similarity comparison, we present a novel multilayer framework for detecting duplicated web pages in this paper. We propose two similarity text paragraphs detection algorithms and implement our framework. The experimental results show that our approach achieves high performance, which means that duplicated web pages can be efficiently detected simply by tag statistic and text comparison.","PeriodicalId":300287,"journal":{"name":"2009 Fourth International Conference on Digital Information Management","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114808208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Translating Persian documents into English using knowledge based WSD
Pub Date: 2009-12-18, DOI: 10.1109/ICDIM.2009.5356770
Chakaveh Saedi, M. Shamsfard
The need for machine translation systems is growing rapidly due to the increase in documents and translation requests. Reading web pages, news, articles, manuals, and user guides, and getting the gist of a text are some examples of our daily need for translation. This paper introduces a Persian-to-English machine translation system named PEnT2. It uses a transfer-based approach and employs the grammatical roles of the sentence words as the main clue for performing the translation process. PEnT2 translates simple Persian sentences into English using a hybrid approach. It exploits the advantages of rule-based, knowledge-based, and corpus-based methods in different components of a machine translation system, including word sense disambiguation, structural transfer, and structure optimization. Experiments show improved results in comparison with other available systems.
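The abstract names knowledge-based word sense disambiguation as one component, but the paper's actual WSD procedure is not detailed above. Purely as a hedged illustration of what a knowledge-based, gloss-overlap (Lesk-style) disambiguator looks like, the sketch below picks the sense whose dictionary gloss shares the most words with the sentence context; the toy sense inventory is invented for the example and is not PEnT2's lexicon.

```python
def lesk_style_wsd(word: str, context: str, sense_inventory: dict) -> str:
    """Pick the sense whose gloss overlaps most with the context words.

    sense_inventory maps words to {sense_id: gloss}; this toy inventory
    stands in for a real lexical knowledge base.
    """
    context_words = set(context.lower().split())
    best_sense, best_overlap = None, -1
    for sense, gloss in sense_inventory.get(word, {}).items():
        overlap = len(context_words & set(gloss.lower().split()))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

# Hypothetical two-sense entry for the English word "bank".
inventory = {
    "bank": {
        "bank_financial": "an institution that accepts deposits and lends money",
        "bank_river": "the sloping land beside a body of water such as a river",
    }
}
print(lesk_style_wsd("bank", "she deposited money at the bank", inventory))
# -> bank_financial
```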
{"title":"Translating Persian documents into English using knowledge based WSD","authors":"Chakaveh Saedi, M. Shamsfard","doi":"10.1109/ICDIM.2009.5356770","DOIUrl":"https://doi.org/10.1109/ICDIM.2009.5356770","url":null,"abstract":"The necessity of machine translation systems is growing rapidly due to the increase of documents and translation requests. Reading web pages, news, articles, manuals and users' guides and getting the gist of a test are some of the examples for our daily need to translation. This paper introduces a Persian to English machine translation system, named PEnT2. It uses a transfer based approach and employs the grammatical role of the sentence words as the main clue to perform the translation processes. PEnT2 translates simple Persian sentences into English using a hybrid approach. It exploits the advantages of rule based, knowledge based and corpus based methods in different components of a machine translation system including word sense disambiguation, structural transfer and structure optimization. Experiments show improved results in comparison with other available systems.","PeriodicalId":300287,"journal":{"name":"2009 Fourth International Conference on Digital Information Management","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122587552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Document cluster detection on latent projections
Pub Date: 2009-12-18, DOI: 10.1109/ICDIM.2009.5356765
Dora Alvarez-Medina, H. Hidalgo-Silva
Probabilistic text data modeling is usually carried out with Bernoulli or multinomial event models. A main problem in text mining is the large number of zero counts in the matrix representation. Recently, a document visualization technique incorporating the Zero-Inflated Poisson model into the Generative Topographic Mapping algorithm has been proposed. This probabilistic model can be applied as a text document visualization tool. In this work, an algorithm for automatically extracting clusters from the visualization results is presented. The combination of visualization and cluster extraction algorithms allows document collections to be explored and evaluated. Several results are presented for the 20-Newsgroups and Reuters data sets.
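For reference, the Zero-Inflated Poisson model mentioned in the abstract mixes a point mass at zero with an ordinary Poisson component; a standard formulation (not taken from this paper) is:

```latex
P(X = 0) = \pi + (1 - \pi)\, e^{-\lambda}, \qquad
P(X = k) = (1 - \pi)\, \frac{\lambda^{k} e^{-\lambda}}{k!}, \quad k = 1, 2, \dots
```

where \(\pi\) is the probability of a structural zero and \(\lambda\) is the Poisson rate; the extra mass at zero is what makes this family a better fit for sparse term-count matrices than a plain Poisson or multinomial model.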
{"title":"Document cluster detection on latent projections","authors":"Dora Alvarez-Medina, H. Hidalgo-Silva","doi":"10.1109/ICDIM.2009.5356765","DOIUrl":"https://doi.org/10.1109/ICDIM.2009.5356765","url":null,"abstract":"Probabilistic text data modeling is usually considered with Bernoulli or multinomial event models. The main problem of text mining is the large amount of zero account in the matrix representation. Recently a document visualization technique incorporating the Zero Inflated Poisson model in the Generative Topographic Mapping algorithm has been proposed. This probabilistic model can be applied as a text document visualization tool. In this work, an algorithm for automatically extracting the clusters in the visualization results is presented. The combination of visualization-cluster extraction algorithms allows to obtain and evaluate document collections. Several results are presented for 20-Newsgroups and Reuters data.","PeriodicalId":300287,"journal":{"name":"2009 Fourth International Conference on Digital Information Management","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121173484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Feature based similarity search in simplified surface 3D model using interpolation method
Pub Date: 2009-12-18, DOI: 10.1109/ICDIM.2009.5356790
A. Kim, O. Gwun, Juwhan Song
This paper proposes a feature descriptor for 3D model similarity search that uses the distribution of normal directions on a simplified surface. A feature descriptor for a 3D model should be invariant to translation, rotation, and scale of the model. Therefore, this paper normalizes all models using PCA and applies surface mesh simplification as a preprocessing step to make the descriptor robust against noise. Normals are sampled in proportion to each polygon's area and then computed by a weighted average over angles and interpolated. We implemented a 3D model retrieval system and performed similarity search tests on the shape benchmark data provided by Princeton University. Experimental results show improvements of 24.7% to 32.2% over conventional methods in terms of ANMRR.
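The descriptor itself, with its angle-weighted averaging and interpolation, is specific to the paper and not reproduced here. As a loose illustration of the underlying idea (a histogram of face normal directions weighted by face area), here is a minimal numpy sketch; the bin layout is an assumption, and the PCA normalization and mesh simplification steps are omitted for brevity.

```python
import numpy as np

def normal_direction_histogram(vertices: np.ndarray, faces: np.ndarray,
                               n_theta: int = 8, n_phi: int = 16) -> np.ndarray:
    """Area-weighted histogram of triangle normal directions.

    vertices: (V, 3) float array; faces: (F, 3) int array of vertex indices.
    Returns a flattened, L1-normalized histogram of size n_theta * n_phi.
    """
    v0, v1, v2 = (vertices[faces[:, i]] for i in range(3))
    cross = np.cross(v1 - v0, v2 - v0)                      # face normal scaled by 2 * area
    areas = 0.5 * np.linalg.norm(cross, axis=1)
    normals = cross / np.maximum(np.linalg.norm(cross, axis=1, keepdims=True), 1e-12)

    # Spherical coordinates of each unit normal.
    theta = np.arccos(np.clip(normals[:, 2], -1.0, 1.0))     # polar angle in [0, pi]
    phi = np.mod(np.arctan2(normals[:, 1], normals[:, 0]), 2 * np.pi)

    t_bin = np.minimum((theta / np.pi * n_theta).astype(int), n_theta - 1)
    p_bin = np.minimum((phi / (2 * np.pi) * n_phi).astype(int), n_phi - 1)

    hist = np.zeros((n_theta, n_phi))
    np.add.at(hist, (t_bin, p_bin), areas)                   # weight each bin by face area
    hist = hist.ravel()
    return hist / hist.sum() if hist.sum() > 0 else hist

def descriptor_distance(h1: np.ndarray, h2: np.ndarray) -> float:
    """L1 distance between two normal-direction histograms."""
    return float(np.abs(h1 - h2).sum())
```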
{"title":"Feature based similarity search in simplified surface 3D model using interpolation method","authors":"A. Kim, O. Gwun, Juwhan Song","doi":"10.1109/ICDIM.2009.5356790","DOIUrl":"https://doi.org/10.1109/ICDIM.2009.5356790","url":null,"abstract":"This paper proposes the feature descriptor for 3D model similarity search using the distribution of normal directions on the simplified surface. Feature descriptor of 3D model should be invariant to translation, rotation and scale for its model. So this paper normalizes all the model using PCA and preprocesses surface mesh simplification to robust against noise. The normal is sampled in proportion to each polygon's area and then it is calculated by weight average method via angles and interpolated. We implemented the 3D model retrieval system and performed the similarity search test with the shape bench mark data provided by the Princeton University. Experimental results show the performance improvement of proposed algorithm from 24.7% to 32.2% in comparison with conventional methods by ANMRR.","PeriodicalId":300287,"journal":{"name":"2009 Fourth International Conference on Digital Information Management","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126362232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improving manufacturing efficiency at Ford using product centred knowledge management
Pub Date: 2009-12-18, DOI: 10.1109/ICDIM.2009.5356779
M. Raza, T. Kirkham, R. Harrison, Quentin Hugues Reul
Western manufacturing is under pressure to produce high-quality customized products at low cost, particularly in large-scale manufacturing such as the car industry. The development of such product customization requires the adoption of innovative agile manufacturing techniques. To date, this innovation has focused on improved process development between the different stages of manufacturing Product Lifecycle Management (PLM). However, in terms of implementation, data management techniques have lagged behind, often leaving these processes disjointed and lacking in automation. This paper proposes an improved model based on innovation in manufacturing PLM. Building on existing work on the use of ontologies for knowledge management, the paper applies these techniques to PLM. The implementation has been applied to a case study around a Ford production line. The prototype presents an innovative approach to PLM and was tested using a state-of-the-art Web Service infrastructure implemented on a Ford Powertrain test rig.
{"title":"Improving manufacturing efficiency at ford using product centred knowledge management","authors":"M. Raza, T. Kirkham, R. Harrison, Quentin Hugues Reul","doi":"10.1109/ICDIM.2009.5356779","DOIUrl":"https://doi.org/10.1109/ICDIM.2009.5356779","url":null,"abstract":"Western manufacturing is under pressure to produce high quality customized products particularly in large manufacturing such as the car industry at low costs. Here the development of such product customization requires the adoption of innovative agile manufacturing techniques. To date this innovation has focused on the improved process development between the different stages of manufacturing Product Lifecycle Management (PLM). However in terms of implementing the application, data management techniques have lagged behind often leaving these processes disjointed and lacking in automation. This paper proposes an improved model based on innovation in the manufacturing PLM. Building on existing work in the use of ontologies for knowledge management, the paper applies these techniques to PLM. The implementation has been applied to develop a case study around a Ford production line. The prototype presents an innovative approach to PLM and tested using a state of the art Web Service infrastructure implemented on a Ford Powertrain test rig.","PeriodicalId":300287,"journal":{"name":"2009 Fourth International Conference on Digital Information Management","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127859964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Discovering political tendency in bulletin board discussions by social community analysis
Pub Date: 2009-12-18, DOI: 10.1109/ICDIM.2009.5356800
Kang-Che Lee, M. Shan
The Bulletin Board System (BBS) is very popular and provides an asynchronous, text-based environment for users to exchange information and ideas. A BBS consists of a number of discussion boards, each of which focuses on a particular subject. A discussion on a topic consists of a seed article followed by articles responding to the seed article or to other responses. This paper investigates social community analysis techniques to discover the political tendency of users within the boards from their discussions. We first extract the social interactions between users, such as "reply" and "advocate" relations between posts. A social network among users is constructed based on the extracted social interactions. After building the social network, we employ graph partitioning, graph coloring, and graph clustering algorithms, respectively, to discover the social communities. Users in the same community are more likely to agree with each other's political opinions. Using this approach, we are able to partition users into two opposing groups and identify their political tendency effectively without linguistic analysis of the discussion content.
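The pipeline summarized above (graph partitioning, coloring, and clustering over reply/advocate interactions) is the paper's; as a minimal, hedged sketch of just the partitioning step, the following builds a weighted interaction graph and bisects it with the Kernighan-Lin heuristic from networkx. The edge-weighting convention and the toy interaction list are illustrative assumptions, not the authors' settings.

```python
import networkx as nx
from networkx.algorithms.community import kernighan_lin_bisection

# Hypothetical (user_a, user_b, interaction_type) records extracted from a board.
interactions = [
    ("alice", "bob", "advocate"),
    ("alice", "carol", "reply"),
    ("bob", "dave", "reply"),
    ("carol", "dave", "advocate"),
    ("erin", "alice", "advocate"),
    ("erin", "dave", "reply"),
]

# Build a weighted interaction graph; here "advocate" ties count more than
# plain replies (an illustrative convention, not the paper's).
G = nx.Graph()
for a, b, kind in interactions:
    w = 2.0 if kind == "advocate" else 1.0
    if G.has_edge(a, b):
        G[a][b]["weight"] += w
    else:
        G.add_edge(a, b, weight=w)

# Bisect the network into two candidate communities of users.
group1, group2 = kernighan_lin_bisection(G, weight="weight")
print(sorted(group1), sorted(group2))
```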
{"title":"Discovering political tendency in bulletin board discussions by social community analysis","authors":"Kang-Che Lee, M. Shan","doi":"10.1109/ICDIM.2009.5356800","DOIUrl":"https://doi.org/10.1109/ICDIM.2009.5356800","url":null,"abstract":"Bulletin Board System (BBS) is very popular and provide an asynchronous, text-based environment for users to exchange information and idea. A BBS consists of a number of discussion boards, each of which focuses on a particular subject. A discussion on a topic consists of a seed articles followed by some articles responsive to the seed article or other responsive articles. This paper investigates the social community analysis technique to discover the political tendency of users within the boards from discussions. We first extract the social interactions between users, such as \"reply\" and \"advocate\" of posts between users. A social network among users is constructed based on the extracted social interaction. After building the social network, we employ the graph partition, graph coloring, and graph clustering algorithms respectively to discover the social communities. Users of the same community have more potential of political opinion agreement with each other. By using this approach, we are able to partition users into two opposite groups and identify their political tendency effectively without linguistic analysis of discussion content.","PeriodicalId":300287,"journal":{"name":"2009 Fourth International Conference on Digital Information Management","volume":"98 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128003259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Confounded factor effects on battery life in wireless sensor networks
Pub Date: 2009-12-18, DOI: 10.1109/ICDIM.2009.5356794
N. Xu, K. Subbu, Shijun Tang
Widely utilized in numerous applications, wireless sensor networks have become a boon to the academic and industrial communities. However, stringent energy constraints hinder the continuous operation of the sensing devices. Clustering a network leads to fewer nodes participating in active transmissions, and data aggregation reduces redundant packet sending. Given that transmission is the primary energy-consuming activity of a sensor mote, this paper exploits the joint advantages of clustering and data aggregation in decreasing communication cost. A data-centric analysis was performed to justify the use of data aggregation. Experimental results on a realistic platform with MICAz motes running the TinyOS embedded operating system showed 22% energy savings and a 13% overhead reduction, confirming the attractive advantages of data aggregation and clustering, such as more efficient transmissions.
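As a back-of-the-envelope illustration of why clustering plus aggregation cuts communication cost (not a reproduction of the paper's MICAz experiment), the sketch below compares transmit cost for a flat network, where every node reports directly to the sink, against a clustered one, where each cluster head averages its members' readings into a single forwarded packet. The node count, cluster size, and per-packet cost constants are arbitrary assumptions.

```python
def flat_cost(n_nodes: int, long_cost: float = 1.0) -> float:
    """Every node sends its own reading all the way to the sink."""
    return n_nodes * long_cost

def clustered_cost(n_nodes: int, cluster_size: int,
                   short_cost: float = 0.2, long_cost: float = 1.0) -> float:
    """Members make cheap short-range sends to their cluster head; each head
    aggregates the readings and makes one long-range send to the sink."""
    n_clusters = -(-n_nodes // cluster_size)      # ceiling division
    member_sends = n_nodes - n_clusters           # heads do not send to themselves
    return member_sends * short_cost + n_clusters * long_cost

n = 100
print(flat_cost(n))            # 100.0 units of transmit energy
print(clustered_cost(n, 10))   # 90 * 0.2 + 10 * 1.0 = 28.0 units
```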
{"title":"Confounded factor effects on battery life in wireless sensor networks","authors":"N. Xu, K. Subbu, Shijun Tang","doi":"10.1109/ICDIM.2009.5356794","DOIUrl":"https://doi.org/10.1109/ICDIM.2009.5356794","url":null,"abstract":"Widely utilized for numerous applications, wireless sensor networks have become a boon to the academia and industrial communities. However, the stringent energy constraints place a hurdle to the continuous functioning of the sensing devices. Clustering a network leads to a lesser number of nodes participating in active transmissions. Data aggregation reduces redundant packet sending. Given that transmission is the primary energy consuming activity for a sensor mote, this paper exploits the joint advantages of clustering and data aggregation in decreasing communication cost. A data centric analysis was performed to justify the use of data aggregation. Experimental results on a realistic platform with MICAz motes running TinyOS embedded system showed 22% energy savings and 13% overhead reduction, confirming the attractive advantages of data aggregation and clustering such as more efficient transmissions.","PeriodicalId":300287,"journal":{"name":"2009 Fourth International Conference on Digital Information Management","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124953373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On a visual frequent itemset mining
Pub Date: 2009-12-18, DOI: 10.1109/ICDIM.2009.5356762
S. Lim
Given a large, dense transaction database, generating interesting frequent patterns in a user-friendly manner remains an important issue in data mining. This is because the minimum support threshold, the most popular statistical significance measure, is not capable of reflecting the domain user's interests. This paper presents visual frequent itemset mining (VFIM) as an alternative to traditional Apriori-like frequent itemset mining. VFIM pushes the domain user's cognitive power into the data mining process. To this end, a formal visual data mining model is proposed and a prototype of the model is created. The effectiveness of the proposed model is demonstrated by showing that VFIM generates frequent patterns, by means of user interaction, that are compatible with those generated by traditional Apriori-like algorithms without executing them.
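The abstract compares VFIM's output against Apriori-like algorithms. For readers unfamiliar with that baseline, here is a compact, generic Apriori sketch (level-wise generation of itemsets above a minimum support); it illustrates the comparison baseline, not the paper's VFIM procedure.

```python
def apriori(transactions, min_support):
    """Return all itemsets whose support (fraction of transactions that
    contain them) is at least min_support, via level-wise generation."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {i for t in transactions for i in t}
    current = {frozenset([i]) for i in items if support(frozenset([i])) >= min_support}
    frequent, k = list(current), 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets,
        # then keep only the candidates that meet the support threshold.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        current = {c for c in candidates if support(c) >= min_support}
        frequent.extend(current)
        k += 1
    return [set(s) for s in frequent]

txns = [{"bread", "milk"}, {"bread", "beer", "eggs"},
        {"milk", "beer", "bread"}, {"bread", "milk", "beer"}]
print(apriori(txns, min_support=0.5))
```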
{"title":"On a visual frequent itemset mining","authors":"S. Lim","doi":"10.1109/ICDIM.2009.5356762","DOIUrl":"https://doi.org/10.1109/ICDIM.2009.5356762","url":null,"abstract":"Given a large, dense transaction database, generating interesting frequent patterns in a user friendly manner remains as an important issue in data mining. It is because the minimum support, the most popular statistical significance measurement, is not capable of reflecting the domain user's interest. This paper presents visual frequent itemset mining (VFIM) as an alternative to the traditional apriori-like frequent itemset mining. VFIM pushes the domain user's cognitive power into the data mining process. To this end, a formal visual data mining model is proposed and a prototype of the model is created. The effectiveness of the proposed model is demonstrated by showing that VFIM generates frequent patterns, by means of user interaction, that are compatible with those generated by traditional apriori-like algorithms without executing them.","PeriodicalId":300287,"journal":{"name":"2009 Fourth International Conference on Digital Information Management","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128860168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A user model for personalization services
Pub Date: 2009-12-18, DOI: 10.1109/ICDIM.2009.5356766
YeSun Joung, M. Zarki, R. Jain
A user model is essential for supporting personalization services. However, the user models used in most systems are designed in an ad-hoc manner and are often tied to their application domains. This hinders service interoperability and increases the amount of work. In this paper, we focus on defining a general user model to represent user information and user context. This paper makes two contributions: a survey of existing context-based systems and a user model for personalization. The survey analyses previous context-based systems and identifies the kinds of features used in those systems. Based on this survey, we propose a user model that captures user information and contexts for personalization applications.
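The paper's actual model is not reproduced above; as a hedged sketch of what a general user model separating static user information from dynamic context might look like, consider the following data structure. All field names are illustrative assumptions, not the authors' schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class UserContext:
    """Dynamic, situation-dependent information about the user."""
    location: str = ""
    device: str = ""
    time_of_day: str = ""
    activity: str = ""

@dataclass
class UserModel:
    """Static profile plus interests and the current context."""
    user_id: str
    demographics: Dict[str, str] = field(default_factory=dict)   # e.g. age group, language
    preferences: Dict[str, float] = field(default_factory=dict)  # interest -> weight
    history: List[str] = field(default_factory=list)             # ids of consumed items
    context: UserContext = field(default_factory=UserContext)

    def top_interests(self, k: int = 3) -> List[str]:
        """Interests with the highest weights, a typical personalization query."""
        return sorted(self.preferences, key=self.preferences.get, reverse=True)[:k]

u = UserModel(user_id="u42",
              preferences={"sports": 0.9, "news": 0.4, "music": 0.7},
              context=UserContext(location="home", device="mobile"))
print(u.top_interests(2))   # ['sports', 'music']
```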
{"title":"A user model for personalization services","authors":"YeSun Joung, M. Zarki, R. Jain","doi":"10.1109/ICDIM.2009.5356766","DOIUrl":"https://doi.org/10.1109/ICDIM.2009.5356766","url":null,"abstract":"A user model is essential to support personalization services. However, the user models that are used in most systems are designed in an ad-hoc manner and often related to their application domains. It hinders service interoperability and increases the amount of work. In this paper, we focus on defining a general user model in order to represent user information and the user context. This paper contributes two points: a survey on existing context based systems and a user model for personalization. The survey analyses previous context based systems and provides different kinds of features used in their systems. Based on this survey, we propose a user model to capture user information and contexts for personalization application.","PeriodicalId":300287,"journal":{"name":"2009 Fourth International Conference on Digital Information Management","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133579735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A rule-based conversion of an object-oriented database schema to a schema in XML Schema
Pub Date: 2009-12-18, DOI: 10.1109/ICDIM.2009.5356777
F. F. F. Peres, R. Mello
Data interchange between different computer systems is a common task today, and the use of XML as an exchange format has been increasing. The XML Schema recommendation allows the definition of an XML structure to be used by applications that exchange data with each other. This paper proposes a rule-based approach for converting object-oriented (OO) database schemata to XML Schema schemata, as well as an algorithm that defines the application of the rules. We consider a schema mapping process in the OO→XML direction based on a detailed analysis of the OO database model concepts. A prototype tool was implemented to validate the process. Compared with related work, our proposal considers the mapping of all OODB model concepts to equivalent data structures in XML.
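The paper's rule set is not reproduced here. As a minimal illustration of one obvious rule of this kind (a class with typed attributes becomes an xs:complexType with a sequence of elements), the sketch below emits XML Schema for a toy class description; the type-mapping table and the example class are illustrative assumptions, not the authors' rules.

```python
# Illustrative OO-to-XSD type mapping; a real rule set would be far richer
# (associations, inheritance, collections, object identity, and so on).
TYPE_MAP = {"string": "xs:string", "int": "xs:integer",
            "float": "xs:decimal", "bool": "xs:boolean"}

def class_to_complex_type(class_name: str, attributes: dict) -> str:
    """Map a class name and {attribute: oo_type} dict to an xs:complexType."""
    lines = [f'<xs:complexType name="{class_name}">', "  <xs:sequence>"]
    for attr, oo_type in attributes.items():
        xsd_type = TYPE_MAP.get(oo_type, "xs:string")
        lines.append(f'    <xs:element name="{attr}" type="{xsd_type}"/>')
    lines += ["  </xs:sequence>", "</xs:complexType>"]
    return "\n".join(lines)

print(class_to_complex_type("Customer",
                            {"name": "string", "age": "int", "balance": "float"}))
```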
{"title":"A rule-based conversion of an object-oriented database schema to a schema in XML schema","authors":"F. F. F. Peres, R. Mello","doi":"10.1109/ICDIM.2009.5356777","DOIUrl":"https://doi.org/10.1109/ICDIM.2009.5356777","url":null,"abstract":"Data interchange between different computer systems is a common task today, and the use of XML as an exchanging protocol has increasing. XML Schema recommendation allows the definition of an XML structure to be used for applications that communicate data each other. This paper proposes a rule-based approach for converting object-oriented (OO) database schemata to XML Schema schemata, as well as an algorithm that defines the application of the rules. We consider a schema mapping process in the OO→XML direction based on a detailed analysis of the OO database model concepts. A prototype tool was implemented to validate the process. Compared to related work, our proposal considers the mapping of all OODB model concepts to equivalent data structures in XML.","PeriodicalId":300287,"journal":{"name":"2009 Fourth International Conference on Digital Information Management","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133004445","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}