EfficientTreeMiner: Mining Frequent Induced Substructures from XML Documents without Candidate Generation

P. S. Thilagam, V. S. Ananthanarayana
Published in: 2006 International Conference on Advanced Computing and Communications, 2006-12-01
DOI: 10.1109/ADCOM.2006.4289951 (https://doi.org/10.1109/ADCOM.2006.4289951)
Citations: 1

Abstract

Tree structures are used extensively in domains such as XML databases, computational biology, pattern recognition, computer networks, Web mining, and multi-relational data mining. In this paper, we present EfficientTreeMiner, a computationally efficient algorithm that discovers all frequently occurring induced subtrees in a database of labeled, rooted, unordered trees. The proposed algorithm mines frequent subtrees without generating any candidate subtrees. Efficiency is achieved in two ways: by compressing the large database into a condensed data structure, the prefix string representation, which reduces space complexity, and by adopting a frequent immediate descendants method that avoids the costly generation of candidate sets. Experimental results show that our algorithm has lower time complexity than existing approaches and scales to mining both long and short frequent subtrees.
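
The abstract names two ingredients, a prefix string representation and frequent immediate descendants, but does not give their details. The Python sketch below is therefore only an illustration under stated assumptions, not the authors' algorithm: trees are assumed to be `(label, children)` tuples, the prefix string is assumed to be a preorder label sequence with `-1` as a backtrack marker, and "immediate descendants" is read as parent-child label pairs (the smallest induced subtrees). The names `prefix_string`, `edge_support`, and `frequent_edges` are hypothetical.

```python
from collections import Counter


def prefix_string(tree):
    """Encode a labeled rooted unordered tree as a canonical prefix string.

    Assumed encoding: labels in depth-first preorder, with "-1" marking a
    step back up to the parent. Sorting the child encodings makes the
    result independent of sibling order, as unordered trees require.
    Labels are assumed to contain no spaces and to differ from "-1".
    """
    label, children = tree
    parts = sorted(prefix_string(child) for child in children)
    return " ".join([label] + [part + " -1" for part in parts])


def edge_support(database):
    """For each (parent label, child label) pair, count the trees containing it."""
    support = Counter()
    for tree in database:
        seen = set()  # each pair is counted at most once per tree
        stack = [tree]
        while stack:
            label, children = stack.pop()
            for child in children:
                seen.add((label, child[0]))
                stack.append(child)
        support.update(seen)
    return support


def frequent_edges(database, min_support):
    """Keep only the parent-child pairs that occur in at least min_support trees."""
    return {edge: n for edge, n in edge_support(database).items() if n >= min_support}


if __name__ == "__main__":
    t1 = ("A", [("B", [("D", [])]), ("C", [])])
    t2 = ("A", [("C", []), ("B", [("D", [])])])  # same tree, siblings swapped
    t3 = ("A", [("B", []), ("E", [])])
    print(prefix_string(t1))
    print(prefix_string(t2))
    print(frequent_edges([t1, t2, t3], min_support=2))
```

In this sketch, `t1` and `t2` receive the same prefix string because they differ only in sibling order, and only parent-child pairs that meet the support threshold are kept. A pattern-growth miner could extend frequent patterns using only such frequent pairs rather than enumerating and testing candidates; how EfficientTreeMiner actually does this is not stated in the abstract.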