Automatic Extraction of Meaning from the Web

Rudi Cilibrasi, P. Vitányi
Published in: 2006 IEEE International Symposium on Information Theory
Publication date: 2006-07-09
DOI: 10.1109/ISIT.2006.261979 (https://doi.org/10.1109/ISIT.2006.261979)
Citations: 30

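Abstract

We consider similarity distances for two types of objects: literal objects that as such contain all of their meaning, like genomes or books, and names for objects. The latter may have literal embodiments like the first type, but may also be abstract like "red" or "Christianity". For the first type we consider a family of computable distance measures corresponding to parameters expressing similarity according to particular features between pairs of literal objects. For the second type we consider similarity distances generated by Web users corresponding to particular semantic relations between the (names for) the designated objects. For both families we give universal similarity distance measures, incorporating all particular distance measures in the family. In the first case the universal distance is based on compression and in the second case it is based on Google page counts related to search terms. In both cases experiments on a massive scale give evidence of the viability of the approaches.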
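
The abstract names two universal distances but does not state their formulas. The sketch below is an illustration only: it assumes the forms commonly given for the normalized compression distance and the normalized Google distance in the authors' related work. The function names are hypothetical, zlib stands in for the compressor, and page counts are passed in as plain integers (no search service is queried).

```python
import math
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (a stand-in for C(x))."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Compression-based distance between two literal objects.

    Assumed form: (C(xy) - min(C(x), C(y))) / max(C(x), C(y)).
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def ngd(fx: int, fy: int, fxy: int, n: int) -> float:
    """Page-count-based distance between two search terms.

    fx, fy: pages containing each term; fxy: pages containing both;
    n: assumed total number of indexed pages (a normalizing constant).
    Assumed form:
        (max(log fx, log fy) - log fxy) / (log n - min(log fx, log fy)).
    All counts must be positive, and n must exceed min(fx, fy).
    """
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


if __name__ == "__main__":
    text = b"the quick brown fox jumps over the lazy dog " * 50
    other = bytes(range(256)) * 8
    print("ncd(text, text): ", ncd(text, text))
    print("ncd(text, other):", ncd(text, other))
    # The page counts below are made-up numbers for illustration only.
    print("ngd:", ngd(100, 100, 10, 10**6))
```

Both functions return values near 0 for closely related inputs and larger values for unrelated ones. With a real compressor the compression-based value is only an approximation, so it can fall slightly outside the interval from 0 to 1.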