Mining tag similarity in folksonomies
Geir Solskinnsbakk, J. Gulla
SMUC '11, published 2011-10-28. DOI: 10.1145/2065023.2065037 (https://doi.org/10.1145/2065023.2065037)
Folksonomies are becoming increasingly popular, both among users, who find them simple and intuitive to use, and among scientists, who see them as interesting research objects. Folksonomies can be viewed as large, informal sources of semantics. Harnessing these semantics for search or concept extraction requires recognizing linguistic similarity between tags. In this paper we propose an approach that combines morpho-syntactic and semantic similarity measures, without using any external linguistic resources, to mine tag pairs that can be reduced to base tags. Our approach is based on the Levenshtein distance for morpho-syntactic similarity and tag signatures for semantic similarity. The evaluation of our approach, based on a data set crawled from Delicious, shows that we are able to recognize a wide range of linguistic variations with high quality.
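
The combination described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define tag signatures or give any thresholds, so the sketch assumes that a signature is the vector of counts of tags co-occurring with a tag in the same post, that signatures are compared by cosine similarity, and that the edit distance is normalized by the longer tag. The function names and the cut-offs `min_edit_sim` and `min_cosine` are hypothetical.

```python
from collections import Counter
from math import sqrt


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def edit_similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer tag."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def build_signatures(posts):
    """Assumed signature: counts of tags that co-occur with a tag in a post."""
    signatures = {}
    for tags in posts:
        unique = set(tags)
        for tag in unique:
            signature = signatures.setdefault(tag, Counter())
            for other in unique:
                if other != tag:
                    signature[other] += 1
    return signatures


def cosine(u, v) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(weight * v[key] for key, weight in u.items() if key in v)
    norm = (sqrt(sum(w * w for w in u.values()))
            * sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


def similar_tag_pairs(posts, min_edit_sim=0.7, min_cosine=0.5):
    """Return tag pairs that pass both the string and the signature test."""
    signatures = build_signatures(posts)
    tags = sorted(signatures)
    pairs = []
    for i, a in enumerate(tags):
        for b in tags[i + 1:]:
            if (edit_similarity(a, b) >= min_edit_sim
                    and cosine(signatures[a], signatures[b]) >= min_cosine):
                pairs.append((a, b))
    return pairs
```

Requiring both tests to pass is the point of the combination: string similarity alone would also pair unrelated tags that happen to be spelled alike, while the signature test checks that the two tags are used in similar contexts. Choosing which tag of a pair serves as the base tag is left out of the sketch, since the abstract does not say how that is decided.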