Statistical string similarity model for information linkage
A. Takasu
DOI: 10.2201/NIIPI.2009.6.7 (https://doi.org/10.2201/NIIPI.2009.6.7)
Published: 2009-03-01
Journal: ... Proceedings of the ... IEEE International Conference on Progress in Informatics and Computing. IEEE International Conference on Progress in Informatics and Computing; volume: 39 1; pages: 57
Citation count: 0
Abstract
This paper proposes a statistical string similarity model for approximate matching in information linkage. The proposed model extends the hidden Markov model, and because its parameters can be learned, the resulting string matching function adapts to various information sources. The main contribution of this paper is an efficient learning algorithm for estimating the parameters of the statistical similarity model. The algorithm is based on the Expectation-Maximization (EM) technique, with dynamic programming used to update the parameters during the EM process.
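The abstract does not specify the model's states or update equations, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes a much simpler stand-in: a memoryless (single-state) stochastic edit model in which each substitution, insertion, deletion, and the termination event has its own probability. The forward and backward tables are filled by dynamic programming, and one EM step re-estimates the probabilities from expected operation counts. All names (`init_params`, `em_step`, `joint_prob`, and so on) are hypothetical.

```python
import math
from collections import defaultdict

END = ("", "")  # termination event; "" stands for the empty symbol


def init_params(alphabet):
    """Uniform distribution over all edit operations plus termination."""
    symbols = [""] + sorted(alphabet)
    ops = [(a, b) for a in symbols for b in symbols]  # includes END
    return {op: 1.0 / len(ops) for op in ops}


def forward(x, y, p):
    """f[i][j]: probability of generating the prefixes x[:i] and y[:j]."""
    n, m = len(x), len(y)
    f = [[0.0] * (m + 1) for _ in range(n + 1)]
    f[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0:  # delete x[i-1]
                f[i][j] += p[(x[i - 1], "")] * f[i - 1][j]
            if j > 0:  # insert y[j-1]
                f[i][j] += p[("", y[j - 1])] * f[i][j - 1]
            if i > 0 and j > 0:  # substitute (or match) x[i-1] -> y[j-1]
                f[i][j] += p[(x[i - 1], y[j - 1])] * f[i - 1][j - 1]
    return f


def backward(x, y, p):
    """b[i][j]: probability of generating x[i:], y[j:] and then terminating."""
    n, m = len(x), len(y)
    b = [[0.0] * (m + 1) for _ in range(n + 1)]
    b[n][m] = p[END]
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i < n:
                b[i][j] += p[(x[i], "")] * b[i + 1][j]
            if j < m:
                b[i][j] += p[("", y[j])] * b[i][j + 1]
            if i < n and j < m:
                b[i][j] += p[(x[i], y[j])] * b[i + 1][j + 1]
    return b


def joint_prob(x, y, p):
    """Similarity score: joint probability of the string pair (x, y)."""
    return forward(x, y, p)[len(x)][len(y)] * p[END]


def em_step(pairs, p):
    """One EM iteration: expected operation counts (E), then normalize (M)."""
    counts = defaultdict(float)
    for x, y in pairs:
        f, b = forward(x, y, p), backward(x, y, p)
        z = f[len(x)][len(y)] * p[END]  # likelihood of this pair
        for i in range(len(x) + 1):
            for j in range(len(y) + 1):
                if i > 0:
                    op = (x[i - 1], "")
                    counts[op] += f[i - 1][j] * p[op] * b[i][j] / z
                if j > 0:
                    op = ("", y[j - 1])
                    counts[op] += f[i][j - 1] * p[op] * b[i][j] / z
                if i > 0 and j > 0:
                    op = (x[i - 1], y[j - 1])
                    counts[op] += f[i - 1][j - 1] * p[op] * b[i][j] / z
        counts[END] += 1.0  # every pair terminates exactly once
    total = sum(counts.values())
    return {op: counts[op] / total for op in p}


def log_likelihood(pairs, p):
    return sum(math.log(joint_prob(x, y, p)) for x, y in pairs)
```

In this sketch, training would repeat `em_step` on a list of known matching string pairs until `log_likelihood` stops improving, and `joint_prob` would then serve as the similarity score. Operations never seen in any alignment of the training pairs end up with probability zero, so a practical version would likely need smoothing and, for long strings, log-space arithmetic.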