Improving precision and recall for Soundex retrieval
Authors: David Holmes, David Holmes
Published in: Proceedings. International Conference on Information Technology: Coding and Computing
Publication date: 2002-04-08
DOI: 10.1109/ITCC.2002.1000354 (https://doi.org/10.1109/ITCC.2002.1000354)
Citation count: 97
Abstract
We present a phonetic algorithm for name searches that fuses existing techniques [the Soundex system of Russell and the techniques of J. Celko (1995) and U. Pfeifer et al.] and that introduces new features. This combination offers improved precision and recall. The described experiments assign multiple phonetic codes to each name. Counting common phonetic codes and digrams, the experiments implement the Dice coefficient to assign a similarity score between names. We use the Pfeifer corpus and relevance assessments to compare and contrast our experimental results with traditional techniques.
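
The scoring idea in the abstract (count the phonetic codes and digrams two names share, then apply the Dice coefficient) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify their multiple phonetic codes or new features, so the sketch assumes a single standard American Soundex code per name plus character digrams. The names `soundex`, `digrams`, `features`, and `dice` are hypothetical helpers chosen for illustration.

```python
# Illustrative sketch only: standard American Soundex plus a Dice score
# over shared features. The paper's actual code set is not given in the
# abstract, so one Soundex code per name is an assumption made here.

# Letter-to-digit table for standard Soundex.
CODES = {}
for digit, group in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in group:
        CODES[ch] = str(digit)


def soundex(name: str) -> str:
    """Return the 4-character American Soundex code, or "" if no letters."""
    letters = [c for c in name.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (result + "000")[:4]


def digrams(name: str) -> set:
    """Return the set of adjacent letter pairs in the lowercased name."""
    s = "".join(c for c in name.lower() if "a" <= c <= "z")
    return {s[i:i + 2] for i in range(len(s) - 1)}


def features(name: str) -> set:
    """Assumed feature set: one Soundex code plus all digrams."""
    return {soundex(name)} | digrams(name)


def dice(a: set, b: set) -> float:
    """Dice coefficient: 2 * |A & B| / (|A| + |B|)."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    for x, y in [("Robert", "Rupert"), ("Robert", "Smith")]:
        print(x, y, soundex(x), soundex(y), dice(features(x), features(y)))
```

Under these assumptions, plain Soundex only answers whether two codes are equal, while the Dice score ranks candidates: "Robert" and "Rupert" share the code R163 and the digrams "er" and "rt", which gives 3 common features out of 6 + 6 and a score of 0.5.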