ME-Match: Tonal Grouping Based Approach in Cross-Script Name Matching

Kyaw Zar Zar Phyu, Khin Mar Lar Tun
Published in: 2009 International Conference on Future Computer and Communication, 2009-04-03
DOI: 10.1109/ICFCC.2009.24 (https://doi.org/10.1109/ICFCC.2009.24)

Abstract

Matching names across different scripts could be immensely useful for news organizations, for author recognition in digital libraries, and for homeland security, yet automatic matching has so far not been possible. We propose a new approach, ME-Match, for matching proper names across different scripts. The central idea of the approach is to match names via their phoneme strings. The main steps in ME-Match are: (1) creating a bilingual pronunciation mapping; (2) tokenizing the query name; (3) transforming the query name into IPA form using a tonal grouping approach; (4) searching, for each IPA phoneme string of the query, for the candidate words in both scripts; (5) combining the candidate words into full name strings; and (6) searching for those names. Performance is measured with standard information-retrieval metrics: recall, precision, and F-measure.
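The steps listed in the abstract can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the authors' implementation. The abstract does not name the two scripts, so Latin and Burmese are assumed here purely for illustration. The pronunciation table, its IPA-like values, and the trailing tone digits are hypothetical. "Tonal grouping" is interpreted as treating tonal variants of the same syllable as one group, which is only one possible reading. All function names are invented for this sketch.

```python
from collections import defaultdict
from itertools import product

# Toy words in the second script (Burmese is assumed only for illustration).
KYAW_MY = "\u1000\u103b\u1031\u102c\u103a"
ZAR_MY = "\u1007\u102c"

# Hypothetical bilingual pronouncing mapping: written word -> IPA-like string.
# A trailing digit marks the tone; the values are illustrative, not from the paper.
PRONUNCIATION = {
    "kyaw": "tɕɔ1",
    "kyor": "tɕɔ2",
    KYAW_MY: "tɕɔ1",
    "zar": "za1",
    "za": "za2",
    ZAR_MY: "za1",
}


def tonal_group(ipa):
    """Assumed tonal grouping: drop the trailing tone digit so that
    tonal variants of one syllable share a single key."""
    return ipa.rstrip("0123456789")


def build_index(mapping):
    """Invert the mapping: tonal-group key -> all words in either script."""
    index = defaultdict(set)
    for word, ipa in mapping.items():
        index[tonal_group(ipa)].add(word)
    return index


def tokenize(name):
    """Split a name into lowercase whitespace-separated tokens."""
    return name.lower().split()


def candidate_names(query, mapping, index):
    """Map each query token to its tonal group, collect the words of that
    group in both scripts, and combine them into full name strings."""
    per_token = []
    for token in tokenize(query):
        if token not in mapping:
            return set()  # unknown token: no candidates in this toy sketch
        per_token.append(sorted(index[tonal_group(mapping[token])]))
    return {" ".join(words) for words in product(*per_token)}


def search(query, records, mapping, index):
    """Return the stored names that equal one of the candidate strings."""
    wanted = candidate_names(query, mapping, index)
    return [r for r in records if " ".join(tokenize(r)) in wanted]


def precision_recall_f1(retrieved, relevant):
    """Standard set-based precision, recall, and balanced F-measure."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    index = build_index(PRONUNCIATION)
    records = ["Kyaw Zar", f"{KYAW_MY} {ZAR_MY}", "Kyor Za", "Aung Min"]
    found = search("kyaw zar", records, PRONUNCIATION, index)
    print(found)
    print(precision_recall_f1(found, {"Kyaw Zar", f"{KYAW_MY} {ZAR_MY}"}))
```

In this sketch a coarser grouping raises recall (more spelling and tone variants are retrieved) at the cost of precision, which is why the three metrics are reported together.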