{"title":"Named Entity Transcription with Pair n-Gram Models","authors":"Martin Jansche, R. Sproat","doi":"10.3115/1699705.1699713","DOIUrl":null,"url":null,"abstract":"We submitted results for each of the eight shared tasks. Except for Japanese name kanji restoration, which uses a noisy channel model, our Standard Run submissions were produced by generative long-range pair n-gram models, which we mostly augmented with publicly available data (either from LDC datasets or mined from Wikipedia) for the Non-Standard Runs.","PeriodicalId":262513,"journal":{"name":"NEWS@IJCNLP","volume":"45 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"NEWS@IJCNLP","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3115/1699705.1699713","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 14
Abstract
We submitted results for each of the eight shared tasks. Except for Japanese name kanji restoration, which uses a noisy channel model, our Standard Run submissions were produced by generative long-range pair n-gram models, which we mostly augmented with publicly available data (either from LDC datasets or mined from Wikipedia) for the Non-Standard Runs.