Authors: Fernando Batista, I. Trancoso, N. Mamede
Venue: 2009 IEEE Workshop on Automatic Speech Recognition & Understanding
Published: 2009-12-01
DOI: 10.1109/ASRU.2009.5373371 (https://doi.org/10.1109/ASRU.2009.5373371)
Comparing automatic rich transcription for Portuguese, Spanish and English Broadcast News
This paper describes and evaluates a language-independent approach for automatically enriching speech recognition output with punctuation marks and capitalization information. The two tasks are treated as two separate classification problems, both modeled with a maximum entropy approach, and the results are comparable to the state of the art. The language independence of the approach is demonstrated by experiments on Portuguese, Spanish and English Broadcast News corpora. This paper provides the first comparative study of the three languages on these tasks.
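To make the idea of punctuation as a classification problem concrete, the following is a minimal sketch, not the authors' implementation. It assumes each slot after a word is labeled `none`, `comma`, or `period`, and fits a maximum entropy (multinomial logistic) model with plain stochastic gradient ascent. The feature set (`w=`, `next=`, `bias`), the toy training streams, and the hyperparameters are all illustrative assumptions; the abstract does not specify the features or the training procedure.

```python
import math
from collections import defaultdict

LABELS = ["none", "comma", "period"]  # assumed label set, for illustration only

def features(tokens, i):
    """Binary features for the slot after tokens[i] (hypothetical feature set)."""
    nxt = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
    return [f"w={tokens[i]}", f"next={nxt}", "bias"]

def predict_proba(weights, feats):
    """Maximum entropy model: p(y|x) is proportional to exp(sum of active (feature, y) weights)."""
    scores = [sum(weights[(f, y)] for f in feats) for y in LABELS]
    top = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return {y: e / z for y, e in zip(LABELS, exps)}

def train(streams, epochs=200, lr=0.5):
    """Stochastic gradient ascent on the conditional log-likelihood (no regularization)."""
    examples = [(features(toks, i), labs[i])
                for toks, labs in streams for i in range(len(toks))]
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, gold in examples:
            probs = predict_proba(weights, feats)
            for y in LABELS:
                # Gradient for an active feature: observed count minus expected count.
                grad = (1.0 if y == gold else 0.0) - probs[y]
                for f in feats:
                    weights[(f, y)] += lr * grad
    return weights

def punctuate(weights, tokens):
    """Pick the most probable label for the slot after each token."""
    out = []
    for i in range(len(tokens)):
        probs = predict_proba(weights, features(tokens, i))
        out.append(max(probs, key=probs.get))
    return out

# Toy, made-up training streams: lowercased words plus the mark that follows each word.
TRAIN = [
    ("good evening this is the news".split(),
     ["none", "period", "none", "none", "none", "period"]),
    ("however the rain continues".split(),
     ["comma", "none", "none", "period"]),
    ("this is the weather".split(),
     ["none", "none", "none", "period"]),
]

weights = train(TRAIN)
print(punctuate(weights, "however the rain continues".split()))
```

Capitalization can be framed the same way by swapping the label set (for example, lowercase versus first-letter uppercase) and attaching the label to the word itself rather than to the slot after it. A realistic system would need far more data, richer features, and regularization.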