Title: A bilingual comparison of MaxEnt- and RNN-based punctuation restoration in speech transcripts
Authors: Máté Ákos Tündik, Balázs Tarján, György Szaszák
Venue: 2017 8th IEEE International Conference on Cognitive Infocommunications (CogInfoCom)
Published: 2017-09-01
DOI: 10.1109/COGINFOCOM.2017.8268227
Citations: 8
Abstract
Closed captioning is a common way to make TV programs more accessible to people who are hearing impaired or hard of hearing, and it is an application relevant to cognitive infocommunication. However, live captions produced by automatic speech recognition systems usually lack punctuation, which makes them hard to follow. In this paper, we compare Maximum Entropy (MaxEnt) and Recurrent Neural Network (RNN) based punctuation restoration models on two closed captioning tasks, in both real-time and off-line setups. We present the first results on restoring punctuation for Hungarian broadcast speech, where the RNN significantly outperforms our MaxEnt baseline system. We also evaluate our approach on TED talks from the IWSLT English dataset, where it yields results comparable to state-of-the-art systems.
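
Punctuation restoration is often framed as sequence labeling: each word of the unpunctuated transcript receives a label naming the punctuation mark that follows it, and a classifier such as a MaxEnt model or an RNN predicts those labels. The sketch below illustrates only this data framing, not the models compared in the paper. The three-mark label set and the helper names `text_to_labels` and `labels_to_text` are assumptions made for illustration; the abstract does not state which marks are restored.

```python
import re

# Hypothetical label set for illustration; the paper's actual inventory is not given here.
PUNCT_LABELS = {",": "COMMA", ".": "PERIOD", "?": "QUESTION"}
LABEL_TO_PUNCT = {label: mark for mark, label in PUNCT_LABELS.items()}
NO_PUNCT = "O"


def text_to_labels(text):
    """Turn punctuated text into (lowercased word, label) pairs.

    Each label names the punctuation mark that follows the word, or "O" for none.
    This mimics building training targets from punctuated reference text.
    """
    pairs = []
    for token in re.findall(r"[^\s,.?]+|[,.?]", text):
        if token in PUNCT_LABELS:
            # Attach the mark to the preceding word; ignore stray or repeated marks.
            if pairs and pairs[-1][1] == NO_PUNCT:
                pairs[-1] = (pairs[-1][0], PUNCT_LABELS[token])
        else:
            pairs.append((token.lower(), NO_PUNCT))
    return pairs


def labels_to_text(pairs):
    """Rebuild punctuated, sentence-cased text from (word, label) pairs."""
    out = []
    capitalize = True  # the first word starts a sentence
    for word, label in pairs:
        piece = word.capitalize() if capitalize else word
        piece += LABEL_TO_PUNCT.get(label, "")
        out.append(piece)
        # Capitalize the next word only after sentence-final marks.
        capitalize = label in ("PERIOD", "QUESTION")
    return " ".join(out)


if __name__ == "__main__":
    pairs = text_to_labels("Hello, how are you? Fine.")
    print(pairs)
    print(labels_to_text(pairs))
```

In this framing, a punctuation model takes only the word sequence (the first element of each pair) as input and predicts the second element; `labels_to_text` then renders the predictions as a readable caption.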