Wide Residual Networks 1D for Automatic Text Punctuation

IberSPEECH Conference. Pub Date: 2018-11-21. DOI: 10.21437/IBERSPEECH.2018-62

Jorge Llombart, A. Miguel, A. Ortega, Eduardo Lleida Solano


Citations: 4

Abstract




Documentation and analysis of multimedia resources usually require a large pipeline with many stages. Text without punctuation is commonly produced at some point in this pipeline, although later steps, such as those related to natural language processing, may need accurate punctuation. This paper focuses on the task of recovering pause punctuation from text without prosodic or acoustic information. We propose using Wide Residual Networks to predict which words should be followed by a comma or a full stop in text from which punctuation has been removed. Wide Residual Networks are a well-known technique in image processing, but they are not commonly used in other areas such as speech or natural language processing. We propose them because they show great stability and the ability to capture both long and short contextual dependencies in deep structures. Unlike in image processing, we use 1-dimensional convolutions, because in text processing we focus only on the temporal dimension. Moreover, this architecture allows us to work with both past and future context. This paper compares this architecture with Long Short-Term Memory cells, which are used in this task, and also combines the two architectures to obtain better results than either of them separately.
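The abstract gives no layer sizes or code, so the following is only a minimal sketch of the general idea, not the authors' implementation. It shows a residual block built from 1-dimensional convolutions over the word (time) axis, with a symmetric kernel so each position sees both past and future words. All names (`conv1d_same`, `residual_block_1d`, `random_kernel`), the kernel size, the channel counts, and the `widen` factor are assumptions chosen for illustration, and batch normalization, which Wide Residual Networks normally include, is omitted to keep the example in plain Python.

```python
import random

def conv1d_same(x, weights, bias):
    """1D convolution along time with zero padding, so output length == input length.

    x:       list of T frames, each a list of c_in floats
    weights: weights[k][i][o] for an odd kernel size K, c_in inputs, c_out outputs
    bias:    list of c_out floats
    """
    T, K = len(x), len(weights)
    c_in, c_out = len(weights[0]), len(bias)
    half = K // 2
    out = []
    for t in range(T):
        frame = list(bias)
        for k in range(K):
            s = t + k - half  # s < t reads past context, s > t reads future context
            if 0 <= s < T:
                for i in range(c_in):
                    xi = x[s][i]
                    for o in range(c_out):
                        frame[o] += xi * weights[k][i][o]
        out.append(frame)
    return out

def relu(x):
    return [[max(0.0, v) for v in frame] for frame in x]

def residual_block_1d(x, w1, b1, w2, b2):
    """Pre-activation residual block: x + conv(relu(conv(relu(x))))."""
    h = conv1d_same(relu(x), w1, b1)
    h = conv1d_same(relu(h), w2, b2)
    # Identity shortcut: the block only has to learn a correction to its input.
    return [[a + b for a, b in zip(fx, fh)] for fx, fh in zip(x, h)]

def random_kernel(rng, k, c_in, c_out, scale=0.1):
    """Hypothetical initializer: small uniform weights of shape [k][c_in][c_out]."""
    return [[[rng.uniform(-scale, scale) for _ in range(c_out)]
             for _ in range(c_in)] for _ in range(k)]

# Toy usage: 6 "words", each a feature vector of base_channels * widen values.
rng = random.Random(0)
T, base_channels, widen, K = 6, 2, 2, 3  # assumed sizes, not from the paper
C = base_channels * widen                # "wide" = more channels per layer
x = [[rng.uniform(-1, 1) for _ in range(C)] for _ in range(T)]
w1, w2 = random_kernel(rng, K, C, C), random_kernel(rng, K, C, C)
b1, b2 = [0.0] * C, [0.0] * C
y = residual_block_1d(x, w1, b1, w2, b2)
print(len(y), len(y[0]))
```

Because the output keeps one frame per input word, a per-word classifier (for example no punctuation, comma, or full stop) could be placed on top of a stack of such blocks. Stacking blocks also enlarges the span of neighbouring words each position can see, which is one way to read the abstract's claim about long and short contextual dependencies.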


Source journal

IberSPEECH Conference

Self-citation rate

0.00%

Publication count