An Alternative Deep Feature Approach to Line Level Keyword Spotting

George Retsinas, G. Louloudis, N. Stamatopoulos, Giorgos Sfikas, B. Gatos
{"title":"An Alternative Deep Feature Approach to Line Level Keyword Spotting","authors":"George Retsinas, G. Louloudis, N. Stamatopoulos, Giorgos Sfikas, B. Gatos","doi":"10.1109/CVPR.2019.01294","DOIUrl":null,"url":null,"abstract":"Keyword spotting (KWS) is defined as the problem of detecting all instances of a given word, provided by the user either as a query word image (Query-by-Example, QbE) or a query word string (Query-by-String, QbS) in a body of digitized documents. Keyword detection is typically preceded by a preprocessing step where the text is segmented into text lines (line-level KWS). Methods following this paradigm are monopolized by test-time computationally expensive handwritten text recognition (HTR)-based approaches; furthermore, they typically cannot handle image queries (QbE). In this work, we propose a time and storage-efficient, deep feature-based approach that enables both the image and textual search options. Three distinct components, all modeled as neural networks, are combined: normalization, feature extraction and representation of image and textual input into a common space. These components, even if designed on word level image representations, collaborate in order to achieve an efficient line level keyword spotting system. The experimental results indicate that the proposed system is on par with state-of-the-art KWS methods.","PeriodicalId":6711,"journal":{"name":"2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"1 1","pages":"12650-12658"},"PeriodicalIF":0.0000,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVPR.2019.01294","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 9

Abstract

Keyword spotting (KWS) is defined as the problem of detecting all instances of a given word, provided by the user either as a query word image (Query-by-Example, QbE) or a query word string (Query-by-String, QbS) in a body of digitized documents. Keyword detection is typically preceded by a preprocessing step where the text is segmented into text lines (line-level KWS). Methods following this paradigm are monopolized by test-time computationally expensive handwritten text recognition (HTR)-based approaches; furthermore, they typically cannot handle image queries (QbE). In this work, we propose a time and storage-efficient, deep feature-based approach that enables both the image and textual search options. Three distinct components, all modeled as neural networks, are combined: normalization, feature extraction and representation of image and textual input into a common space. These components, even if designed on word level image representations, collaborate in order to achieve an efficient line level keyword spotting system. The experimental results indicate that the proposed system is on par with state-of-the-art KWS methods.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
一种行级关键字定位的深度特征替代方法
关键字查找(KWS)被定义为检测给定单词的所有实例的问题,这些单词由用户在数字化文档主体中以查询词图像(按示例查询)或查询词字符串(按字符串查询)的形式提供。关键字检测之前通常有一个预处理步骤,其中文本被分割成文本行(行级KWS)。遵循这种范式的方法被基于测试时间的计算昂贵的手写文本识别(HTR)方法所垄断;此外,它们通常不能处理图像查询(QbE)。在这项工作中,我们提出了一种时间和存储效率高、基于深度特征的方法,该方法支持图像和文本搜索选项。三个不同的组成部分,所有建模为神经网络,结合:归一化,特征提取和表示图像和文本输入到一个共同的空间。这些组件,即使设计在字级图像表示,协作,以实现一个有效的行级关键字定位系统。实验结果表明,所提出的系统与最先进的KWS方法相当。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Multi-Level Context Ultra-Aggregation for Stereo Matching Leveraging Heterogeneous Auxiliary Tasks to Assist Crowd Counting Incremental Object Learning From Contiguous Views Progressive Teacher-Student Learning for Early Action Prediction Inverse Discriminative Networks for Handwritten Signature Verification
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1