Natural Scene Text to Voice Signal Conversion for Visually Impaired using Deep Neural Network

R. Kapoor, M. Sushama, Bhavani Reddy Andem, Akhila Sri Phani Sai Sindhura S.
Published in: 2021 Innovations in Power and Advanced Computing Technologies (i-PACT)
Publication date: 2021-11-27
DOI: 10.1109/i-PACT52855.2021.9696523 (https://doi.org/10.1109/i-PACT52855.2021.9696523)
Citations: 0

Abstract

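Deep neural network architectures can be used to detect and recognize text in natural-scene or camera images. Converting this text to voice signals would be very helpful for people with partial or no vision. In this work, selective-search-based segmentation is combined with a VGG deep neural network architecture for text detection. The Py-Tesseract optical character recognizer (OCR) then recognizes the detected text, which is converted to voice signals. The system can be beneficial for reading roadside or corridor signboards, adding some independence to the lives of people with special needs. The system can also be extended with a search mode that looks for a particular text in the nearby surroundings.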
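The abstract gives no implementation details, so the following is only a minimal sketch of how the described stages (region proposal, text/non-text classification, OCR, speech output, and the optional search mode) might be chained. All names here (`Stages`, `read_scene`, `matches`) and the `threshold` and `min_ratio` defaults are hypothetical, not taken from the paper. Each stage is passed in as a plain callable so the control flow can be run without any vision or speech library installed.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Any, Callable, Iterable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height); an assumed convention


@dataclass
class Stages:
    """Pluggable stages; every field is a hypothetical stand-in."""
    propose: Callable[[Any], Iterable[Box]]  # e.g. selective-search proposals
    is_text: Callable[[Any, Box], float]     # e.g. VGG-based text score in [0, 1]
    recognize: Callable[[Any, Box], str]     # e.g. OCR on the cropped region
    speak: Callable[[str], None]             # e.g. a text-to-speech engine


def matches(text: str, query: str, min_ratio: float) -> bool:
    """Case-insensitive match that tolerates small OCR errors."""
    t, q = text.lower(), query.lower()
    return q in t or SequenceMatcher(None, t, q).ratio() >= min_ratio


def read_scene(image: Any, stages: Stages, threshold: float = 0.5,
               query: Optional[str] = None, min_ratio: float = 0.7) -> List[str]:
    """Speak every detected text, or only texts matching `query` (search mode)."""
    spoken: List[str] = []
    for box in stages.propose(image):
        # Discard regions the classifier does not consider to be text.
        if stages.is_text(image, box) < threshold:
            continue
        text = stages.recognize(image, box).strip()
        if not text:
            continue
        # Search mode: skip anything that does not match the requested text.
        if query is not None and not matches(text, query, min_ratio):
            continue
        stages.speak(text)
        spoken.append(text)
    return spoken
```

In a real system, `recognize` could for instance wrap `pytesseract.image_to_string` on the cropped region; that wiring, and the choice of speech engine, are assumptions left out of the sketch. Passing `query="exit"` would restrict the spoken output to signs that approximately match that word.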