Natural Scene Text to Voice Signal Conversion for Visually Impaired using Deep Neural Network

2021 Innovations in Power and Advanced Computing Technologies (i-PACT) Pub Date : 2021-11-27 DOI:10.1109/i-PACT52855.2021.9696523

R. Kapoor, M. Sushama, Bhavani Reddy Andem, Akhila Sri Phani Sai Sindhura S.


Citations: 0

Abstract

Deep neural network architectures can be used to detect and recognize text in natural-scene or camera images. Converting this text to voice signals would be very helpful for people with partial or no vision. In this work, selective-search-based segmentation combined with a VGG deep neural network architecture is used for text detection. The Py-Tesseract optical character recognizer then recognizes the detected text, which is converted to voice signals. The system can help in reading roadside or corridor boards, adding some independence to the lives of people with special needs. The system could also be extended with a search mode that looks for a particular text in the nearby surroundings.
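The abstract describes a four-stage pipeline: region proposals, a text/non-text classifier, character recognition, and speech output. The sketch below shows only how such stages might be glued together; it is not the authors' implementation. All names (`Region`, `select_text_regions`, `scene_to_speech`), the score and overlap thresholds, and the top-to-bottom reading order are assumptions made for illustration. The classifier, recognizer, and speech engine are passed in as plain callables, so a real system could plug in a VGG-based scorer, a Py-Tesseract call on the cropped region, and any text-to-speech engine.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


@dataclass
class Region:
    box: Box
    text_score: float  # classifier confidence that the region contains text


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0


def select_text_regions(
    regions: Iterable[Region],
    min_score: float = 0.5,  # assumed threshold, not from the paper
    max_iou: float = 0.3,    # assumed overlap limit, not from the paper
) -> List[Region]:
    """Keep confident text regions and drop overlapping duplicates (greedy NMS)."""
    kept: List[Region] = []
    for r in sorted(regions, key=lambda r: r.text_score, reverse=True):
        if r.text_score < min_score:
            break  # remaining regions score even lower
        if all(iou(r.box, k.box) <= max_iou for k in kept):
            kept.append(r)
    # Rough reading order: top-to-bottom, then left-to-right.
    return sorted(kept, key=lambda r: (r.box[1], r.box[0]))


def scene_to_speech(
    proposals: Iterable[Box],
    score_region: Callable[[Box], float],  # e.g. a CNN text/non-text scorer
    recognize: Callable[[Box], str],       # e.g. OCR on the cropped region
    speak: Callable[[str], None],          # e.g. a text-to-speech engine
) -> str:
    """Run proposals through scoring, filtering, OCR, and speech output."""
    regions = [Region(b, score_region(b)) for b in proposals]
    words = [recognize(r.box).strip() for r in select_text_regions(regions)]
    sentence = " ".join(w for w in words if w)
    if sentence:
        speak(sentence)
    return sentence
```

With stub components, for example a dictionary of scores and a dictionary of recognized strings keyed by box, `scene_to_speech` returns the joined text and calls `speak` once. Injecting the stages as callables keeps the glue logic testable without a camera, a trained model, or an audio device.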
