Natural Scene Text to Voice Signal Conversion for Visually Impaired using Deep Neural Network
R. Kapoor, M. Sushama, Bhavani Reddy Andem, Akhila Sri Phani Sai Sindhura S.
2021 Innovations in Power and Advanced Computing Technologies (i-PACT), published 2021-11-27
DOI: 10.1109/i-PACT52855.2021.9696523
Citation count: 0
Abstract
Deep neural network architectures can be used to detect and recognize text in natural scene or camera images. Converting this text to voice signals would be very helpful for persons with partial or no vision. In this work, selective search-based segmentation together with a VGG deep neural network architecture is used for text detection. The Py-Tesseract optical character recognizer then recognizes the detected text, which is converted to voice signals. The system can be beneficial for reading roadside or corridor signboards, thus adding some independence to the lives of people with special needs. The system can also be extended with a search mode that looks for a particular text in the nearby location.
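The stages named in the abstract (region proposal, text/non-text classification, character recognition, speech output, and the suggested search mode) can be sketched as a small pipeline. This is only an illustrative outline, not the authors' implementation: the names `TextRegion`, `read_scene`, `find_text`, `announce`, and the `min_score` threshold are hypothetical, and each stage is passed in as a plain callable so the skeleton runs without any particular vision or speech library.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List, Tuple

# Hypothetical box layout: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class TextRegion:
    """A detected region together with the text recognized inside it."""
    box: Box
    text: str


def read_scene(
    image: Any,
    propose_regions: Callable[[Any], Iterable[Box]],
    text_score: Callable[[Any, Box], float],
    recognize: Callable[[Any, Box], str],
    min_score: float = 0.5,  # assumed threshold, not taken from the paper
) -> List[TextRegion]:
    """Run detection then recognition over one image.

    propose_regions: candidate boxes (e.g. from selective search).
    text_score:      probability that a box contains text (e.g. a VGG classifier).
    recognize:       OCR on the cropped box (e.g. a Tesseract wrapper).
    """
    regions: List[TextRegion] = []
    for box in propose_regions(image):
        if text_score(image, box) < min_score:
            continue  # classifier says this candidate is not text
        text = recognize(image, box).strip()
        if text:  # skip boxes where OCR returned nothing
            regions.append(TextRegion(box, text))
    return regions


def find_text(regions: Iterable[TextRegion], query: str) -> List[TextRegion]:
    """Search mode: keep only regions whose text contains the query (case-insensitive)."""
    needle = query.casefold()
    return [r for r in regions if needle in r.text.casefold()]


def announce(regions: Iterable[TextRegion], speak: Callable[[str], None]) -> None:
    """Send each recognized string to a text-to-speech callable."""
    for region in regions:
        speak(region.text)
```

In a real system the callables might wrap OpenCV's selective search from the contrib `ximgproc` module, a trained VGG classifier, `pytesseract.image_to_string`, and a speech engine such as `pyttsx3`. The abstract does not say which speech library or thresholds were used, so those choices are assumptions.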