iReader: An Intelligent Reader System for the Visually Impaired

J. G, A. Azar, B. Qureshi, Nashwa Ahmad Kamal
DOI: 10.1109/CDMA54072.2022.00036 (https://doi.org/10.1109/CDMA54072.2022.00036)
Published in: 2022 7th International Conference on Data Science and Machine Learning Applications (CDMA), 2022-03-01
Citations: 1

Abstract

Reading printed text is difficult for visually impaired persons. Non-visual reading materials such as Braille are available as blind-aiding technology, among many others. In recent years, many assistive devices and technologies have been developed to help visually impaired persons read. Most of these research works and products support reading from printed text-based manuscripts only. Because of this limitation, a visually impaired person may be unable to describe and comprehend a printed image. In this paper, we develop iReader, an Intelligent Reader system that not only helps a visually impaired reader to read but also vocally describes an image that appears in the printed text. A Convolutional Neural Network (CNN) is employed to extract features from the printed image and its caption. A Long Short-Term Memory (LSTM) network is used to train the model that describes the image data. The resulting description is delivered as a voice message using Text-to-Speech and read aloud to the user. The efficiency of the LSTM model is examined using ResNet50 and VGG16. The experimental results show that the LSTM-based training model delivers the best prediction of a picture's description with an accuracy of 83
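
The flow described in the abstract (image features, word-by-word caption generation, then speech output) can be sketched roughly as follows. This is only an illustration of the control flow, not the authors' implementation: `greedy_caption`, `describe_page`, `toy_features`, and `toy_scores` are hypothetical names, the feature extractor and word scorer are toy stand-ins for a trained CNN and LSTM, and greedy decoding is an assumed decoding strategy that the abstract does not specify.

```python
from typing import Callable, Dict, List, Sequence

START, END = "<start>", "<end>"  # assumed sentinel tokens


def greedy_caption(
    features: Sequence[float],
    next_word_scores: Callable[[Sequence[float], List[str]], Dict[str, float]],
    max_len: int = 20,
) -> str:
    """Build a caption word by word, always taking the highest-scoring word."""
    words = [START]
    for _ in range(max_len):
        scores = next_word_scores(features, words)
        best = max(scores, key=scores.get)
        if best == END:
            break
        words.append(best)
    return " ".join(words[1:])  # drop the start token


def describe_page(text, image, extract_features, next_word_scores, speak):
    """Combine the page text with a generated image description and speak it."""
    caption = greedy_caption(extract_features(image), next_word_scores)
    message = f"{text} Image description: {caption}."
    speak(message)  # in a real system this would call a text-to-speech engine
    return message


# Toy stand-ins (assumptions): a real system would use a CNN such as
# ResNet50 or VGG16 for features and a trained LSTM for word scores.
def toy_features(image):
    return [sum(row) / len(row) for row in image]


def toy_scores(features, words):
    script = ["a", "dog", "runs", END]
    target = script[min(len(words) - 1, len(script) - 1)]
    return {w: (1.0 if w == target else 0.0) for w in script}


spoken = []  # collects "spoken" messages instead of producing audio
result = describe_page(
    "Chapter one.", [[0, 1], [1, 1]], toy_features, toy_scores, spoken.append
)
```

Passing the encoder, decoder, and speech step as callables keeps the sketch independent of any particular deep learning or text-to-speech library, so each stage could be swapped for a real model.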