彩色图像中嘴巴分割的一种轻量级深度学习方法

IF 12.3 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Applied Computing and Informatics Pub Date : 2022-12-05 DOI:10.1108/aci-08-2022-0225
Kittisak Chotikkakamthorn, P. Ritthipravat, Worapan Kusakunniran, Pimchanok Tuakta, Paitoon Benjapornlert
{"title":"彩色图像中嘴巴分割的一种轻量级深度学习方法","authors":"Kittisak Chotikkakamthorn, P. Ritthipravat, Worapan Kusakunniran, Pimchanok Tuakta, Paitoon Benjapornlert","doi":"10.1108/aci-08-2022-0225","DOIUrl":null,"url":null,"abstract":"PurposeMouth segmentation is one of the challenging tasks of development in lip reading applications due to illumination, low chromatic contrast and complex mouth appearance. Recently, deep learning methods effectively solved mouth segmentation problems with state-of-the-art performances. This study presents a modified Mobile DeepLabV3 based technique with a comprehensive evaluation based on mouth datasets.Design/methodology/approachThis paper presents a novel approach to mouth segmentation by Mobile DeepLabV3 technique with integrating decode and auxiliary heads. Extensive data augmentation, online hard example mining (OHEM) and transfer learning have been applied. CelebAMask-HQ and the mouth dataset from 15 healthy subjects in the department of rehabilitation medicine, Ramathibodi hospital, are used in validation for mouth segmentation performance.FindingsExtensive data augmentation, OHEM and transfer learning had been performed in this study. This technique achieved better performance on CelebAMask-HQ than existing segmentation techniques with a mean Jaccard similarity coefficient (JSC), mean classification accuracy and mean Dice similarity coefficient (DSC) of 0.8640, 93.34% and 0.9267, respectively. This technique also achieved better performance on the mouth dataset with a mean JSC, mean classification accuracy and mean DSC of 0.8834, 94.87% and 0.9367, respectively. The proposed technique achieved inference time usage per image of 48.12 ms.Originality/valueThe modified Mobile DeepLabV3 technique was developed with extensive data augmentation, OHEM and transfer learning. This technique gained better mouth segmentation performance than existing techniques. This makes it suitable for implementation in further lip-reading applications.","PeriodicalId":37348,"journal":{"name":"Applied Computing and Informatics","volume":" ","pages":""},"PeriodicalIF":12.3000,"publicationDate":"2022-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A lightweight deep learning approach to mouth segmentation in color images\",\"authors\":\"Kittisak Chotikkakamthorn, P. Ritthipravat, Worapan Kusakunniran, Pimchanok Tuakta, Paitoon Benjapornlert\",\"doi\":\"10.1108/aci-08-2022-0225\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"PurposeMouth segmentation is one of the challenging tasks of development in lip reading applications due to illumination, low chromatic contrast and complex mouth appearance. Recently, deep learning methods effectively solved mouth segmentation problems with state-of-the-art performances. This study presents a modified Mobile DeepLabV3 based technique with a comprehensive evaluation based on mouth datasets.Design/methodology/approachThis paper presents a novel approach to mouth segmentation by Mobile DeepLabV3 technique with integrating decode and auxiliary heads. Extensive data augmentation, online hard example mining (OHEM) and transfer learning have been applied. CelebAMask-HQ and the mouth dataset from 15 healthy subjects in the department of rehabilitation medicine, Ramathibodi hospital, are used in validation for mouth segmentation performance.FindingsExtensive data augmentation, OHEM and transfer learning had been performed in this study. This technique achieved better performance on CelebAMask-HQ than existing segmentation techniques with a mean Jaccard similarity coefficient (JSC), mean classification accuracy and mean Dice similarity coefficient (DSC) of 0.8640, 93.34% and 0.9267, respectively. This technique also achieved better performance on the mouth dataset with a mean JSC, mean classification accuracy and mean DSC of 0.8834, 94.87% and 0.9367, respectively. The proposed technique achieved inference time usage per image of 48.12 ms.Originality/valueThe modified Mobile DeepLabV3 technique was developed with extensive data augmentation, OHEM and transfer learning. This technique gained better mouth segmentation performance than existing techniques. This makes it suitable for implementation in further lip-reading applications.\",\"PeriodicalId\":37348,\"journal\":{\"name\":\"Applied Computing and Informatics\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":12.3000,\"publicationDate\":\"2022-12-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Applied Computing and Informatics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1108/aci-08-2022-0225\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied Computing and Informatics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1108/aci-08-2022-0225","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

摘要

目的唇形分割是唇读应用发展中具有挑战性的任务之一,主要是由于光照、低颜色对比度和复杂的嘴形。近年来,深度学习方法以最先进的性能有效地解决了口腔分割问题。本研究提出了一种改进的基于移动DeepLabV3的技术,并基于口腔数据集进行了综合评估。设计/方法/方法本文提出了一种基于移动DeepLabV3技术的口部分割新方法,该方法集成了解码和辅助头。广泛的数据扩充、在线硬例挖掘(OHEM)和迁移学习已被应用。CelebAMask-HQ和来自Ramathibodi医院康复医学科的15名健康受试者的口腔数据集用于口腔分割性能的验证。本研究进行了大量的数据扩充、OHEM和迁移学习。该方法在CelebAMask-HQ上的平均Jaccard相似系数(JSC)、平均分类准确率(93.34%)和平均Dice相似系数(DSC)分别为0.8640、93.34%和0.9267,优于现有的分割技术。该技术在口腔数据集上也取得了较好的性能,平均JSC、平均分类精度和平均DSC分别为0.8834、94.87%和0.9367。该技术实现了每张图像48.12 ms的推理时间使用。原创性/价值改进的移动DeepLabV3技术采用了广泛的数据增强、OHEM和迁移学习。该方法获得了比现有方法更好的嘴巴分割性能。这使得它适合在进一步的唇读应用中实现。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
A lightweight deep learning approach to mouth segmentation in color images
PurposeMouth segmentation is one of the challenging tasks of development in lip reading applications due to illumination, low chromatic contrast and complex mouth appearance. Recently, deep learning methods effectively solved mouth segmentation problems with state-of-the-art performances. This study presents a modified Mobile DeepLabV3 based technique with a comprehensive evaluation based on mouth datasets.Design/methodology/approachThis paper presents a novel approach to mouth segmentation by Mobile DeepLabV3 technique with integrating decode and auxiliary heads. Extensive data augmentation, online hard example mining (OHEM) and transfer learning have been applied. CelebAMask-HQ and the mouth dataset from 15 healthy subjects in the department of rehabilitation medicine, Ramathibodi hospital, are used in validation for mouth segmentation performance.FindingsExtensive data augmentation, OHEM and transfer learning had been performed in this study. This technique achieved better performance on CelebAMask-HQ than existing segmentation techniques with a mean Jaccard similarity coefficient (JSC), mean classification accuracy and mean Dice similarity coefficient (DSC) of 0.8640, 93.34% and 0.9267, respectively. This technique also achieved better performance on the mouth dataset with a mean JSC, mean classification accuracy and mean DSC of 0.8834, 94.87% and 0.9367, respectively. The proposed technique achieved inference time usage per image of 48.12 ms.Originality/valueThe modified Mobile DeepLabV3 technique was developed with extensive data augmentation, OHEM and transfer learning. This technique gained better mouth segmentation performance than existing techniques. This makes it suitable for implementation in further lip-reading applications.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Applied Computing and Informatics
Applied Computing and Informatics Computer Science-Information Systems
CiteScore
12.20
自引率
0.00%
发文量
0
审稿时长
39 weeks
期刊介绍: Applied Computing and Informatics aims to be timely in disseminating leading-edge knowledge to researchers, practitioners and academics whose interest is in the latest developments in applied computing and information systems concepts, strategies, practices, tools and technologies. In particular, the journal encourages research studies that have significant contributions to make to the continuous development and improvement of IT practices in the Kingdom of Saudi Arabia and other countries. By doing so, the journal attempts to bridge the gap between the academic and industrial community, and therefore, welcomes theoretically grounded, methodologically sound research studies that address various IT-related problems and innovations of an applied nature. The journal will serve as a forum for practitioners, researchers, managers and IT policy makers to share their knowledge and experience in the design, development, implementation, management and evaluation of various IT applications. Contributions may deal with, but are not limited to: • Internet and E-Commerce Architecture, Infrastructure, Models, Deployment Strategies and Methodologies. • E-Business and E-Government Adoption. • Mobile Commerce and their Applications. • Applied Telecommunication Networks. • Software Engineering Approaches, Methodologies, Techniques, and Tools. • Applied Data Mining and Warehousing. • Information Strategic Planning and Recourse Management. • Applied Wireless Computing. • Enterprise Resource Planning Systems. • IT Education. • Societal, Cultural, and Ethical Issues of IT. • Policy, Legal and Global Issues of IT. • Enterprise Database Technology.
期刊最新文献
Brain tumor classification using ResNet50-convolutional block attention module Automatic measurement of cardiothoracic ratio in chest x-ray images with ProGAN-generated dataset Cyber threat: its origins and consequence and the use of qualitative and quantitative methods in cyber risk assessment Detecting and staging diabetic retinopathy in retinal images using multi-branch CNN A lightweight deep learning approach to mouth segmentation in color images
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1