{"title":"An Algorithm for Auto-threshold for Mouth ROI","authors":"Shilpa Sonawane, P. Malathi, B.B. Musmade","doi":"10.1109/PuneCon55413.2022.10014872","DOIUrl":null,"url":null,"abstract":"Lip reading technology is best possible solution of speech recognition in noisy environments. Lip reading is a methodology to interpret by lip movement without the involvement of audio. The accuracy of lip-reading technology is based on accurate mouth region of interest (ROI). Viola Jones algorithm is used for mouth region extraction. The accuracy by viola jones is affected by merge threshold parameter of cascade object detector. Due to incorrect threshold multiple bounding boxes appears for mouth ROI. The correct selection of merge threshold leads to single bounding box on mouth region. The technique to find appropriate threshold to extract mouth ROI is presented in this paper. The algorithm is applied on GRID and LRW dataset. Experiment is tested on both frontal and profile face video frames. The accuracy obtained on frontal face frames from GRID dataset is 100 % while 86.20% accuracy achieved with profile video frames from LRW dataset.","PeriodicalId":258640,"journal":{"name":"2022 IEEE Pune Section International Conference (PuneCon)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE Pune Section International Conference (PuneCon)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PuneCon55413.2022.10014872","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Lip-reading technology is the best possible solution for speech recognition in noisy environments. Lip reading interprets speech from lip movement alone, without using audio. Its accuracy depends on accurate extraction of the mouth region of interest (ROI). The Viola-Jones algorithm is used for mouth region extraction, and its accuracy is affected by the merge threshold parameter of the cascade object detector. With an incorrect threshold, multiple bounding boxes appear for the mouth ROI, whereas a correctly selected merge threshold yields a single bounding box on the mouth region. This paper presents a technique for finding an appropriate threshold to extract the mouth ROI. The algorithm is applied to the GRID and LRW datasets, and experiments are conducted on both frontal and profile face video frames. The accuracy obtained on frontal face frames from the GRID dataset is 100%, while 86.20% accuracy is achieved on profile video frames from the LRW dataset.
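The abstract does not spell out how the threshold is searched, so the following is only a minimal sketch of one plausible reading: raise the merge threshold step by step until the detector returns exactly one bounding box. The names `find_merge_threshold` and `fake_detect`, the search range, and the assumption that a higher threshold only removes detections are all hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)
Detector = Callable[[object, int], List[Box]]


def find_merge_threshold(detect: Detector, frame: object,
                         start: int = 1, stop: int = 200,
                         step: int = 1) -> Optional[Tuple[int, Box]]:
    """Return (threshold, box) for the smallest threshold giving exactly one box.

    Returns None if no threshold in [start, stop] yields a single detection.
    """
    for threshold in range(start, stop + 1, step):
        boxes = detect(frame, threshold)
        if len(boxes) == 1:
            return threshold, boxes[0]
        if not boxes:
            # Assumption: a higher threshold only removes detections,
            # so once none are left there is no point in searching further.
            break
    return None


def fake_detect(frame: object, threshold: int) -> List[Box]:
    """Hypothetical stand-in detector used only to exercise the search.

    Each candidate box has a support count and survives while that
    count is at least the threshold.
    """
    candidates = [
        ((40, 60, 30, 18), 25),  # true mouth region, strongly supported
        ((10, 12, 20, 10), 6),   # false positive
        ((70, 15, 22, 11), 3),   # false positive
    ]
    return [box for box, support in candidates if support >= threshold]


if __name__ == "__main__":
    print(find_merge_threshold(fake_detect, None))
```

In practice `detect` would wrap a real cascade detector. For example, the `minNeighbors` argument of OpenCV's `cv2.CascadeClassifier.detectMultiScale` plays a role comparable to a merge threshold, but whether that matches the authors' setup is an assumption.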