Calibration-free and deep-learning-based customer gaze direction detection technology based on the YOLOv3-tiny model for smart advertising displays
Wei-Liang Ou, Yu-Hsiu Cheng, Chin-Chieh Chang, Hua-Luen Chen, Chih-Peng Fan
Journal of the Chinese Institute of Engineers, published 2023-10-10. DOI: 10.1080/02533839.2023.2262724
Abstract
Because of the COVID-19 pandemic, gaze tracking for nontouch user interface designs used in advertising displays or automatic vending machines has become an emerging research topic. In this study, a cost-effective deep-learning-based customer gaze direction detection technology was developed for a smart advertising display. To achieve calibration-free interactions between customers and displays, the You-Only-Look-Once (YOLO)-v3-tiny-based deep learning model was used to determine the bounding boxes of eyes and pupils. Next, postprocessing was conducted using a voting mechanism and difference vectors between the central coordinates of the bounding boxes to effectively predict customer gaze directions. Product images were separated into two or four gaze zones. For cross-person testing, the Recall, Precision, Accuracy, and F1-score for two gaze zones were approximately 77%, 99%, 88%, and 87%, respectively, and those for four gaze zones were approximately 72%, 91%, 91%, and 79%, respectively. Software implementations on NVIDIA graphics-processing-unit-accelerated embedded platforms exhibited a frame rate of nearly 30 frames per second. The proposed design achieved real-time gaze direction detection for a smart advertising platform.

CO EDITOR-IN-CHIEF: Yuan, Shyan-Ming
ASSOCIATE EDITOR: Yuan, Shyan-Ming

KEYWORDS: Deep learning; YOLOv3-tiny; intelligent systems; smart displays; nontouch user interface design; gaze direction detection; calibration-free

Nomenclature
UL = the gaze state estimated at the upper-left direction
UR = the gaze state estimated at the upper-right direction
DL = the gaze state estimated at the down-left direction
DR = the gaze state estimated at the down-right direction
C_pupil = the central coordinate position of the right or left pupil
C_eye = the central coordinate position of the right or left eye
V_d = the difference vector between two central coordinate positions
X1 = the X-axis central coordinate of the pupil's bounding box
Y1 = the Y-axis central coordinate of the pupil's bounding box
X2 = the X-axis central coordinate of the eye's bounding box
Y2 = the Y-axis central coordinate of the eye's bounding box
TN = the number of true negative cases
TP = the number of true positive cases
FN = the number of false negative cases
FP = the number of false positive cases
F1 Score = a measure of a test's accuracy, computed as 2 × Precision × Recall / (Precision + Recall)
mAP = mean average precision, a metric used to measure the performance of models on object detection tasks

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
This work was financially supported by the Ministry of Science and Technology (MOST) under Grant No. [109-2218-E-005-008].
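As a minimal illustration of the postprocessing described in the abstract, the sketch below computes the difference vector V_d = (X1 - X2, Y1 - Y2) between the pupil and eye bounding-box centers, maps it to one of the four gaze states (UL, UR, DL, DR), and applies a simple majority vote over a few frames. This is an assumed reading of the nomenclature, not the authors' implementation; the function names, the sign convention (image coordinates, Y increasing downward), and the voting window are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def gaze_state(x1: float, y1: float, x2: float, y2: float) -> str:
    """Map the difference vector V_d = (X1 - X2, Y1 - Y2) between the pupil
    center (X1, Y1) and the eye center (X2, Y2) to one of four gaze states.

    Assumes image coordinates, so a positive Y component points downward.
    """
    dx, dy = x1 - x2, y1 - y2            # difference vector V_d
    horizontal = "L" if dx < 0 else "R"  # pupil left or right of the eye center
    vertical = "U" if dy < 0 else "D"    # pupil above or below the eye center
    return vertical + horizontal         # "UL", "UR", "DL", or "DR"


def vote_gaze(per_frame_states: List[str]) -> str:
    """Majority vote over per-frame gaze states (hypothetical voting window;
    the paper's exact voting mechanism may differ)."""
    return Counter(per_frame_states).most_common(1)[0][0]


# Example: (X1, Y1, X2, Y2) for five consecutive frames.
frames: List[Tuple[float, float, float, float]] = [
    (98.0, 52.0, 100.0, 55.0),   # pupil up-left of eye center -> "UL"
    (97.5, 53.0, 100.0, 55.0),   # "UL"
    (101.0, 53.0, 100.0, 55.0),  # "UR"
    (98.0, 52.5, 100.0, 55.0),   # "UL"
    (98.5, 54.0, 100.0, 55.0),   # "UL"
]
states = [gaze_state(*f) for f in frames]
print(states, "->", vote_gaze(states))   # majority vote yields "UL"
```

For a two-zone configuration, only the horizontal component of V_d would be needed; the four-zone case above uses both components.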
Journal introduction:
Encompassing a wide range of engineering disciplines and industrial applications, JCIE includes the following topics:
1. Chemical engineering
2. Civil engineering
3. Computer engineering
4. Electrical engineering
5. Electronics
6. Mechanical engineering
and fields related to the above.