Intelligent Robot Navigation Based on Human Emotional Model in Human-Aware Environment
T. Obo, Yuto Nakamura
Pub Date: 2019-07-01 | DOI: 10.1109/ICMLC48188.2019.8949247
This paper presents a method for controlling a mobile robot in dynamic environments, based on an investigation of human impressions of the robot's movements. Human-aware robot navigation has been discussed in terms of comfort, naturalness, and sociability. Even if a robot can perform its task and move safely around individuals, people may still feel annoyance and stress about its behavior. The naturalness of a robot's movements in human society is one of the most important topics in human-aware robot navigation. Moreover, such robot systems are required to adaptively decide the priority of behaviors such as collision avoidance, target tracing, and wall following to achieve the navigation objective. In this study, we therefore developed a fuzzy controller to address this issue. We conducted a questionnaire to investigate human impressions of the movements and to model the degree of emotional intensity after a person followed close behind the robot. Moreover, we built a simulated environment to evaluate the performance of the mobile robot in a dynamic environment.
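The abstract does not spell out the rule base; purely as an illustrative sketch of how a fuzzy controller can arbitrate behavior priorities (collision avoidance, target tracing, wall following), the Python snippet below blends hypothetical per-behavior velocity commands with membership weights computed from sensor distances. The membership shapes and commands are assumptions, not the authors' design.

```python
# Illustrative fuzzy arbitration of mobile-robot behaviors.
# Membership shapes and per-behavior commands are assumptions.

def ramp_down(x, lo, hi):
    """1 below lo, 0 above hi, linear in between."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)

def arbitrate(obstacle_dist, target_dist, wall_dist):
    # Degree to which each behavior should dominate (fuzzy memberships).
    w_avoid = ramp_down(obstacle_dist, 0.3, 1.0)          # near obstacle -> avoid
    w_trace = 1.0 - ramp_down(target_dist, 0.5, 2.0)      # far target -> trace
    w_wall = ramp_down(wall_dist, 0.2, 0.8)               # near wall -> follow wall

    # Hypothetical (linear, angular) velocity commands for each behavior.
    cmds = [(0.1, 0.8), (0.5, 0.0), (0.3, -0.2)]
    weights = [w_avoid, w_trace, w_wall]

    total = sum(weights) or 1.0
    v = sum(w * c[0] for w, c in zip(weights, cmds)) / total
    omega = sum(w * c[1] for w, c in zip(weights, cmds)) / total
    return v, omega

print(arbitrate(obstacle_dist=0.4, target_dist=1.5, wall_dist=1.0))
```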
{"title":"Intelligent Robot Navigation Based on Human Emotional Model in Human-Aware Environment","authors":"T. Obo, Yuto Nakamura","doi":"10.1109/ICMLC48188.2019.8949247","DOIUrl":"https://doi.org/10.1109/ICMLC48188.2019.8949247","url":null,"abstract":"This paper presents a method for controlling a mobile robot in dynamic environments, based on an investigation of human's impression of the robot movements. Human-aware robot navigation has been discussed in terms of comfort, naturalness and sociability. Even a robot has a function to perform and move safely around individuals, the people may feel annoyance and stress about the performance. The naturalness of robot's movements in human societies is one of the most important topics for human-aware robot navigation. Moreover, such robot systems are required to adaptively decide the priority of behaviors such as collision avoiding, target tracing, and wall following to achieve the navigation objective. In this study, we therefore developed a fuzzy controller for challenging the above issue. We conducted a questionnaire for investigating human's impression of the movements and modeling the degree of emotional intensity after the person followed close behind the robot. Moreover, we built a simulated environment to evaluate the performance of the mobile robot in a dynamic environment.","PeriodicalId":221349,"journal":{"name":"2019 International Conference on Machine Learning and Cybernetics (ICMLC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115597888","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Quantification Approach of Uterine Peristalsis Propagated From the Cervix and the Fundus
Naziah Tasnim, Fahad Parvez Mahdi, S. Alam, N. Yagi, A. Nakashima, I. Komesu, Yoshimitsu Tokunaga, T. Sakumoto, S. Kobashi
Pub Date: 2019-07-01 | DOI: 10.1109/ICMLC48188.2019.8949268
Uterine peristalsis, which occurs as waves in the uterine region, is one of the fundamental behaviours of the uterus in a non-pregnant woman. There are two types of waves in uterine peristalsis: one propagates from the cervix, and the other from the fundus. Cine MR images can be used to observe this wave-like uterine peristalsis. Hence, the goal of this study is to quantify the number of peristaltic waves propagating from the cervix or the fundus using cine MR images. The proposed method is based on image registration and frequency analysis. It quantifies uterine peristalsis by analyzing the frequency spectrum of the waves at the cervix and at the fundus individually. The correlation coefficient of the number of peristaltic waves between visual inspection and the proposed method was 0.9799 at the cervix and 0.9999 at the fundus. Thus, this study accurately estimated the number of peristaltic waves in order to support the diagnosis of female infertility.
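The abstract gives only the outline of the method; assuming a 1-D displacement or intensity signal has already been extracted at the cervix or the fundus after registration, a rough sketch of the frequency-analysis step is to convert the dominant spectral peak into a wave count over the acquisition window. The sampling rate and frequency band below are assumptions.

```python
# Rough sketch: count peristaltic waves from a 1-D signal extracted at the
# cervix or fundus after image registration. The signal extraction itself and
# the paper's exact spectrum analysis are not shown.
import numpy as np

def count_waves(signal, fs, band=(0.01, 0.1)):
    """Estimate wave count = dominant frequency in `band` times duration."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    spectrum = np.abs(np.fft.rfft(signal))
    mask = (freqs >= band[0]) & (freqs <= band[1])
    if not mask.any():
        return 0.0
    dominant = freqs[mask][np.argmax(spectrum[mask])]
    duration = len(signal) / fs
    return dominant * duration

# Example: a synthetic 0.05 Hz wave sampled at 0.5 Hz for 200 s -> ~10 waves.
t = np.arange(0, 200, 2.0)
print(count_waves(np.sin(2 * np.pi * 0.05 * t), fs=0.5))
```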
{"title":"A Quantification Approach of Uterine Peristalsis Propagated From the Cervix and the Fundus","authors":"Naziah Tasnim, Fahad Parvez Mahdi, S. Alam, N. Yagi, A. Nakashima, I. Komesu, Yoshimitsu Tokunaga, T. Sakumoto, S. Kobashi","doi":"10.1109/ICMLC48188.2019.8949268","DOIUrl":"https://doi.org/10.1109/ICMLC48188.2019.8949268","url":null,"abstract":"Uterine peristalsis, which occurs in waves in the uterine region, is one of the fundamental behaviours of uterus in a non-pregnant woman. There are two types of waves in uterine peristalsis; one propagates from the cervix, and the other one does from the fundus. Cine MR images can investigate the wave like uterine peristalsis. Hence, the goal of this study is to quantify the number of peristalsis propagated from the cervix or the fundus using the cine MR images. The proposed method is based on image registration and frequency analysis. The method quantifies the uterine peristalsis by analyzing the frequency spectrum of waves at the cervix and at the fundus individually. The correlation coefficient of the number of peristalsis between visual inspection and the proposed method was 0.9799 at the cervix, and was 0.9999 at the fundus. Thus, this study accurately estimated the number of peristalsis in order to support the diagnosis of the female infertility.","PeriodicalId":221349,"journal":{"name":"2019 International Conference on Machine Learning and Cybernetics (ICMLC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115762063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An Accelerometer Based Gait Analysis System to Detect Gait Abnormalities in Cerebralspinal Meningitis Patients
Tung-Hua Yu, Chao-Cheng Wu
Pub Date: 2019-07-01 | DOI: 10.1109/ICMLC48188.2019.8949256
This paper proposes a gait analysis system to detect abnormal gaits based on each gait cycle. The proposed system uses a tri-axial accelerometer to collect gait signals in three dimensions. The collected signals were divided into four intervals for each gait cycle: step, swing, stance phase, and stride. Time-domain and time-frequency-domain features were generated for each interval. The Fisher score was then calculated to determine the discrimination ability of each feature, and a Support Vector Machine was trained to classify normal and abnormal gaits based on the selected features with the highest Fisher scores. Cerebralspinal Meningitis (CSM) patients with and without spinal cord edema were used as samples in the experiments. The results demonstrate that the proposed gait analysis system can provide 90% accuracy. The feature subset with the best accuracy includes the kurtosis, crest factor, and mean of the lateral acceleration data in the stride interval. This implies that the force that moves the body left and right during the stride interval is a critical indicator for the diagnosis of spinal cord edema. The proposed gait analysis system could be extended to further symptoms if other sets of training samples become available in the future.
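Fisher-score feature ranking followed by an SVM is a standard pipeline; the sketch below, with placeholder feature values and labels, illustrates the selection-and-classification step described above (the Fisher-score variant shown is the common two-class form, which may differ in detail from the paper's).

```python
# Minimal sketch: rank features by Fisher score, then train an SVM on the
# top-ranked ones. X and y are placeholders for the per-gait-cycle feature
# matrix and the normal/abnormal labels.
import numpy as np
from sklearn.svm import SVC

def fisher_scores(X, y):
    """Two-class Fisher score per feature."""
    X0, X1 = X[y == 0], X[y == 1]
    num = (X0.mean(0) - X1.mean(0)) ** 2
    den = X0.var(0) + X1.var(0) + 1e-12
    return num / den

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 12))            # 40 gait cycles, 12 features
y = np.array([0] * 20 + [1] * 20)
X[y == 1, :3] += 1.5                     # make the first 3 features informative

scores = fisher_scores(X, y)
top = np.argsort(scores)[::-1][:3]       # keep the 3 highest-scoring features
clf = SVC(kernel="rbf").fit(X[:, top], y)
print("selected features:", top, "train accuracy:", clf.score(X[:, top], y))
```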
{"title":"An Accelerometer Based Gait Analysis System to Detect Gait Abnormalities in Cerebralspinal Meningitis Patients","authors":"Tung-Hua Yu, Chao-Cheng Wu","doi":"10.1109/ICMLC48188.2019.8949256","DOIUrl":"https://doi.org/10.1109/ICMLC48188.2019.8949256","url":null,"abstract":"This paper proposed a gait analysis system to detect abnormal gaits based on each gait cycle. The proposed system took advantage of a tri-axial accelerometer to collect the gait signals in three dimensions. The collected signals were divided into four intervals for each gait cycle, including the step, swing, stance phase, and stride. The time domain and time-frequency domain features were generated for each interval. Later, Fisher score was calculated to determine discrimination ability for each feature. Support Vector Machine would be trained for classification of normal and abnormal gaits based on selected features with the highest Fisher scores. Cerebralspinal Meningitis (CSM) patients with/without spinal cord edema were used as samples to conduct the experiments. The results demonstrated that the proposed gait analysis system could provide 90% accuracy. The feature subset with the best accuracy includes kurtosis, crest factor, and mean of lateral acceleration data in stride interval. It implied the force to make the body left and right in stride interval is an critical indicator for diagnosis of spinal cord edema. The proposed gait analysis system could further be extended to more symptoms if other sets of training samples are available in the future.","PeriodicalId":221349,"journal":{"name":"2019 International Conference on Machine Learning and Cybernetics (ICMLC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124260561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Discovering Emotional Logic Rules From Physiological Data of Individuals
N. Costadopoulos, M. Islam, D. Tien
Pub Date: 2019-07-01 | DOI: 10.1109/ICMLC48188.2019.8949274
This paper discusses our work on discovering a set of emotional logic rules derived from the physiological data of individuals, from a wearable technology perspective. We concentrated the analysis on physiological signals that can be detected by wearable sensors, such as plethysmography, respiration, galvanic skin response, and temperature. We sourced our data from the DEAP dataset, a popular labelled affective computing dataset. Our approach applied a fusion of preprocessing and data mining techniques to discover logic rules relating to the valence and arousal emotional dimensions. Our findings indicate that while there are similar changes in heart rate or galvanic skin response across individuals during emotional stimuli, every individual has a unique and quantifiable physiological reaction.
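The abstract does not name the specific mining algorithm; one common way to obtain readable if-then rules from such features (not necessarily the authors' method) is to fit a shallow decision tree and export it as text, as sketched below with placeholder data.

```python
# One common way (not necessarily the paper's method) to obtain readable
# if-then rules: fit a shallow decision tree on physiological features and
# export it as text. Feature values and the valence label are placeholders.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(1)
# Placeholder features: [mean heart rate, mean GSR, respiration rate, temperature]
X = rng.normal(loc=[70, 2.0, 16, 36.5], scale=[8, 0.5, 3, 0.3], size=(60, 4))
y = (X[:, 0] + 10 * X[:, 1] > 92).astype(int)   # synthetic high/low valence label

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["hr", "gsr", "resp", "temp"]))
```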
{"title":"Discovering Emotional Logic Rules From Physiological Data of Individuals","authors":"N. Costadopoulos, M. Islam, D. Tien","doi":"10.1109/ICMLC48188.2019.8949274","DOIUrl":"https://doi.org/10.1109/ICMLC48188.2019.8949274","url":null,"abstract":"This paper discusses our work on discovering a set of emotional logic rules, derived from physiological data of individuals from a wearable technology perspective. We concentrated the analysis on physiological data such as plethysmography, respiration, galvanic skin response, and temperature that can be detected by wearable sensors. We sourced our data from the DEAP dataset, which is a popular labelled Affective Computing dataset. Our approach implemented a fusion of preprocessing and data mining techniques, to discover logic rules relating to the valence and arousal emotional dimensions. Our findings indicate that while there are similar changes in heart rates or galvanic skin response across individuals during emotional stimuli, every individual has a unique and quantifiable physiological reaction.","PeriodicalId":221349,"journal":{"name":"2019 International Conference on Machine Learning and Cybernetics (ICMLC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121126686","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Car Navigation Map Equipped With Speed Recommendation
Yue Xu, Zhimin He, Haozhen Situ, Junjian Su, Yan Zhou
Pub Date: 2019-07-01 | DOI: 10.1109/ICMLC48188.2019.8949266
Traffic lights can cause frequent braking, acceleration, or even sharp braking, which leads to wasted fuel, safety risks, and increased emissions during driving. In this paper, we propose a car navigation map equipped with speed recommendation, which guides the vehicle to arrive at a green light with minimal braking. Based on the location of the vehicle and the traffic signal information, we propose a speed recommendation algorithm that reduces the driver's waiting time at signalized intersections, i.e., it makes the vehicle arrive at the intersection while the traffic light is green. The proposed algorithm allows the driver to pass a signalized intersection without stopping, which makes driving safer, more efficient, and more environmentally friendly. The simulation is conducted with the traffic simulation software Vissim, and the results show the satisfactory performance of the proposed algorithm.
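The abstract does not give the recommendation formula; a minimal sketch of one way such a speed could be computed, assuming the distance to the intersection and the timing of the next green phase are known, is to choose the fastest legal speed whose arrival time falls inside the green window. The signal-timing fields and speed limits below are assumptions.

```python
# Minimal sketch of a green-wave speed recommendation. Signal timing and
# speed limits are assumptions; the paper's actual algorithm may differ.
def recommend_speed(distance_m, time_to_green_s, green_duration_s,
                    v_min=5.0, v_max=16.7):
    """Return a speed (m/s) that lets the car arrive during the green phase,
    or None if no speed in [v_min, v_max] works and the driver must stop."""
    earliest = time_to_green_s                       # green window opens
    latest = time_to_green_s + green_duration_s      # green window closes
    # Speeds needed to arrive exactly at those instants.
    v_for_latest = distance_m / latest if latest > 0 else float("inf")
    v_for_earliest = distance_m / earliest if earliest > 0 else float("inf")
    lo = max(v_min, v_for_latest)
    hi = min(v_max, v_for_earliest)
    if lo > hi:
        return None
    return hi  # prefer the fastest feasible speed to minimize travel time

# 300 m from the light, green starts in 20 s and lasts 30 s -> 15 m/s.
print(recommend_speed(300, 20, 30))
```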
{"title":"A Car Navigation Map Equipped With Speed Recommendation","authors":"Yue Xu, Zhimin He, Haozhen Situ, Junjian Su, Yan Zhou","doi":"10.1109/ICMLC48188.2019.8949266","DOIUrl":"https://doi.org/10.1109/ICMLC48188.2019.8949266","url":null,"abstract":"Traffic lights may cause a lot of braking, acceleration or even sharp braking, which leads to fuel waste, safety risks and increasing emission during driving. In this paper, we proposed a car navigation map equipped with speed recommendation, which guides the vehicle to timely arrival at green light with minimal use of braking. According to the location of the vehicle and the traffic signal information, we proposed a speed recommendation algorithm for driver to reduce the waiting time at the intersection with traffic lights, i.e., making the vehicle arrive at the intersection when the traffic light is green. The proposed algorithm allows the driver pass the intersection with traffic light without stopping. Thus, it makes driving safer, more efficient and environment-friendly. The simulation is conducted on a traffic simulation software named Vissim. Simulation result shows the satisfying performance of the proposed algorithm.","PeriodicalId":221349,"journal":{"name":"2019 International Conference on Machine Learning and Cybernetics (ICMLC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125819750","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Real Life Image Recognition of Panama Disease by an Effective Deep Learning Approach
Cheng-Fa Tsai, Yu-Chieh Chen, Chia-En Tsai
Pub Date: 2019-07-01 | DOI: 10.1109/ICMLC48188.2019.8949269
With the rapid development of information technology, deep learning for numerous applications has become a popular and active research topic. Deep learning, as one of the most remarkable machine learning methods of recent years, has achieved substantial success in applications such as image analysis, speech recognition, and text understanding. It uses supervised and unsupervised strategies to learn multi-level representations and features in hierarchical architectures for classification and image recognition tasks. This research is concerned with real-life image recognition of Panama (banana) disease and optimizes the performance of deep learning techniques for this task. The study is based on a deep learning technique called MResNet (modified ResNet) and modifies the activation function to enhance accuracy, precision, and recall. According to the experimental results, the proposed approach is effective at detecting Panama disease.
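The exact MResNet architecture and modified activation are not specified in the abstract; the Keras sketch below merely illustrates a residual block with a swappable activation (ELU is a placeholder choice, not the paper's).

```python
# Sketch of a residual block with a swappable activation, illustrating the
# "modified activation" idea. The actual MResNet design is not reproduced here.
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters, activation="elu"):
    shortcut = x
    y = layers.Conv2D(filters, 3, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.Activation(activation)(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.BatchNormalization()(y)
    if shortcut.shape[-1] != filters:                      # match channel count
        shortcut = layers.Conv2D(filters, 1, padding="same")(shortcut)
    y = layers.Add()([y, shortcut])
    return layers.Activation(activation)(y)

inputs = tf.keras.Input(shape=(224, 224, 3))
x = layers.Conv2D(32, 3, padding="same", activation="elu")(inputs)
x = residual_block(x, 64)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(2, activation="softmax")(x)         # diseased vs. healthy
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```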
{"title":"Real Life Image Recognition of Panama Disease by an Effective Deep Learning Approach","authors":"Cheng-Fa Tsai, Yu-Chieh Chen, Chia-En Tsai","doi":"10.1109/ICMLC48188.2019.8949269","DOIUrl":"https://doi.org/10.1109/ICMLC48188.2019.8949269","url":null,"abstract":"Because of the rapid development of information technology, the deep learning for numerous applications is a fairly popular and hot research issue currently. Deep learning, as one of the most currently extraordinary machine learning methods, has obtained substantial success in considerable applications such as image analysis, speech recognition and text understanding. It uses supervised and unsupervised strategies to learn multi-level representations and features in hierarchical architectures for the tasks of classification and image recognition. This research is concerned with a real life image recognition for panama (banana) disease which optimizes the performance of deep learning techniques. This study is based on a deep learning technique called MResNet (modified ResNet) and modify activation function to enhance accuracy, precision and recall. According to the experimental results, the proposed approach is fairly effective to detect panama disease.","PeriodicalId":221349,"journal":{"name":"2019 International Conference on Machine Learning and Cybernetics (ICMLC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122651070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stress Assessment for Work Proficiency Analysis by Heart Rate Variability
Momoka Fujimoto, H. Nakajima, Yasuyo Kotake, Danni Wang, Y. Hata
Pub Date: 2019-07-01 | DOI: 10.1109/ICMLC48188.2019.8949285
This paper analyzes electrocardiograms obtained from workers with different proficiency levels and considers a stress index. As examples of simple tasks, we analyzed the process of combining three cases (case combination) and the step of inserting nine parts into a board (DIP insertion). We classified the subjects into beginner and experienced groups with different levels of proficiency, and performed frequency analysis on the electrocardiograms measured during each process. We then calculated the heart beat interval (R-R interval, RRI) from the measurements and obtained the low-frequency (LF) and high-frequency (HF) components by PSD estimation. Moreover, we calculated the ratio LF/HF of sympathetic activity (LF) to parasympathetic activity (HF) and compared it between beginners and experts. As a result, we confirmed that the LF/HF value during work, relative to the resting baseline, was larger for beginners than for experienced workers.
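As a sketch of the LF/HF computation, assuming the R-R intervals have already been extracted from the ECG: resample the RRI series evenly, estimate the power spectral density (Welch's method here, as one common choice that may differ from the paper's), and integrate the conventional LF (0.04-0.15 Hz) and HF (0.15-0.40 Hz) bands.

```python
# Sketch of LF/HF computation from R-R intervals (RRI). R-peak detection from
# the ECG is assumed to be done already.
import numpy as np
from scipy.signal import welch
from scipy.interpolate import interp1d

def band_power(f, psd, lo, hi):
    mask = (f >= lo) & (f < hi)
    return float(np.sum(psd[mask]) * (f[1] - f[0]))   # rectangle-rule integration

def lf_hf_ratio(rri_s, fs_resample=4.0):
    """rri_s: successive R-R intervals in seconds."""
    rri_s = np.asarray(rri_s, dtype=float)
    t = np.cumsum(rri_s)                               # beat occurrence times
    t_even = np.arange(t[0], t[-1], 1.0 / fs_resample)
    rri_even = interp1d(t, rri_s)(t_even)              # evenly resampled tachogram
    f, psd = welch(rri_even - rri_even.mean(), fs=fs_resample, nperseg=256)
    lf = band_power(f, psd, 0.04, 0.15)                # LF band
    hf = band_power(f, psd, 0.15, 0.40)                # HF band
    return lf / hf

# Example with 300 synthetic beats (~0.8 s apart) carrying a slow oscillation.
rng = np.random.default_rng(2)
rri = 0.8 + 0.05 * np.sin(2 * np.pi * 0.1 * np.arange(300)) + 0.01 * rng.normal(size=300)
print(lf_hf_ratio(rri))
```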
{"title":"Stress Assessment for Work Proficiency Analysis by Heart Rate Variability","authors":"Momoka Fujimoto, H. Nakajima, Yasuyo Kotake, Danni Wang, Y. Hata","doi":"10.1109/ICMLC48188.2019.8949285","DOIUrl":"https://doi.org/10.1109/ICMLC48188.2019.8949285","url":null,"abstract":"This paper analyzed the electrocardiograms obtained from workers with different proficient levels and considered the stress index. As an example of a simple work, we analyzed the process of combining three cases (Case combination) and the step of inserting nine parts (DIP insertion) into the foundation. We have classified the subjects as beginners and experienced groups with different levels of proficiency, and performed frequency analysis on electrocardiograms measured during each process. Following that we calculated the heart beat interval time R-R interval (RRI) from the measurement result and calculated low-frequency (LF) and high-frequency (HF) by PSD estimation. Moreover, we calculated the ratio LF/ HF of sympathetic activity (LF) and parasympathetic activity (HF), and compared it with those of beginners and experts. As a result, we confirmed that the value of LF/HF during work based on beginner's resting time was larger than that of experienced person.","PeriodicalId":221349,"journal":{"name":"2019 International Conference on Machine Learning and Cybernetics (ICMLC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122845258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An Experimental Study on the Effectiveness of Artificial Neural Network-Based Stock Index Prediction
Y. Tsai, Qiangfu Zhao
Pub Date: 2019-07-01 | DOI: 10.1109/ICMLC48188.2019.8949282
Artificial Neural Networks (ANNs) are a promising tool for solving many recognition problems and have been a popular choice for researchers during the last decade. Machine learning tools such as the Multi-Layer Perceptron (MLP) have proven effective in solving classification problems. Long Short-Term Memory (LSTM) has been deemed the state of the art of the ANN family and is specialized in tracking time-series data. The capability of LSTM as a powerful tool for making profit has been reported, along with its reputation for stock market prediction. In this study, Keras was used as the neural network library on top of TensorFlow as the machine learning backend, with the Dow Jones Index (DJI) as the data source for the MLP and LSTM analyses. Our experimental results reveal that the prediction ability of MLP and LSTM is of similar accuracy to the benchmark when only the trading price and volume are provided as input data. This paper further discusses some difficulties in training MLP and LSTM that may have prevented the systems from reaching their expected potential.
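A minimal sketch of the LSTM side of such an experiment, with price and volume as the only inputs; the window length, layer sizes, and next-day-close target are assumptions rather than the paper's exact setup, and the data below are placeholders standing in for the DJI series.

```python
# Minimal sketch of an LSTM regressor on price/volume windows.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

def make_windows(prices, volumes, window=20):
    """Slice the two series into (window, 2) inputs and next-close targets."""
    X, y = [], []
    for i in range(len(prices) - window):
        X.append(np.column_stack([prices[i:i + window], volumes[i:i + window]]))
        y.append(prices[i + window])
    return np.array(X), np.array(y)

# Placeholder series standing in for DJI daily closing price and volume.
rng = np.random.default_rng(3)
prices = np.cumsum(rng.normal(0.0, 1.0, 1000)) + 100.0
volumes = rng.uniform(0.5, 1.5, 1000)
X, y = make_windows(prices, volumes)

inputs = tf.keras.Input(shape=(20, 2))
x = layers.LSTM(32)(inputs)
outputs = layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
print("in-sample MSE:", model.evaluate(X, y, verbose=0))
```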
{"title":"An Experimental Study on the Effectiveness of Artificial Neural Network-Based Stock Index Prediction","authors":"Y. Tsai, Qiangfu Zhao","doi":"10.1109/ICMLC48188.2019.8949282","DOIUrl":"https://doi.org/10.1109/ICMLC48188.2019.8949282","url":null,"abstract":"Artificial Neural Network (ANN) is a promising tool for solving many recognition problems and has been a popular choice for researchers during the last decade. Machine learning tools such as Multi-Layer Perceptron (MLP) have proven effective in solving classification problems. Long Short Term Memory (LSTM) has been deemed to be the state of the art of the ANN family, which is specialized in tracking time series related data. The capability of LSTM as a powerful tool for making profit has been reported, along with its reputation for stock market prediction. In this study, Keras was used as a neural network library on top of Tensorflow as a machine learning backend using the Dow Jones Index (DJI) as the data source for the MLP and LSTM analyses. Our experimental results reveal that the prediction ability of MLP and LSTM possesses similar accuracy to the benchmark when providing only trading price and volume as the input data. This paper further discusses some difficulties in training MLP and LSTM that may have reduced the system capability to reach its expected potential.","PeriodicalId":221349,"journal":{"name":"2019 International Conference on Machine Learning and Cybernetics (ICMLC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128322161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Human Face Sentiment Classification Using Synthetic Sentiment Images with Deep Convolutional Neural Networks
Chen-Chun Huang, Yi-Leh Wu, Cheng-Yuan Tang
Pub Date: 2019-07-01 | DOI: 10.1109/ICMLC48188.2019.8949240
Images are one of the most important ways for users to express their emotions on social networks. In this paper, we use deep convolutional neural networks to solve the problem of image sentiment analysis from visual content. Training a neural network requires a large dataset to achieve good performance, but a large training set of real human emotion images is hard to obtain: emotions are subjective, and multiple people need to annotate each image, which requires a lot of manpower. This study proposes to incorporate synthetic face images into the training set to substantially increase its size. We compare training sets consisting of only synthetic face images, only real facial images, and mixtures of synthetic and real facial images. Our experiments show that using only 4,026 real images, with each class supplemented by synthetic images up to the same size (Anger: 1,063 synthetic + 937 real; Disgust: 1,857 synthetic + 143 real; Fear: 1,802 synthetic + 198 real; Happy: 2,000 real; Sad: 1,252 synthetic + 748 real) for a total of 10,000 images, the classifier can reach 87.79%, 74.19%, 86.99%, and 79.80% average testing accuracy on each testing set in human face sentiment classification.
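A brief sketch of how each emotion class can be topped up with synthetic images to a fixed size, as in the mixed training sets described above; the directory layout and the 2,000-per-class target are assumptions.

```python
# Sketch of topping up each emotion class with synthetic face images so every
# class reaches the same size. Directory layout and target count are assumptions.
import random
from pathlib import Path

def build_mixed_class(real_dir, synth_dir, target=2000, seed=0):
    real = sorted(Path(real_dir).glob("*.jpg"))
    synth = sorted(Path(synth_dir).glob("*.jpg"))
    random.Random(seed).shuffle(synth)
    needed = max(0, target - len(real))
    return list(real) + synth[:needed]     # keep all real images, fill with synthetic

# e.g. Anger: 937 real images + 1063 synthetic fill-ins -> 2000 total
# files = build_mixed_class("data/real/anger", "data/synth/anger")
```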
{"title":"Human Face Sentiment Classification Using Synthetic Sentiment Images with Deep Convolutional Neural Networks","authors":"Chen-Chun Huang, Yi-Leh Wu, Cheng-Yuan Tang","doi":"10.1109/ICMLC48188.2019.8949240","DOIUrl":"https://doi.org/10.1109/ICMLC48188.2019.8949240","url":null,"abstract":"Image is one of the most important ways for users to express their emotions on social networks. In this paper, we use the deep convolutional neural networks to solve the problem of image sentiment analysis from visual content. Because training a neural network requires a large number of data sets to provide good training performance, we cannot obtain such a real human emotion training set, because emotions are subjective, and multiple people need to provide annotations for the images, which requires a lot of manpower. This study proposes to incorporate synthetic face images into the training set to substantially increase the size of the training set. We use only synthetic face images, real facial images, and mixtures of synthetic and real facial images in the training set. Our experiments show that by using only 4026 real images, where each image is supplemented by the synthetic image to the same data set size (Anger: 1063 + 937 true, Disgust: 1857 + 143 true, Fear: 1802 + 198 true, Happy: 2000 true, Sad: 1252 + 748 true) total of 10,000 images, can reach 87.79%, 74.19%, 86.99%, 79.80% average testing accuracy in each testing set in human face sentiment classification.","PeriodicalId":221349,"journal":{"name":"2019 International Conference on Machine Learning and Cybernetics (ICMLC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133109680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Study on Machine Learning-Based Image Identification Towards Assistive Automation of Commentary on Animation Characters
Yutaka Yoshino, Kazuki Nakada, M. Kobayashi, H. Tatsumi
Pub Date: 2019-07-01 | DOI: 10.1109/ICMLC48188.2019.8949258
This study aims to assist visually impaired people as well as animation novices by focusing on problems that arise when viewing animation videos and images. We focus on the following problems: (1) difficulty in understanding behaviors and situations, (2) difficulty in discriminating animation characters, and (3) confusion caused by animation characters that look similar. As a preliminary verification, we use deep neural networks to identify animation characters, training a customized convolutional neural network (CNN) from scratch on a small number of classes based on our original database of animation characters. The results show that some combinations of characters are difficult to discriminate in cross-validation. To resolve this problem, we performed transfer learning based on CNN variants pre-trained on the natural image database ImageNet. We confirmed that learning proceeded steadily with a gradual learning curve, resulting in high accuracy. The results indicate that the bottleneck features of CNN variants pre-trained on ImageNet are effective in identifying animation characters. Furthermore, we verified the inference speed of the trained CNN on a microcomputer board with an Intel Movidius machine learning accelerator and confirmed that the speed is sufficient for real-time execution.
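A minimal sketch of the transfer-learning step: an ImageNet-pretrained backbone is frozen and a small classification head is trained on its bottleneck features. The backbone (MobileNetV2), input size, and 10-class output are assumptions, not the paper's exact configuration.

```python
# Minimal transfer-learning sketch for character identification: freeze an
# ImageNet-pretrained backbone and train a small head on its bottleneck features.
import tensorflow as tf
from tensorflow.keras import layers

base = tf.keras.applications.MobileNetV2(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3))
base.trainable = False                                  # fixed feature extractor

inputs = tf.keras.Input(shape=(224, 224, 3))
x = tf.keras.applications.mobilenet_v2.preprocess_input(inputs)
x = base(x, training=False)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dropout(0.3)(x)
outputs = layers.Dense(10, activation="softmax")(x)     # 10 character classes

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(train_ds, validation_data=val_ds, epochs=10)  # datasets assumed
```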
{"title":"A Study on Machine Learning-Based Image Identification Towards Assitive Automation of Commentary on Animation Characters","authors":"Yutaka Yoshino, Kazuki Nakada, M. Kobayashi, H. Tatsumi","doi":"10.1109/ICMLC48188.2019.8949258","DOIUrl":"https://doi.org/10.1109/ICMLC48188.2019.8949258","url":null,"abstract":"This study aims to assist visually impaired people as well as animation novices by focusing on problems that arise at the time of viewing animation videos and images. We focus on the following problems: (1) difficulty of understanding behaviors and situations, (2) difficulty of discriminating animation characters, and (3) confusion caused by animation characters with similarities. We use deep neural networks to identify animation characters as preliminary verification by training a customized convolutional neural network (CNN) from scratch on a small class of data based on the original database of animation characters. The results show that some combinations of characters are difficult to discriminate in cross validation. To resolve this problem, we performed transfer learning based on the CNN variants pre-trained on the natural image database ImageNet. We confirmed that the learning proceeded steadily with a gradual learning curve, resulting in high accuracy. The results indicate that the bottleneck features of the CNN variants pre-trained on ImageNet are effective in identifying animation characters. Furthermore, we verified the operation speed of the inference of our trained CNN on a microcomputer board with a machine learning accelerator Intel Movidius and confirmed that the speed is sufficient in real-time execution.","PeriodicalId":221349,"journal":{"name":"2019 International Conference on Machine Learning and Cybernetics (ICMLC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131869643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}