Title: Cluster Management of Scientific Literature in HSTOOL
Authors: J. Schubert, U. W. Bolin
DOI: 10.1109/ICMLA55696.2022.00062
Pub Date: 2022-12-01
Venue: 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)

Abstract: In this paper, we extend a methodology for horizon scanning of scientific literature to discover scientific trends. In this methodology, scientific articles within a broadly defined field of research are automatically clustered by topic. We develop a new method that allows an analyst to handle the large number of clusters produced by the automatic clustering. The method is based on estimating an information-theoretical distance between all possible pairs of clusters. Each scientific article has a probability distribution of affiliation over all clusters, arising from the clustering process. Using these distributions, we investigate all possible pairwise mergers of existing clusters and calculate the entropies of the articles' probability distributions after each candidate merger. These entropies are visualized in a dendritic tree and a cluster graph. The merger with minimal total entropy identifies the cluster pair proposed for merging.
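The merge criterion the abstract describes can be sketched as follows. This is an illustrative reading, not code from HSTOOL: the membership-matrix representation and all names are assumptions. Each row of `P` is one article's affiliation distribution over clusters; merging two clusters adds their probabilities, and the pair whose merger minimizes the total entropy over all articles is selected.

```python
import numpy as np
from itertools import combinations

def total_entropy(P):
    """Sum over articles of the Shannon entropy of each article's
    cluster-membership distribution (the rows of P)."""
    with np.errstate(divide="ignore", invalid="ignore"):
        logs = np.where(P > 0, np.log2(np.where(P > 0, P, 1.0)), 0.0)
    return float(-(P * logs).sum())

def best_merge(P):
    """Try every pairwise merger of clusters (columns of P) and return
    the pair whose merger yields the minimal total entropy."""
    n_clusters = P.shape[1]
    best_pair, best_h = None, np.inf
    for i, j in combinations(range(n_clusters), 2):
        merged = np.delete(P, j, axis=1)          # drop column j (i < j, so i keeps its index)
        merged[:, i] = P[:, i] + P[:, j]          # probabilities of the merged cluster add
        h = total_entropy(merged)
        if h < best_h:
            best_pair, best_h = (i, j), h
    return best_pair, best_h
```

Repeatedly applying `best_merge` and recording the entropy at each step would yield the dendritic tree the abstract mentions.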
Title: Uncertainty Prediction for Facial Action Units Recognition under Degraded Conditions
Authors: Junya Saito, Sachihiro Youoku, Ryosuke Kawamura, A. Uchida, Kentaro Murase, Xiaoyue Mi
DOI: 10.1109/ICMLA55696.2022.00069
Pub Date: 2022-12-01
Venue: ICMLA 2022

Abstract: Facial action units (AUs) represent muscular activities, and their recognition from facial images can capture various psychological states, such as people's interests as consumers and their mental health. However, degraded conditions, such as occlusion by a hand, often occur in the real world and reduce the accuracy of AU recognition. Most existing studies on degraded conditions improve robustness by using additional training images and more advanced neural network architectures. Such approaches, however, cannot handle cases in which the evidence of an AU is completely or almost invisible. We therefore propose a novel method that addresses degraded conditions by predicting the uncertainty they introduce into AU recognition. Our method interpolates high-uncertainty data from the surrounding data to reduce the influence of the degraded conditions, and visualizes the conditions causing the uncertainty so that very poor conditions can be identified and improved. For the evaluation experiments, the public datasets BP4D+ and DISFA were artificially degraded for testing. On this modified test data, our method yielded a maximum improvement of 12% on BP4D+ and 17% on DISFA, demonstrating that it can prevent the accuracy loss caused by degraded conditions. We also present visualization examples showing that our method reasonably predicts both the conditions and the uncertainties.
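The "interpolate high-uncertainty data from the surrounding data" step could look roughly like the sketch below. This is an assumption about the mechanism, simplified to a 1-D sequence of per-frame AU scores with per-frame uncertainties; the paper's actual interpolation may differ.

```python
import numpy as np

def interpolate_uncertain(preds, uncert, threshold=0.5):
    """Replace predictions whose uncertainty exceeds `threshold` by linear
    interpolation from the surrounding confident time steps."""
    preds = preds.astype(float).copy()
    confident = uncert <= threshold
    if confident.sum() == 0:
        return preds                      # nothing confident to anchor on
    t = np.arange(len(preds))
    preds[~confident] = np.interp(t[~confident], t[confident], preds[confident])
    return preds
```

For example, an occluded middle frame would be replaced by the average of its confident neighbours.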
Title: Balancing Similarity-Contrast in Unsupervised Representation Learning: Evaluation with Reinforcement Learning
Authors: Menore Tekeba Mengistu, Getachew Alemu, P. Chevaillier, P. D. Loor
DOI: 10.1109/ICMLA55696.2022.00273
Pub Date: 2022-12-01
Venue: ICMLA 2022

Abstract: In this paper, we present an unsupervised contrastive representation learning method that uses contrastive views in which spatial and temporal similarity-contrast are balanced. The balanced views are created by taking pixels from the anchor sample and a randomly selected negative sample, balancing the ratio of pixels drawn from each. These balanced views are paired with the anchor to form the positive contrastive views, while all other samples paired with the anchor serve as negative contrastive views. We evaluate the method on reinforcement learning tasks from Atari games and the DeepMind Control suite (DMControl). Our evaluations on 26 Atari games and six DMControl tasks show that the proposed method is superior at learning the spatio-temporally evolving factors of the environment, capturing the task-relevant generative factors from the agents' raw observations.
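The pixel-mixing that creates a balanced view can be sketched as below. This covers only the spatial mixing suggested by the abstract (the temporal balancing is not specified here), and the random-mask formulation is an assumption.

```python
import numpy as np

def balanced_view(anchor, negative, ratio, rng=None):
    """Create a balanced view: roughly a fraction `ratio` of pixels comes
    from the anchor image, the rest from a randomly chosen negative sample.
    anchor/negative: arrays of shape (H, W, C)."""
    rng = rng or np.random.default_rng()
    mask = rng.random(anchor.shape[:2]) < ratio      # True -> keep anchor pixel
    return np.where(mask[..., None], anchor, negative)
```

With `ratio` near 1 the view stays close to the anchor (an easy positive); lowering it injects more negative content, hardening the contrastive task.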
Title: Using Transparent Neural Networks and Wearable Inertial Sensors to Generate Physiologically-Relevant Insights for Gait
Authors: Lin Zhou, Eric Fischer, C. M. Brahms, U. Granacher, B. Arnrich
DOI: 10.1109/ICMLA55696.2022.00204
Pub Date: 2022-12-01
Venue: ICMLA 2022

Abstract: Neural networks have been successfully applied to a wide range of human motion analysis topics in combination with wearable sensor data. However, their computation process is not readily comprehensible, and many model-interpretation efforts do not provide physiologically relevant insights, which limits their use in clinical settings. In this work, we take gait modifications under fatigue and cognitive task performance as a use case to show how in-depth investigations of neural networks can be performed on wearable sensor data. We collected walking data from 16 young, healthy individuals in unfatigued and fatigued states, under single-task (walking only) and dual-task (walking while concurrently performing a cognitive task) conditions, using inertial measurement units. Convolutional neural networks identified both fatigue and dual-task gait patterns with high classification accuracy. To interpret the models, the importance of each time step in the input time series was visualized using Layer-wise Relevance Propagation. The visualization revealed highly individualized gait changes among participants, as well as changes at precise time steps of the input signal that allow further investigation of potential underlying mechanisms. Our methods enable in-depth analysis of human movement using transparent neural networks and data collected from unobtrusive, mobile wearable sensors.
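The interpretation step relies on Layer-wise Relevance Propagation. As a minimal sketch, here is the standard epsilon rule for a single dense layer — a textbook formulation, not the paper's implementation: output relevance is redistributed to inputs in proportion to each input's contribution to the pre-activation.

```python
import numpy as np

def lrp_dense(a, W, b, R_out, eps=1e-6):
    """Epsilon-rule LRP for one dense layer: redistribute output relevance
    R_out back to the inputs a, proportionally to the contributions
    z_ij = a_i * W_ij to the pre-activations z_j.
    a: (in,), W: (in, out), b: (out,), R_out: (out,)."""
    z = a @ W + b                   # pre-activations, shape (out,)
    z = z + eps * np.sign(z)        # stabiliser avoids division by zero
    s = R_out / z                   # shape (out,)
    return a * (W @ s)              # relevance per input, shape (in,)
```

Applying this layer by layer from the output back to the input yields the per-time-step relevance scores that are visualized over the gait signal. Relevance is (approximately) conserved: the input relevances sum to the output relevance.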
Title: Managing imprecise map and image data in a possibility theory framework
Authors: Khensa Daoudi, Maroua Yamami, S. Benferhat, Lila Méziani
DOI: 10.1109/ICMLA55696.2022.00248
Pub Date: 2022-12-01
Venue: ICMLA 2022

Abstract: The representation and combination of imprecise information is an important topic in many applications. This paper first deals with representing the imprecise positions of objects detected in maps and images of urban networks. In particular, it addresses the combination of uncertain information from different sources to handle inaccuracies in the geographical coordinates of the detected objects. To illustrate the representation and combination modes presented in this paper, we focus on wastewater network data; more precisely, we use manhole detection as our example of object detection. We use two data sources: i) images obtained from Google Street View and ii) maps of the sanitation networks. As the geographical positions of the detected objects are imprecise, we use possibility theory to represent this uncertainty. Possibility theory is particularly suitable for representing qualitative uncertainty, where only the plausibility relation between the candidate geographical positions of a manhole matters. Finally, we propose two aggregation modes, conjunctive and disjunctive, to combine the possibility distributions associated with the detected objects.
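The two aggregation modes named in the abstract have standard definitions in possibility theory, sketched below over discrete candidate positions (the renormalised-minimum form of the conjunctive mode is one common variant; the paper's exact choice may differ).

```python
import numpy as np

def conjunctive(pi1, pi2):
    """Conjunctive combination: pointwise minimum, renormalised so the most
    plausible candidate keeps possibility 1 (assumes both sources are reliable
    and agree at least partially)."""
    combined = np.minimum(pi1, pi2)
    h = combined.max()              # consistency degree between the sources
    if h == 0:
        raise ValueError("fully conflicting sources; use the disjunctive mode")
    return combined / h

def disjunctive(pi1, pi2):
    """Disjunctive combination: pointwise maximum (at least one source is right)."""
    return np.maximum(pi1, pi2)
```

For a manhole, `pi1` might come from the Street View detection and `pi2` from the sanitation map; the conjunctive mode narrows down the position when the sources agree, while the disjunctive mode is the cautious fallback when they conflict.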
Title: Connecting the Semantic Dots: Zero-shot Learning with Self-Aligning Autoencoders and a New Contrastive-Loss for Negative Sampling
Authors: Mohammed Terry-Jack, N. Rozanov
DOI: 10.1109/ICMLA55696.2022.00236
Pub Date: 2022-12-01
Venue: ICMLA 2022

Abstract: We introduce a novel zero-shot learning (ZSL) method, 'self-alignment training', and use it to train a vanilla autoencoder, which we then evaluate on four prominent ZSL tasks: CUB, SUN, AWA1, and AWA2. Despite being a far simpler model than the competition, our method achieves results on par with the state of the art (SOTA). In addition, we present a novel contrastive-loss objective that allows autoencoders to learn from negative samples. In particular, we achieve a new SOTA of 64.5 on AWA2 for generalised ZSL and a new SOTA of 47.7 on SUN for standard ZSL. The code is publicly available at https://github.com/Wluper/satae.
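The abstract does not specify the form of the contrastive-loss objective; as a purely generic illustration of learning from negative samples in an embedding space (not the paper's loss — see the linked repository for that), a margin-based formulation looks like this:

```python
import numpy as np

def contrastive_embedding_loss(anchor, positive, negatives, margin=1.0):
    """Generic margin-based contrastive loss: pull the anchor embedding
    towards the positive, and push it at least `margin` away from each
    negative (hinge on the Euclidean distance)."""
    pull = np.sum((anchor - positive) ** 2)
    push = sum(max(0.0, margin - np.sqrt(np.sum((anchor - n) ** 2))) ** 2
               for n in negatives)
    return pull + push
```

Negatives already farther than the margin contribute nothing, so the objective focuses on separating confusable classes.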
Title: Aspect-based Sentiment Analysis of English and Hindi Opinionated Social Media Texts
Authors: Kavitha Karimbi Mahesh, A. Nishmitha, Gowda Karthik Balgopal, Kausalya K Naik, Mranali Gourish Gaonkar
DOI: 10.1109/ICMLA55696.2022.00235
Pub Date: 2022-12-01
Venue: ICMLA 2022

Abstract: We present a lexicon-based approach for classifying opinionated social media texts in English and Hindi. We further explore the effect of conjunctions, degree modifiers, negations, emojis, and emoticons on scoring the intensity of the expressed opinion. Using a manually built Hindi polarity lexicon, we achieve an accuracy of 86.45% in classifying 2,717 Hindi reviews. A real-time analysis of YouTube reviews showed 86% accuracy on the English review classification task.
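A bare-bones sketch of lexicon-based scoring with negation and degree-modifier handling is shown below. The word lists and weights are illustrative placeholders, and the conjunction/emoji handling the paper also covers is omitted.

```python
NEGATIONS = {"not", "never", "no"}
MODIFIERS = {"very": 1.5, "slightly": 0.5, "extremely": 2.0}  # illustrative weights

def lexicon_score(tokens, polarity):
    """Score a tokenised review against a polarity lexicon, flipping the sign
    after a negation and scaling by degree modifiers."""
    score, weight, negate = 0.0, 1.0, False
    for tok in tokens:
        t = tok.lower()
        if t in NEGATIONS:
            negate = True
        elif t in MODIFIERS:
            weight *= MODIFIERS[t]
        elif t in polarity:
            s = polarity[t] * weight
            score += -s if negate else s
            weight, negate = 1.0, False   # reset state after each sentiment word
    return score
```

A positive total classifies the review as positive, a negative total as negative; the same scorer works for Hindi given a Hindi polarity lexicon and word lists.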
Title: Continuous Human Activity Recognition using Radar Imagery and Dynamic Time Warping
Authors: Ruchita Mehta, V. Palade, S. Sharifzadeh, Bo Tan, Yordanka Karayaneva
DOI: 10.1109/ICMLA55696.2022.00076
Pub Date: 2022-12-01
Venue: ICMLA 2022

Abstract: Remote Human Activity Recognition (HAR) in private residential areas can benefit the lives of the elderly, since this group requires regular health monitoring. This paper addresses the continuous detection of daily human activities using mm-wave Doppler radar. Unlike most previous research, this work records the data as continuous series of activities rather than as individual activities, which better resembles real-life activity patterns. The Dynamic Time Warping (DTW) algorithm, which requires relatively little labelled data, is used to detect human activities in the recorded time series and is compared to other time-series classification methods. The input for DTW was provided using three strategies, whose results were compared against each other. The first approach uses the pixel-level data of the frames (UnSup-PLevel). In the other two strategies, a Convolutional Variational Autoencoder (CVAE) extracts unsupervised encoded features (UnSup-EnLevel) and supervised encoded features (Sup-EnLevel) from the series of Doppler frames. The results demonstrate the superiority of the Sup-EnLevel features over the UnSup-EnLevel and UnSup-PLevel strategies, although the UnSup-PLevel strategy performed surprisingly well without using any annotations.
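For reference, the core of DTW is a short dynamic program; the textbook 1-D version is sketched below (the paper applies it to feature sequences extracted from Doppler frames, which would use a vector distance in place of `abs`).

```python
import numpy as np

def dtw_distance(x, y):
    """Dynamic Time Warping distance between two 1-D sequences: the minimal
    cumulative cost of a monotonic alignment of x against y."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

Because the alignment may stretch or compress time, activities performed at different speeds still match a template closely, which is what makes DTW attractive for continuous activity streams with little labelled data.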
Title: Unsupervised Anomaly Detection and Root Cause Analysis for an Industrial Press Machine based on Skip-Connected Autoencoder
Authors: Chenwei Sun, Martin Trat, Jane Bender, J. Ovtcharova, George Jeppesen, Jan Bär
DOI: 10.1109/ICMLA55696.2022.00113
Pub Date: 2022-12-01
Venue: ICMLA 2022

Abstract: We propose an unsupervised-learning-based method for anomaly detection and root cause analysis on an industrial press machine. A skip-connected autoencoder, which improves reconstruction root-mean-square error by 55% on average over the vanilla variant, is trained on the collected multivariate time-series data under different schemes. We then conduct a stacked evaluation covering both machine-level anomalies, with root cause localization, and anomalies on specific cylinder tracks. Both real-world anomalies and synthetic anomalies embedded in real data are used for evaluation. The results show that the multi-model training scheme and a relatively short window length yield better performance, i.e., fewer anomaly false alarms and misses.
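A minimal sketch of the reconstruction-error scoring this setup implies (the `model` callable and the idea of thresholding at a percentile of scores on normal data are assumptions, not details from the paper):

```python
import numpy as np

def anomaly_scores(model, windows):
    """Per-window reconstruction RMSE for a batch of time-series windows.
    `model` maps windows to reconstructions of the same shape; windows whose
    score exceeds a threshold fitted on normal data are flagged as anomalies."""
    recon = model(windows)
    err = (windows - recon) ** 2
    return np.sqrt(err.mean(axis=tuple(range(1, windows.ndim))))
```

Per-channel RMSE (dropping the mean over the channel axis) would give the track-level view needed for root cause localization on specific cylinders.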
Title: Online Handwriting Recognition using LSTM on Microcontroller and IMU Sensors
Authors: Florian Meissl, F. Eibensteiner, P. Petz, J. Langer
DOI: 10.1109/ICMLA55696.2022.00167
Pub Date: 2022-12-01
Venue: ICMLA 2022

Abstract: The trend toward the Internet of Things has led to a rapid increase in the amount of data that needs to be processed. Artificial intelligence (AI) can serve as a very helpful tool for extracting or compressing the essential information in data. However, AI places high demands on a system's hardware, which does not align well with the strengths of embedded systems. This paper combines AI on embedded systems with the not-yet fully explored subject of online handwriting recognition (HWR). The main contribution is the deployment and real-time operation of AI on a microcontroller (MCU). Model architectures using long short-term memory (LSTM) cells and 1D convolutional neural networks (CNNs) process live data from inertial measurement unit (IMU) sensors. The dataset used for training the AI models was recorded with a self-developed prototype. After training, the models are converted and deployed on the MCU; the conversion includes quantization from a 32-bit floating-point to an 8-bit fixed-point datatype. The TensorFlow Lite Micro (TFLM) framework is used to run inference on the MCU. To enable real-time predictions, optimizations are applied to the framework, making inference approximately 827 times faster. The optimized AI model is then used to classify handwritten characters from the live IMU sensor data. This first approach has shown that separating the symbols is necessary to classify characters from live sensor data with high accuracy.
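The float-to-8-bit step of the conversion can be illustrated with a simple symmetric per-tensor quantizer. This is a generic sketch of the idea, not the exact scheme the TFLM converter applies (which also handles zero-points and per-channel scales):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric 8-bit quantisation: map float32 weights to int8 plus a
    per-tensor scale factor."""
    max_abs = float(np.max(np.abs(w)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values and scale."""
    return q.astype(np.float32) * scale
```

The quantised weights take a quarter of the memory and enable integer arithmetic on the MCU, at the cost of a rounding error bounded by half the scale per weight.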