Pub Date : 2023-01-01DOI: 10.5220/0011672300003411
Yichun Li, Mina Maleki, Shadi Banitaan, Ming-Jie Chen
: In order to maintain the Li-ion batteries in a safe operating state and to optimize their performance, a precise estimation of the state of health (SOH), which indicates the degradation level of the Li-ion batteries, has to be taken into consideration urgently. In this paper, we present a regression machine learning framework that combines a convolutional neural network (CNN) with the Nyquist plot of Electrochemical Impedance Spectroscopy (EIS) as features to estimate the SOH of Li-ion batteries with a considerable improvement in the accuracy of SOH estimation. The results indicate that the Nyquist plot based on EIS features provides more detailed information regarding battery aging than simple impedance values due to its ability to reflect impedance change over time. Furthermore, convolutional layers in the CNN model were more effective in extracting different levels of features and characterizing the degradation patterns of Li-ion batteries from EIS measurement data than using simple impedance values with a DNN model, as well as other traditional machine learning methods, such as Gaussian process regression (GPR) and support vector machine (SVM).
{"title":"State of Health Estimation of Lithium-ion Batteries Using Convolutional Neural Network with Impedance Nyquist Plots","authors":"Yichun Li, Mina Maleki, Shadi Banitaan, Ming-Jie Chen","doi":"10.5220/0011672300003411","DOIUrl":"https://doi.org/10.5220/0011672300003411","url":null,"abstract":": In order to maintain the Li-ion batteries in a safe operating state and to optimize their performance, a precise estimation of the state of health (SOH), which indicates the degradation level of the Li-ion batteries, has to be taken into consideration urgently. In this paper, we present a regression machine learning framework that combines a convolutional neural network (CNN) with the Nyquist plot of Electrochemical Impedance Spectroscopy (EIS) as features to estimate the SOH of Li-ion batteries with a considerable improvement in the accuracy of SOH estimation. The results indicate that the Nyquist plot based on EIS features provides more detailed information regarding battery aging than simple impedance values due to its ability to reflect impedance change over time. Furthermore, convolutional layers in the CNN model were more effective in extracting different levels of features and characterizing the degradation patterns of Li-ion batteries from EIS measurement data than using simple impedance values with a DNN model, as well as other traditional machine learning methods, such as Gaussian process regression (GPR) and support vector machine (SVM).","PeriodicalId":410036,"journal":{"name":"International Conference on Pattern Recognition Applications and Methods","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128013382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-12-19DOI: 10.48550/arXiv.2212.09517
Frederik Hasecke, P. Colling, A. Kummert
Segmentation of lidar data is a task that provides rich, point-wise information about the environment of robots or autonomous vehicles. Currently best performing neural networks for lidar segmentation are fine-tuned to specific datasets. Switching the lidar sensor without retraining on a big set of annotated data from the new sensor creates a domain shift, which causes the network performance to drop drastically. In this work we propose a new method for lidar domain adaption, in which we use annotated panoptic lidar datasets and recreate the recorded scenes in the structure of a different lidar sensor. We narrow the domain gap to the target data by recreating panoptic data from one domain in another and mixing the generated data with parts of (pseudo) labeled target domain data. Our method improves the nuScenes to SemanticKITTI unsupervised domain adaptation performance by 15.2 mean Intersection over Union points (mIoU) and by 48.3 mIoU in our semi-supervised approach. We demonstrate a similar improvement for the SemanticKITTI to nuScenes domain adaptation by 21.8 mIoU and 51.5 mIoU, respectively. We compare our method with two state of the art approaches for semantic lidar segmentation domain adaptation with a significant improvement for unsupervised and semi-supervised domain adaptation. Furthermore we successfully apply our proposed method to two entirely unlabeled datasets of two state of the art lidar sensors Velodyne Alpha Prime and InnovizTwo, and train well performing semantic segmentation networks for both.
激光雷达数据的分割是一项任务,可以提供关于机器人或自动驾驶汽车环境的丰富的、逐点的信息。目前表现最好的激光雷达分割神经网络是针对特定的数据集进行微调的。切换激光雷达传感器而不重新训练来自新传感器的大量注释数据会产生域移位,从而导致网络性能急剧下降。在这项工作中,我们提出了一种新的激光雷达域自适应方法,在该方法中,我们使用带注释的全光激光雷达数据集,并在不同的激光雷达传感器结构中重建记录的场景。我们通过从一个领域在另一个领域重新创建全景数据,并将生成的数据与部分(伪)标记的目标领域数据混合,来缩小与目标数据的领域差距。我们的方法将nuScenes到SemanticKITTI的无监督域自适应性能提高了15.2个平均相交点(Intersection over Union points, mIoU),而我们的半监督方法将nuScenes到SemanticKITTI的无监督域自适应性能提高了48.3个mIoU。我们展示了SemanticKITTI对nuScenes域的适应性的类似改进,分别提高了21.8 mIoU和51.5 mIoU。我们将我们的方法与语义激光雷达分割域自适应的两种最新方法进行了比较,在无监督和半监督域自适应方面有了显着改进。此外,我们成功地将我们提出的方法应用于两种最先进的激光雷达传感器Velodyne Alpha Prime和InnovizTwo的完全未标记数据集,并为两者训练了性能良好的语义分割网络。
{"title":"Fake it, Mix it, Segment it: Bridging the Domain Gap Between Lidar Sensors","authors":"Frederik Hasecke, P. Colling, A. Kummert","doi":"10.48550/arXiv.2212.09517","DOIUrl":"https://doi.org/10.48550/arXiv.2212.09517","url":null,"abstract":"Segmentation of lidar data is a task that provides rich, point-wise information about the environment of robots or autonomous vehicles. Currently best performing neural networks for lidar segmentation are fine-tuned to specific datasets. Switching the lidar sensor without retraining on a big set of annotated data from the new sensor creates a domain shift, which causes the network performance to drop drastically. In this work we propose a new method for lidar domain adaption, in which we use annotated panoptic lidar datasets and recreate the recorded scenes in the structure of a different lidar sensor. We narrow the domain gap to the target data by recreating panoptic data from one domain in another and mixing the generated data with parts of (pseudo) labeled target domain data. Our method improves the nuScenes to SemanticKITTI unsupervised domain adaptation performance by 15.2 mean Intersection over Union points (mIoU) and by 48.3 mIoU in our semi-supervised approach. We demonstrate a similar improvement for the SemanticKITTI to nuScenes domain adaptation by 21.8 mIoU and 51.5 mIoU, respectively. We compare our method with two state of the art approaches for semantic lidar segmentation domain adaptation with a significant improvement for unsupervised and semi-supervised domain adaptation. Furthermore we successfully apply our proposed method to two entirely unlabeled datasets of two state of the art lidar sensors Velodyne Alpha Prime and InnovizTwo, and train well performing semantic segmentation networks for both.","PeriodicalId":410036,"journal":{"name":"International Conference on Pattern Recognition Applications and Methods","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124423040","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-12-15DOI: 10.48550/arXiv.2212.07671
Sravan Kumar Jagadeesh, René Schuster, D. Stricker
In this paper, we introduce a novel network that generates semantic, instance, and part segmentation using a shared encoder and effectively fuses them to achieve panoptic-part segmentation. Unifying these three segmentation problems allows for mutually improved and consistent representation learning. To fuse the predictions of all three heads efficiently, we introduce a parameter-free joint fusion module that dynamically balances the logits and fuses them to create panoptic-part segmentation. Our method is evaluated on the Cityscapes Panoptic Parts (CPP) and Pascal Panoptic Parts (PPP) datasets. For CPP, the PartPQ of our proposed model with joint fusion surpasses the previous state-of-the-art by 1.6 and 4.7 percentage points for all areas and segments with parts, respectively. On PPP, our joint fusion outperforms a model using the previous top-down merging strategy by 3.3 percentage points in PartPQ and 10.5 percentage points in PartPQ for partitionable classes.
在本文中,我们介绍了一种新的网络,该网络使用共享编码器生成语义、实例和部分分割,并有效地融合它们以实现全景部分分割。统一这三个分割问题允许相互改进和一致的表示学习。为了有效地融合所有三个头部的预测,我们引入了一个无参数的关节融合模块,该模块动态平衡逻辑并融合它们以创建全景部分分割。我们的方法在cityscape Panoptic Parts (CPP)和Pascal Panoptic Parts (PPP)数据集上进行了评估。对于CPP,我们提出的具有关节融合的模型的PartPQ在所有有零件的区域和部分上分别超过了以前最先进的1.6和4.7个百分点。在PPP上,我们的联合融合在PartPQ上比使用以前的自顶向下合并策略的模型高出3.3个百分点,在可分区类上比PartPQ高出10.5个百分点。
{"title":"Multi-task Fusion for Efficient Panoptic-Part Segmentation","authors":"Sravan Kumar Jagadeesh, René Schuster, D. Stricker","doi":"10.48550/arXiv.2212.07671","DOIUrl":"https://doi.org/10.48550/arXiv.2212.07671","url":null,"abstract":"In this paper, we introduce a novel network that generates semantic, instance, and part segmentation using a shared encoder and effectively fuses them to achieve panoptic-part segmentation. Unifying these three segmentation problems allows for mutually improved and consistent representation learning. To fuse the predictions of all three heads efficiently, we introduce a parameter-free joint fusion module that dynamically balances the logits and fuses them to create panoptic-part segmentation. Our method is evaluated on the Cityscapes Panoptic Parts (CPP) and Pascal Panoptic Parts (PPP) datasets. For CPP, the PartPQ of our proposed model with joint fusion surpasses the previous state-of-the-art by 1.6 and 4.7 percentage points for all areas and segments with parts, respectively. On PPP, our joint fusion outperforms a model using the previous top-down merging strategy by 3.3 percentage points in PartPQ and 10.5 percentage points in PartPQ for partitionable classes.","PeriodicalId":410036,"journal":{"name":"International Conference on Pattern Recognition Applications and Methods","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123938766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-12-09DOI: 10.48550/arXiv.2212.04786
O. Zell, Joel Pålsson, Kevin Hernandez-Diaz, F. Alonso-Fernandez, Felix Nilsson
Fires have destructive power when they break out and affect their surroundings on a devastatingly large scale. The best way to minimize their damage is to detect the fire as quickly as possible before it has a chance to grow. Accordingly, this work looks into the potential of AI to detect and recognize fires and reduce detection time using object detection on an image stream. Object detection has made giant leaps in speed and accuracy over the last six years, making real-time detection feasible. To our end, we collected and labeled appropriate data from several public sources, which have been used to train and evaluate several models based on the popular YOLOv4 object detector. Our focus, driven by a collaborating industrial partner, is to implement our system in an industrial warehouse setting, which is characterized by high ceilings. A drawback of traditional smoke detectors in this setup is that the smoke has to rise to a sufficient height. The AI models brought forward in this research managed to outperform these detectors by a significant amount of time, providing precious anticipation that could help to minimize the effects of fires further.
{"title":"Image-Based Fire Detection in Industrial Environments with YOLOv4","authors":"O. Zell, Joel Pålsson, Kevin Hernandez-Diaz, F. Alonso-Fernandez, Felix Nilsson","doi":"10.48550/arXiv.2212.04786","DOIUrl":"https://doi.org/10.48550/arXiv.2212.04786","url":null,"abstract":"Fires have destructive power when they break out and affect their surroundings on a devastatingly large scale. The best way to minimize their damage is to detect the fire as quickly as possible before it has a chance to grow. Accordingly, this work looks into the potential of AI to detect and recognize fires and reduce detection time using object detection on an image stream. Object detection has made giant leaps in speed and accuracy over the last six years, making real-time detection feasible. To our end, we collected and labeled appropriate data from several public sources, which have been used to train and evaluate several models based on the popular YOLOv4 object detector. Our focus, driven by a collaborating industrial partner, is to implement our system in an industrial warehouse setting, which is characterized by high ceilings. A drawback of traditional smoke detectors in this setup is that the smoke has to rise to a sufficient height. The AI models brought forward in this research managed to outperform these detectors by a significant amount of time, providing precious anticipation that could help to minimize the effects of fires further.","PeriodicalId":410036,"journal":{"name":"International Conference on Pattern Recognition Applications and Methods","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130697742","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-12-09DOI: 10.48550/arXiv.2212.04790
August Baaz, Yonan Yonan, Kevin Hernandez-Diaz, F. Alonso-Fernandez, Felix Nilsson
One of the biggest challenges in machine learning is data collection. Training data is an important part since it determines how the model will behave. In object classification, capturing a large number of images per object and in different conditions is not always possible and can be very time-consuming and tedious. Accordingly, this work explores the creation of artificial images using a game engine to cope with limited data in the training dataset. We combine real and synthetic data to train the object classification engine, a strategy that has shown to be beneficial to increase confidence in the decisions made by the classifier, which is often critical in industrial setups. To combine real and synthetic data, we first train the classifier on a massive amount of synthetic data, and then we fine-tune it on real images. Another important result is that the amount of real images needed for fine-tuning is not very high, reaching top accuracy with just 12 or 24 images per class. This substantially reduces the requirements of capturing a great amount of real data.
{"title":"Synthetic Data for Object Classification in Industrial Applications","authors":"August Baaz, Yonan Yonan, Kevin Hernandez-Diaz, F. Alonso-Fernandez, Felix Nilsson","doi":"10.48550/arXiv.2212.04790","DOIUrl":"https://doi.org/10.48550/arXiv.2212.04790","url":null,"abstract":"One of the biggest challenges in machine learning is data collection. Training data is an important part since it determines how the model will behave. In object classification, capturing a large number of images per object and in different conditions is not always possible and can be very time-consuming and tedious. Accordingly, this work explores the creation of artificial images using a game engine to cope with limited data in the training dataset. We combine real and synthetic data to train the object classification engine, a strategy that has shown to be beneficial to increase confidence in the decisions made by the classifier, which is often critical in industrial setups. To combine real and synthetic data, we first train the classifier on a massive amount of synthetic data, and then we fine-tune it on real images. Another important result is that the amount of real images needed for fine-tuning is not very high, reaching top accuracy with just 12 or 24 images per class. This substantially reduces the requirements of capturing a great amount of real data.","PeriodicalId":410036,"journal":{"name":"International Conference on Pattern Recognition Applications and Methods","volume":"95 5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129120383","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-10-20DOI: 10.5220/0011644400003411
Sara L. Almonacid-Uribe, Oliverio J. Santana, D. Hernández-Sosa, David Freire-Obregón
An article published on Medical News Today in June 2022 presented a fundamental question in its title: Can an earlobe crease predict heart attacks? The author explained that end arteries supply the heart and ears. In other words, if they lose blood supply, no other arteries can take over, resulting in tissue damage. Consequently, some earlobes have a diagonal crease, line, or deep fold that resembles a wrinkle. In this paper, we take a step toward detecting this specific marker, commonly known as DELC or Frank's Sign. For this reason, we have made the first DELC dataset available to the public. In addition, we have investigated the performance of numerous cutting-edge backbones on annotated photos. Experimentally, we demonstrate that it is possible to solve this challenge by combining pre-trained encoders with a customized classifier to achieve 97.7% accuracy. Moreover, we have analyzed the backbone trade-off between performance and size, estimating MobileNet as the most promising encoder.
{"title":"Deep Learning for Diagonal Earlobe Crease Detection","authors":"Sara L. Almonacid-Uribe, Oliverio J. Santana, D. Hernández-Sosa, David Freire-Obregón","doi":"10.5220/0011644400003411","DOIUrl":"https://doi.org/10.5220/0011644400003411","url":null,"abstract":"An article published on Medical News Today in June 2022 presented a fundamental question in its title: Can an earlobe crease predict heart attacks? The author explained that end arteries supply the heart and ears. In other words, if they lose blood supply, no other arteries can take over, resulting in tissue damage. Consequently, some earlobes have a diagonal crease, line, or deep fold that resembles a wrinkle. In this paper, we take a step toward detecting this specific marker, commonly known as DELC or Frank's Sign. For this reason, we have made the first DELC dataset available to the public. In addition, we have investigated the performance of numerous cutting-edge backbones on annotated photos. Experimentally, we demonstrate that it is possible to solve this challenge by combining pre-trained encoders with a customized classifier to achieve 97.7% accuracy. Moreover, we have analyzed the backbone trade-off between performance and size, estimating MobileNet as the most promising encoder.","PeriodicalId":410036,"journal":{"name":"International Conference on Pattern Recognition Applications and Methods","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131633194","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-06-15DOI: 10.48550/arXiv.2206.07391
André Artelt, Alexander Schulz, Barbara Hammer
Dimensionality reduction is a popular preprocessing and a widely used tool in data mining. Transparency, which is usually achieved by means of explanations, is nowadays a widely accepted and crucial requirement of machine learning based systems like classifiers and recommender systems. However, transparency of dimensionality reduction and other data mining tools have not been considered in much depth yet, still it is crucial to understand their behavior -- in particular practitioners might want to understand why a specific sample got mapped to a specific location. In order to (locally) understand the behavior of a given dimensionality reduction method, we introduce the abstract concept of contrasting explanations for dimensionality reduction, and apply a realization of this concept to the specific application of explaining two dimensional data visualization.
{"title":"\"Why Here and Not There?\" - Diverse Contrasting Explanations of Dimensionality Reduction","authors":"André Artelt, Alexander Schulz, Barbara Hammer","doi":"10.48550/arXiv.2206.07391","DOIUrl":"https://doi.org/10.48550/arXiv.2206.07391","url":null,"abstract":"Dimensionality reduction is a popular preprocessing and a widely used tool in data mining. Transparency, which is usually achieved by means of explanations, is nowadays a widely accepted and crucial requirement of machine learning based systems like classifiers and recommender systems. However, transparency of dimensionality reduction and other data mining tools have not been considered in much depth yet, still it is crucial to understand their behavior -- in particular practitioners might want to understand why a specific sample got mapped to a specific location. In order to (locally) understand the behavior of a given dimensionality reduction method, we introduce the abstract concept of contrasting explanations for dimensionality reduction, and apply a realization of this concept to the specific application of explaining two dimensional data visualization.","PeriodicalId":410036,"journal":{"name":"International Conference on Pattern Recognition Applications and Methods","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130868765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-03-02DOI: 10.5220/0011743700003411
Thomas Reynolds, Maruf A. Dhali, Lambert Schomaker
Researchers continually perform corroborative tests to classify ancient historical documents based on the physical materials of their writing surfaces. However, these tests, often performed on-site, requires actual access to the manuscript objects. The procedures involve a considerable amount of time and cost, and can damage the manuscripts. Developing a technique to classify such documents using only digital images can be very useful and efficient. In order to tackle this problem, this study uses images of a famous historical collection, the Dead Sea Scrolls, to propose a novel method to classify the materials of the manuscripts. The proposed classifier uses the two-dimensional Fourier Transform to identify patterns within the manuscript surfaces. Combining a binary classification system employing the transform with a majority voting process is shown to be effective for this classification task. This pilot study shows a successful classification percentage of up to 97% for a confined amount of manuscripts produced from either parchment or papyrus material. Feature vectors based on Fourier-space grid representation outperformed a concentric Fourier-space format.
{"title":"Image-based material analysis of ancient historical documents","authors":"Thomas Reynolds, Maruf A. Dhali, Lambert Schomaker","doi":"10.5220/0011743700003411","DOIUrl":"https://doi.org/10.5220/0011743700003411","url":null,"abstract":"Researchers continually perform corroborative tests to classify ancient historical documents based on the physical materials of their writing surfaces. However, these tests, often performed on-site, requires actual access to the manuscript objects. The procedures involve a considerable amount of time and cost, and can damage the manuscripts. Developing a technique to classify such documents using only digital images can be very useful and efficient. In order to tackle this problem, this study uses images of a famous historical collection, the Dead Sea Scrolls, to propose a novel method to classify the materials of the manuscripts. The proposed classifier uses the two-dimensional Fourier Transform to identify patterns within the manuscript surfaces. Combining a binary classification system employing the transform with a majority voting process is shown to be effective for this classification task. This pilot study shows a successful classification percentage of up to 97% for a confined amount of manuscripts produced from either parchment or papyrus material. Feature vectors based on Fourier-space grid representation outperformed a concentric Fourier-space format.","PeriodicalId":410036,"journal":{"name":"International Conference on Pattern Recognition Applications and Methods","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126182607","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-02-04DOI: 10.5220/0010906900003122
Sadique Adnan Siddiqui, L. Gutzeit, F. Kirchner
In this work, we investigate the influence of labeling methods on the classification of human movements on data recorded using a marker-based motion capture system. The dataset is labeled using two different approaches, one based on video data of the movements, the other based on the movement trajectories recorded using the motion capture system. The dataset is labeled using two different approaches, one based on video data of the movements, the other based on the movement trajectories recorded using the motion capture system. The data was recorded from one participant performing a stacking scenario comprising simple arm movements at three different speeds (slow, normal, fast). Machine learning algorithms that include k-Nearest Neighbor, Random Forest, Extreme Gradient Boosting classifier, Convolutional Neural networks (CNN), Long Short-Term Memory networks (LSTM), and a combination of CNN-LSTM networks are compared on their performance in recognition of these arm movements. The models were trained on actions performed on slow and normal speed movements segments and generalized on actions consisting of fast-paced human movement. It was observed that all the models trained on normal-paced data labeled using trajectories have almost 20% improvement in accuracy on test data in comparison to the models trained on data labeled using videos of the performed experiments.
{"title":"The influence of labeling techniques in classifying human manipulation movement of different speed","authors":"Sadique Adnan Siddiqui, L. Gutzeit, F. Kirchner","doi":"10.5220/0010906900003122","DOIUrl":"https://doi.org/10.5220/0010906900003122","url":null,"abstract":"In this work, we investigate the influence of labeling methods on the classification of human movements on data recorded using a marker-based motion capture system. The dataset is labeled using two different approaches, one based on video data of the movements, the other based on the movement trajectories recorded using the motion capture system. The dataset is labeled using two different approaches, one based on video data of the movements, the other based on the movement trajectories recorded using the motion capture system. The data was recorded from one participant performing a stacking scenario comprising simple arm movements at three different speeds (slow, normal, fast). Machine learning algorithms that include k-Nearest Neighbor, Random Forest, Extreme Gradient Boosting classifier, Convolutional Neural networks (CNN), Long Short-Term Memory networks (LSTM), and a combination of CNN-LSTM networks are compared on their performance in recognition of these arm movements. The models were trained on actions performed on slow and normal speed movements segments and generalized on actions consisting of fast-paced human movement. It was observed that all the models trained on normal-paced data labeled using trajectories have almost 20% improvement in accuracy on test data in comparison to the models trained on data labeled using videos of the performed experiments.","PeriodicalId":410036,"journal":{"name":"International Conference on Pattern Recognition Applications and Methods","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125450838","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-01-23DOI: 10.5220/0010762700003122
Zhijian Li, J. Xin, Guofa Zhou
We developed an integrated recurrent neural network and nonlinear regression spatio-temporal model for vector-borne disease evolution. We take into account climate data and seasonality as external factors that correlate with disease transmitting insects (e.g. flies), also spill-over infections from neighboring regions surrounding a region of interest. The climate data is encoded to the model through a quadratic embedding scheme motivated by recommendation systems. The neighboring regions' influence is modeled by a long short-term memory neural network. The integrated model is trained by stochastic gradient descent and tested on leish-maniasis data in Sri Lanka from 2013-2018 where infection outbreaks occurred. Our model outperformed ARIMA models across a number of regions with high infections, and an associated ablation study renders support to our modeling hypothesis and ideas.
{"title":"An integrated recurrent neural network and regression model with spatial and climatic couplings for vector-borne disease dynamics","authors":"Zhijian Li, J. Xin, Guofa Zhou","doi":"10.5220/0010762700003122","DOIUrl":"https://doi.org/10.5220/0010762700003122","url":null,"abstract":"We developed an integrated recurrent neural network and nonlinear regression spatio-temporal model for vector-borne disease evolution. We take into account climate data and seasonality as external factors that correlate with disease transmitting insects (e.g. flies), also spill-over infections from neighboring regions surrounding a region of interest. The climate data is encoded to the model through a quadratic embedding scheme motivated by recommendation systems. The neighboring regions' influence is modeled by a long short-term memory neural network. The integrated model is trained by stochastic gradient descent and tested on leish-maniasis data in Sri Lanka from 2013-2018 where infection outbreaks occurred. Our model outperformed ARIMA models across a number of regions with high infections, and an associated ablation study renders support to our modeling hypothesis and ideas.","PeriodicalId":410036,"journal":{"name":"International Conference on Pattern Recognition Applications and Methods","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134531222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}