Pub Date : 2019-12-01 DOI: 10.1109/ICMLA.2019.00191
Baik Dowoo, Yujin Jung, Changhee Choi
Since the advent of generative adversarial networks (GANs), many model variants have been studied and applied to fields such as image and audio processing. However, research on data augmentation remains insufficient for cyber data, which suffers from the same data shortage. To address this problem, we propose PcapGAN, which augments pcap data, a kind of network data. The proposed model consists of an encoder, a data generator, and a decoder. The encoder subdivides network data into four parts, the generator produces new data for each part, and the decoder combines the generated data into realistic network data. We demonstrate the similarity between the generated and original data, and validate the generated data through the improved performance of intrusion detection algorithms.
Title: PcapGAN: Packet Capture File Generator by Style-Based Generative Adversarial Networks
Published in: 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)
Pub Date : 2019-12-01 DOI: 10.1109/ICMLA.2019.00048
R. Silveira, M. Holanda, M. Victorino, M. Ladeira
This paper analyzes data on the dropout of undergraduate engineering students at the University of Brasilia (UnB), Brazil. In Brazil, as in other countries, a significant number of students enroll in engineering majors but do not graduate in them. Information about the reasons for this phenomenon is important for university decision makers who want to act on it. This paper aims to answer the research question: What are the main factors that lead engineering students to drop out of engineering majors at UnB? We collected social and performance data on engineering students from 2009 to 2019. Some of the data are rare in similar studies, such as students' distance from home to campus and non-performance factors such as students' leave of absence requests. We used three data mining techniques: a Generalized Linear Model (GLM), a boosting algorithm (GBM), and Random Forest (RF). The results showed that international students deserve attention from the university and that courses such as Physics 1 can be challenging for engineering students.
Title: Educational Data Mining: Analysis of Drop out of Engineering Majors at the UnB - Brazil
Pub Date : 2019-12-01 DOI: 10.1109/ICMLA.2019.00233
Sindhura Bonthu, P. Armijo, Tiffany Tanner, Qiuming A. Zhu
Predicting the severity of a patient's condition helps provide accurate clinical care. Mortality prediction is challenging because of the distinct characteristics of patient data, which is highly sparse, highly biased and imbalanced, and highly mixed. In this paper, we focus on processing large volumes of data using neural networks, which can then be used for further analysis to obtain useful insights, such as identifying the major features contributing to certain outcomes or classifying objects based on the presence of certain attributes and their measurements.
Title: Using Machine Learning to Improve Surgical Outcomes
Pub Date : 2019-12-01 DOI: 10.1109/ICMLA.2019.00095
Revanth Akella, Teng-Sheng Moh
The paper presents research outcomes on classifying music into moods and provides an end-to-end, open-source pipeline for mood classification using lyrics. It explores techniques that classify music using audio features and lyrics, drawing on various natural language processing and machine learning methods. The paper performs a comparative study across different classification models and mood frameworks. The linguistic aspects of lyrics are explored and used as features for classification methods to determine which model classifies mood best. The results show that lyrics are a valuable information source for music classification. Term frequency-inverse document frequency (TF-IDF) and word embeddings are explored to connect words to mood classes. Various machine learning and deep learning classifiers are tested across different arrangements of the mood labels. The paper demonstrates that models that learn from lyrics using current deep-learning-based natural language processing methods achieve higher accuracy. Our final model achieves an accuracy of 71%.
Title: Mood Classification with Lyrics and ConvNets
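As a rough illustration of the TF-IDF weighting mentioned in the mood-classification abstract above, the following sketch computes per-document term weights in plain Python. The function name `tf_idf`, the toy lyrics, and the unsmoothed `log(N / df)` form are assumptions made for illustration; the paper's actual feature pipeline is not described in the abstract.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            # Relative term frequency times unsmoothed inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical toy lyrics, not taken from the paper's dataset.
lyrics = [
    "tears fall in the rain".split(),
    "dance in the sun".split(),
    "tears in the dark".split(),
]
weights = tf_idf(lyrics)
```

Words that occur in every document (here "in" and "the") receive zero weight, while words confined to a single document, such as "dance", receive the largest weights; this is what lets such features separate mood classes.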
Pub Date : 2019-12-01 DOI: 10.1109/ICMLA.2019.00150
Hamed Mohebbi-Kalkhoran, Chenyang Zhu, Matthew Schinault, P. Ratilal
Humpback whale behavior, population distribution, and structure can be inferred from long-term underwater passive acoustic monitoring of their vocalizations. Here we develop automatic approaches for classifying humpback whale vocalizations into the two categories of song and non-song, employing machine learning techniques. The vocalization behavior of humpback whales was monitored instantaneously over vast areas of the Gulf of Maine using a large-aperture coherent hydrophone array system via the passive ocean acoustic waveguide remote sensing technique over multiple diel cycles in Fall 2006. We use wavelet signal denoising and coherent array processing to enhance the signal-to-noise ratio. To build a feature vector for every time sequence of the beamformed signals, we apply a Bag of Words approach to time-frequency features. Finally, we apply Support Vector Machine (SVM), Neural Networks, and Naive Bayes to classify the acoustic data and compare their performance. The best results are obtained using Mel Frequency Cepstrum Coefficient (MFCC) features and SVM, yielding 94% accuracy and a 72.73% F1-score for humpback whale song versus non-song vocalization classification, showing the effectiveness of the proposed approach for real-time classification at sea.
Title: Classifying Humpback Whale Calls to Song and Non-Song Vocalizations using Bag of Words Descriptor on Acoustic Data
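The whale-call abstract above reports 94% accuracy alongside a 72.73% F1-score. A gap of this kind typically appears when the positive class is much rarer than the negative class. The sketch below shows how the two metrics are computed from confusion-matrix counts. The counts are hypothetical, chosen only because they are consistent with both reported figures; the abstract does not give the actual confusion matrix.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in count form."""
    return 2 * tp / (2 * tp + fp + fn)

# Hypothetical counts: 11 positive and 89 negative samples out of 100.
tp, fp, fn, tn = 8, 3, 3, 86
acc = accuracy(tp, fp, fn, tn)
f1 = f1_score(tp, fp, fn)
```

Because true negatives dominate the total, they lift accuracy but do not enter the F1-score at all, so a handful of errors on the rare class lowers F1 far more than accuracy.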
Pub Date : 2019-12-01 DOI: 10.1109/ICMLA.2019.00167
Taesik Gong, Alberto Gil C. P. Ramos, S. Bhattacharya, Akhil Mathur, F. Kawsar
Deep learning has enabled personal and IoT devices to rethink microphones as multi-purpose sensors for understanding conversation and the surrounding environment. This has resulted in a proliferation of Voice Controllable Systems (VCS) around us. The increasing popularity of such systems also tends to attract miscreants, who often want to take advantage of a VCS without the user's knowledge. Consequently, understanding the robustness of VCS, especially under adversarial attacks, has become an important research topic. Although there is some previous work on audio adversarial attacks, its scope is limited to embedding the attacks in pre-recorded music clips, which cause a VCS to misbehave when played through speakers. Because the attack audio needs to be played, a human listener can suspect that this type of attack is occurring. In this paper, we focus on audio-based Denial-of-Service (DoS) attacks, which are unexplored in the literature. Contrary to previous work, we show that real-time, over-the-air adversarial audio attacks are possible while a user interacts with a VCS. We show that the attacks are effective regardless of the user's command and interaction timing. We present a first-of-its-kind imperceptible and always-on universal audio perturbation technique that enables such DoS attacks to succeed. We thoroughly evaluate the performance of the attack scheme across (i) two learning tasks, (ii) two model architectures, and (iii) three datasets. We demonstrate that the attack can introduce an error rate as high as 78% in audio recognition tasks.
Title: AudiDoS: Real-Time Denial-of-Service Adversarial Attacks on Deep Audio Models
Pub Date : 2019-12-01 DOI: 10.1109/ICMLA.2019.00249
X. Liao, M. Tyagi
Recent improvements in technology and computational power have increased interest in the application of data-driven modeling (DDM) in the petroleum industry. Recovery process evaluation using numerical reservoir simulators is time-consuming and computationally intensive, involves many assumptions and uncertainties, and is inefficient for fast decision making. Thus, DDM has been adopted as an alternative tool to predict production performance under waterflooding, which is one of the most important techniques for improving oil recovery. A synthetic waterflooding dataset including production profiles, operational parameters, reservoir properties, and well locations is constructed using a numerical reservoir simulator. Exploratory data analysis provides several insights into the non-intuitive factors in building the reservoir model. K-means clustering analysis is performed to identify internal groupings among producers. An artificial neural network (ANN) and support vector regression (SVR) are used to decipher the nonlinear relationships between input attributes and waterflooding production. The trained models are subsequently used to predict cumulative oil and water cut on unseen samples. Clustering analysis reveals that distance to the free water level has a dominant effect and that the clustering assignment is controlled by the interplay among input attributes characterizing reservoir properties and relative well locations. Good agreement between the models' predicted outputs and the simulation targets demonstrates the satisfactory generalization performance and predictive capabilities of the ANN and SVR methods. The ANN model with one output provides the most accurate predictions on the test data. SVR models provide similar but slightly worse forecasts than ANN models. The methodologies proposed in this work can be used as surrogate or complementary models to analyze and predict the recovery process in other reservoirs quickly and efficiently.
Title: Predictive Analytics and Statistical Learning for Waterflooding Operations in Reservoir Simulations
Pub Date : 2019-12-01 DOI: 10.1109/ICMLA.2019.00271
Wenjing Li, R. Paffenroth
Ensemble methods for classification problems construct a set of models, often called "learners", and then assign class labels to new data points by taking a combination of the predictions from these models. Ensemble methods are popular and used in a wide range of problem domains because of their good performance. However, a theoretical understanding of the optimality of ensembles is, in many instances, an open problem. In particular, improving the performance of an ensemble requires an understanding of the subtle interplay between the accuracy of the individual learners and the diversity of the learners in the ensemble. For example, if all of the learners in an ensemble were identical, then clearly the accuracy of the ensemble could not be any better than the accuracy of an individual learner, no matter how many learners one were to use. Accordingly, here we develop a theory for understanding when ensembles are optimal, in an appropriate sense, by balancing individual accuracy against ensemble diversity, from the perspective of statistical correlations. The theory that we derive is applicable to many practical ensembles, and we provide a set of metrics for assessing the optimality of any given ensemble. Perhaps most interestingly, the metrics that we develop lead naturally to a set of novel loss functions that can be optimized using backpropagation, giving rise to optimal deep neural network based ensembles. We demonstrate the effectiveness of these deep neural network based ensembles using standard benchmark data sets.
Title: Optimal Ensembles for Deep Learning Classification: Theory and Practice
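The point made in the ensemble abstract above, that identical learners cannot beat a single learner while diverse ones can, can be illustrated with a classical majority-vote calculation. The sketch below is not the paper's correlation-based theory. It assumes binary labels, an odd number of learners, and fully independent errors, which is the opposite extreme from identical learners; the function name and the accuracy value 0.7 are chosen for illustration.

```python
from math import comb

def majority_vote_accuracy(n_learners, p):
    """Accuracy of a majority vote over n independent binary learners,
    each correct with probability p (n_learners is assumed to be odd)."""
    need = n_learners // 2 + 1  # votes required for a correct majority
    return sum(
        comb(n_learners, k) * p**k * (1 - p) ** (n_learners - k)
        for k in range(need, n_learners + 1)
    )

p = 0.7
# Identical learners always vote together, so the ensemble accuracy stays at p.
identical = p
# Independent learners: accuracy grows with ensemble size when p > 0.5.
independent = {n: majority_vote_accuracy(n, p) for n in (1, 5, 21)}
```

Real ensembles fall between these two extremes, which is why a measure of correlation between learners is needed to reason about how much an ensemble can gain.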
Pub Date : 2019-12-01 DOI: 10.1109/ICMLA.2019.00321
Benjamin Lutz, Dominik Kißkalt, Daniel Regulin, Raven T. Reisch, A. Schiffler, J. Franke
Tool wear is one of the main drivers of manufacturing costs in subtractive manufacturing processes. To control manufacturing processes while taking tool wear into account, a variety of tool condition monitoring systems have been investigated. In this paper, we present a new approach to support the manual analysis of tool wear images by means of semantic image segmentation. We utilize deep learning for image evaluation through semantic classification of different defect regions. In this study, a small dataset of 100 cutting tool inserts at different tool conditions, exhibiting various wear defects, is acquired and masked by a process expert. A sliding window approach is used to extract small feature maps from the raw images, with the class of the center pixel as the label. The relationship between the features and the label is learned using a convolutional neural network. Our investigation shows that this network can predict the wear defect class of each pixel with an accuracy of over 91%. Compared to other approaches, the proposed solution can differentiate between various defect types, for instance, flank wear, groove formation, and built-up edge. From the resulting segmented image, different wear metrics are computed, such as the maximum flank wear width or the occurrence and size of other wear defects. This information is fed back to the machine operator to support the decision of whether to continue machining, adapt the cutting conditions, or exchange the insert.
Title: Evaluation of Deep Learning for Semantic Image Segmentation in Tool Condition Monitoring
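The sliding window step described in the tool-wear abstract above can be sketched as follows: every window of the raw image becomes one training sample, labeled with the expert mask's class at the window center. The function name, the 3x3 window, the toy image and mask, and the choice to skip border pixels rather than pad are all assumptions for illustration; the abstract does not specify these details.

```python
def extract_patches(image, mask, window=3):
    """Return (patch, label) pairs; the label is the mask class at the patch center."""
    half = window // 2
    rows, cols = len(image), len(image[0])
    samples = []
    # Skip border pixels so that every window lies fully inside the image.
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            patch = [
                row[c - half:c + half + 1]
                for row in image[r - half:r + half + 1]
            ]
            samples.append((patch, mask[r][c]))
    return samples

# Hypothetical 4x4 grayscale image and expert mask (0 = background, 1 and 2 = defect classes).
image = [
    [0, 0, 0, 0],
    [0, 5, 6, 0],
    [0, 7, 8, 0],
    [0, 0, 0, 0],
]
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 2, 2, 0],
    [0, 0, 0, 0],
]
samples = extract_patches(image, mask, window=3)
```

Each resulting pair could then serve as one training example for a patch classifier, and classifying the patch around every pixel yields a per-pixel segmentation.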
Pub Date: 2019-12-01 DOI: 10.1109/ICMLA.2019.00170
Iman Vasheghani Farahani, Alex Chien, R. King, M. Kay, Brad Klenz
This paper introduces a new method for the pattern-wise anomaly detection problem, which aims to find segments whose behavior differs from that of the other segments in a time series (as opposed to finding a single anomalous data point, as in classic anomaly detection problems). An important motivation for studying this problem is to find anomalies whose data points all lie within the normal range but together form an unusual pattern. To this end, the normal characteristics of the data are learned by clustering the overlapping subsequences of the training dataset and analyzing their order with Markov chains. The trained model is then used to assess how well the testing dataset fits the baseline behavior. The resulting anomaly detection framework can discover unusual patterns in both streaming data (online) and stored data (offline). The methodology is evaluated on three datasets from different fields: a medical dataset (electrocardiogram), a utility usage dataset, and a New York City taxi demand dataset. The anomaly detected in the medical data agrees with results reported in the literature. A domain expert confirmed the accuracy of the results for the utility usage data, and the anomalies in the New York City taxi demand data corresponded to major US holidays.
Time Series Anomaly Detection from a Markov Chain Perspective. 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA).
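The idea outlined in this abstract (cluster overlapping subsequences, then model the order of the clusters with a Markov chain) can be illustrated with a small sketch. It is not the authors' implementation: the tiny k-means with farthest-first initialization, the Laplace smoothing constant `alpha`, the mean negative log-likelihood score, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def windows(series, width):
    """Stack all overlapping subsequences of the given width."""
    series = np.asarray(series, dtype=float)
    return np.array([series[i:i + width] for i in range(len(series) - width + 1)])

def nearest(points, centroids):
    """Index of the closest centroid (squared Euclidean distance) per point."""
    d = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def fit_centroids(points, k, iters=10):
    """Small k-means with deterministic farthest-first initialization."""
    centroids = [points[0]]
    for _ in range(k - 1):
        chosen = np.array(centroids)
        d = ((points[:, None, :] - chosen[None, :, :]) ** 2).sum(axis=2).min(axis=1)
        centroids.append(points[d.argmax()])
    centroids = np.array(centroids)
    for _ in range(iters):
        labels = nearest(points, centroids)
        for j in range(k):
            if np.any(labels == j):  # leave a centroid in place if its cluster is empty
                centroids[j] = points[labels == j].mean(axis=0)
    return centroids

def fit_transitions(states, k, alpha=1.0):
    """Row-normalized transition matrix with Laplace smoothing."""
    counts = np.full((k, k), alpha)
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def anomaly_score(series, centroids, trans, width):
    """Mean negative log-likelihood of the observed cluster transitions."""
    states = nearest(windows(series, width), centroids)
    return float(-np.log(trans[states[:-1], states[1:]]).mean())

width, k = 2, 4
train = [0, 0, 1, 1] * 10                      # baseline pattern
centroids = fit_centroids(windows(train, width), k)
trans = fit_transitions(nearest(windows(train, width), centroids), k)

normal_score = anomaly_score([0, 0, 1, 1] * 3, centroids, trans, width)
odd_score = anomaly_score([0, 1] * 6, centroids, trans, width)  # same value range, new pattern
```

In this toy setup every value in the second test segment lies within the training range, yet its cluster transitions were never seen during training, so it receives the higher score. A streaming variant would score each new window as it arrives rather than a whole stored segment.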