Hypergraph Link Prediction: Learning Drug Interaction Networks Embeddings
Pub Date: 2019-12-01 | DOI: 10.1109/ICMLA.2019.00299
M. Vaida, Kevin Purcell
Graph neural networks (GNNs) have revolutionized deep learning on non-Euclidean data domains and are extensively used in fields such as social media and recommendation systems. However, complex relational data structures such as hypergraphs pose challenges for GNNs in terms of their ability to model, embed, and learn the relational complexities of multigraphs. Most GNNs focus on capturing the flat local neighborhood of a node, thus failing to account for the structural properties of multi-relational graphs. This paper introduces Hypergraph Link Prediction (HLP), a novel approach to encoding the multilink structure of graphs. HLP allows pooling operations to incorporate a 360-degree overview of a node's interaction profile by learning the local neighborhood and the global hypergraph structure simultaneously. Global graph information is injected into node representations, such that the unique global structural patterns of every node are encoded at the node level. HLP leverages the augmented hypergraph adjacency matrix to incorporate the depth of the hypergraph into the convolutional layers. The model is applied to the task of predicting multi-drug interactions by modeling relations between pairs of drugs as a hypergraph. The existence and the type of drug interactions between the same pair of drugs are mapped as multiple edges and can be inferred by learning the multigraph's local and global structure concurrently. To account for the molecular graph properties of a drug, additional drug chemical graph structural fingerprints are included as node attributes.
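The abstract does not spell out the exact form of the augmented hypergraph adjacency, so the sketch below uses the common incidence-matrix formulation of a hypergraph convolution (H is a node-by-hyperedge incidence matrix) with a clique expansion plus self-loops; the symmetric normalization and ReLU are standard choices, not the authors' confirmed construction.

```python
import numpy as np

def hypergraph_conv(X, H, Theta):
    """One propagation step: X (n x f) node features, H (n x e) incidence
    matrix, Theta (f x f') learnable weights."""
    De = np.diag(H.sum(axis=0))                      # hyperedge degrees
    # Clique-expanded adjacency from hyperedges, augmented with self-loops
    A = H @ np.linalg.inv(De) @ H.T + np.eye(H.shape[0])
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
    return np.maximum(D_inv_sqrt @ A @ D_inv_sqrt @ X @ Theta, 0.0)  # ReLU

# Toy example: 4 drugs, 2 hyperedges (interaction types), 3 node features
rng = np.random.default_rng(0)
H = np.array([[1, 0], [1, 1], [0, 1], [1, 0]], dtype=float)
X = rng.normal(size=(4, 3))
Theta = rng.normal(size=(3, 3))
print(hypergraph_conv(X, H, Theta).shape)  # (4, 3)
```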
Stochastic Coordinate Descent for 01 Loss and Its Sensitivity to Adversarial Attacks
Pub Date: 2019-12-01 | DOI: 10.1109/ICMLA.2019.00056
Meiyan Xie, Yunzhe Xue, Usman Roshan
The 01 loss, while hard to optimize, is less sensitive to outliers than its continuous, differentiable counterparts, namely the hinge and logistic losses. Recently, the 01 loss has been shown to be more robust than surrogate losses against corrupted labels, which can be interpreted as adversarial attacks. Here we propose a stochastic coordinate descent heuristic for linear 01 loss classification. We implement and study our heuristic on real datasets from the UCI machine learning archive and find our method to be comparable to the support vector machine in accuracy and tractable in training time. We conjecture that the 01 loss may be harder to attack in a black-box setting due to its non-continuity and infinite solution space. We train our linear classifier in a one-vs-one multi-class strategy on the CIFAR10 and STL10 image benchmark datasets. In both cases, we find our classifier to have the same accuracy as the linear support vector machine but to be more resilient to black-box attacks. On CIFAR10, the linear support vector machine has 0% accuracy on adversarial examples while the 01 loss classifier hovers around 10%. On STL10, the linear support vector machine has 0% accuracy whereas the 01 loss classifier is at 10%. Our work here suggests that the 01 loss may be more resilient to adversarial attacks than the hinge loss, and further work is required.
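The abstract does not give the update rule, so the following is a minimal sketch of one plausible stochastic coordinate descent for the linear 0-1 loss: perturb one randomly chosen coordinate (or the bias), keep the move if the misclassification count does not increase, revert otherwise.

```python
import numpy as np

def zero_one_loss(w, b, X, y):
    """Count misclassifications of the linear rule sign(Xw + b) vs y in {-1,+1}."""
    return np.sum(np.sign(X @ w + b) != y)

def scd_01(X, y, iters=2000, seed=0):
    rng = np.random.default_rng(seed)
    w, b = rng.normal(size=X.shape[1]), 0.0
    best = zero_one_loss(w, b, X, y)
    for _ in range(iters):
        j = rng.integers(X.shape[1] + 1)       # coordinate index, or the bias
        step = rng.normal()
        if j < X.shape[1]:
            w[j] += step
        else:
            b += step
        loss = zero_one_loss(w, b, X, y)
        if loss <= best:
            best = loss                         # keep the improving move
        elif j < X.shape[1]:
            w[j] -= step                        # revert the coordinate
        else:
            b -= step                           # revert the bias
    return w, b, best

X = np.random.default_rng(1).normal(size=(200, 5))
y = np.sign(X[:, 0] - 0.5 * X[:, 1])            # linearly separable toy labels
w, b, errs = scd_01(X, y)
print(errs, "training errors out of", len(y))
```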
{"title":"Stochastic Coordinate Descent for 01 Loss and Its Sensitivity to Adversarial Attacks","authors":"Meiyan Xie, Yunzhe Xue, Usman Roshan","doi":"10.1109/ICMLA.2019.00056","DOIUrl":"https://doi.org/10.1109/ICMLA.2019.00056","url":null,"abstract":"The 01 loss while hard to optimize is least sensitive to outliers compared to its continuous differentiable counterparts, namely hinge and logistic loss. Recently the 01 loss has been shown to be most robust compared to surrogate losses against corrupted labels which can be interpreted as adversarial attacks. Here we propose a stochastic coordinate descent heuristic for linear 01 loss classification. We implement and study our heuristic on real datasets from the UCI machine learning archive and find our method to be comparable to the support vector machine in accuracy and tractable in training time. We conjecture that the 01 loss may be harder to attack in a black box setting due to its non-continuity and infinite solution space. We train our linear classifier in a one-vs-one multi-class strategy on CIFAR10 and STL10 image benchmark datasets. In both cases we find our classifier to have the same accuracy as the linear support vector machine but more resilient to black box attacks. On CIFAR10 the linear support vector machine has 0% on adversarial examples while the 01 loss classifier hovers about 10%. On STL10 the linear support vector machine has 0% accuracy whereas 01 loss is at 10%. Our work here suggests that 01 loss may be more resilient to adversarial attacks than the hinge loss and further work is required.","PeriodicalId":436714,"journal":{"name":"2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117020413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep Learning and Thresholding with Class-Imbalanced Big Data
Pub Date: 2019-12-01 | DOI: 10.1109/ICMLA.2019.00134
Justin M. Johnson, T. Khoshgoftaar
Class imbalance is a regularly occurring problem in machine learning that has been studied extensively over the last two decades. Various methods for addressing class imbalance have been introduced, including algorithm-level methods, data-level methods, and hybrid methods. While these methods are well studied for traditional machine learning algorithms, there are relatively few studies that explore their application to deep neural networks. Thresholding, in particular, is rarely discussed in the literature on deep learning with class imbalance. This paper addresses this gap by conducting a systematic study of thresholding with deep neural networks using a Big Data Medicare fraud data set. We use random oversampling (ROS), random under-sampling (RUS), and a hybrid ROS-RUS to create 15 training distributions with varying levels of class imbalance. With the fraudulent class size ranging from 0.03% to 60%, we identify optimal classification thresholds for each distribution on random validation sets and then score the thresholds on a 20% holdout test set. Through repetition and statistical analysis, confidence intervals show that the default threshold is never optimal when training data is imbalanced. Results also show that the optimal threshold outperforms the default threshold in nearly all cases, and linear models indicate a strong linear relationship between the minority class size and the optimal decision threshold. To the best of our knowledge, this is the first study to provide statistical results describing optimal classification thresholds for deep neural networks over a range of class distributions.
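A minimal sketch of the threshold-selection step described above: sweep candidate thresholds on a validation set and keep the best one. The selection metric is not stated in the abstract, so the geometric mean of TPR and TNR is used here as a stand-in; the synthetic scores merely illustrate that under heavy imbalance the optimum lands far from the 0.5 default.

```python
import numpy as np

def optimal_threshold(y_val, p_val, grid=np.linspace(0.01, 0.99, 99)):
    """Return the threshold maximizing sqrt(TPR * TNR) on validation data."""
    best_t, best_g = 0.5, -1.0
    for t in grid:
        pred = (p_val >= t).astype(int)
        tpr = np.mean(pred[y_val == 1] == 1)    # true positive rate
        tnr = np.mean(pred[y_val == 0] == 0)    # true negative rate
        g = np.sqrt(tpr * tnr)
        if g > best_g:
            best_t, best_g = t, g
    return best_t

# Fake scores with ~0.3% positives: the chosen threshold sits well below 0.5.
rng = np.random.default_rng(0)
y = (rng.random(100_000) < 0.003).astype(int)
p = np.clip(0.05 + 0.4 * y + rng.normal(0, 0.1, y.size), 0, 1)
print(optimal_threshold(y, p))
```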
{"title":"Deep Learning and Thresholding with Class-Imbalanced Big Data","authors":"Justin M. Johnson, T. Khoshgoftaar","doi":"10.1109/ICMLA.2019.00134","DOIUrl":"https://doi.org/10.1109/ICMLA.2019.00134","url":null,"abstract":"Class imbalance is a regularly occurring problem in machine learning that has been studied extensively over the last two decades. Various methods for addressing class imbalance have been introduced, including algorithm-level methods, datalevel methods, and hybrid methods. While these methods are well studied using traditional machine learning algorithms, there are relatively few studies that explore their application to deep neural networks. Thresholding, in particular, is rarely discussed in the deep learning with class imbalance literature. This paper addresses this gap by conducting a systematic study on the application of thresholding with deep neural networks using a Big Data Medicare fraud data set. We use random oversampling (ROS), random under-sampling (RUS), and a hybrid ROS-RUS to create 15 training distributions with varying levels of class imbalance. With the fraudulent class size ranging from 0.03%–60%, we identify optimal classification thresholds for each distribution on random validation sets and then score the thresholds on a 20% holdout test set. Through repetition and statistical analysis, confidence intervals show that the default threshold is never optimal when training data is imbalanced. Results also show that the optimal threshold outperforms the default threshold in nearly all cases, and linear models indicate a strong linear relationship between the minority class size and the optimal decision threshold. To the best of our knowledge, this is the first study to provide statistical results that describe optimal classification thresholds for deep neural networks over a range of class distributions.","PeriodicalId":436714,"journal":{"name":"2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114950196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep Ensemble Network for Quantification and Severity Assessment of Knee Osteoarthritis
Pub Date: 2019-12-01 | DOI: 10.1109/ICMLA.2019.00163
Mohammed Bany Muhammad, A. Moinuddin, M. Lee, Yanfei Zhang, V. Abedi, R. Zand, M. Yeasin
The assessment of the knee joint gap and the severity of osteoarthritis (OA) is subjective and often inaccurate. The main source of error is the judgement of human experts based on low-resolution images (i.e., X-ray images). To address this problem, we developed an ensemble of deep learning (DL) models to objectively score the severity of OA from radiographic images alone. The proposed method consists of two main modules. First, we developed a scale-invariant and aspect-ratio-preserving automatic localization and characterization of the kneecap area. Second, we developed multiple instances of hyperparameter-optimized DL models and fused them using ensemble classification to score the severity of OA. In this implementation, we used three convolutional neural networks to improve the bias-variance trade-off and boost accuracy and generalization. We tested our modeling framework on a collection of 4,796 X-ray images from the Osteoarthritis Initiative (OAI). Our results show higher performance (~2-8%) compared to state-of-the-art methods. Finally, this machine learning-based methodology provides a pipeline for a decision support system for assessing and quantifying OA severity.
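A minimal sketch of the fusion stage, assuming simple averaging of the three CNNs' class probabilities; the actual architectures and fusion rule are not given in the abstract, and the severity grades below are illustrative.

```python
import numpy as np

def ensemble_severity(prob_list):
    """prob_list: list of (n_images x n_grades) softmax outputs, one per CNN;
    returns the fused OA severity grade for each image."""
    fused = np.mean(np.stack(prob_list), axis=0)  # average class probabilities
    return fused.argmax(axis=1)

# Toy example: 3 models, 4 X-ray images, 5 severity grades (e.g., KL 0-4)
rng = np.random.default_rng(0)
probs = [rng.dirichlet(np.ones(5), size=4) for _ in range(3)]
print(ensemble_severity(probs))
```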
{"title":"Deep Ensemble Network for Quantification and Severity Assessment of Knee Osteoarthritis","authors":"Mohammed Bany Muhammad, A. Moinuddin, M. Lee, Yanfei Zhang, V. Abedi, R. Zand, M. Yeasin","doi":"10.1109/ICMLA.2019.00163","DOIUrl":"https://doi.org/10.1109/ICMLA.2019.00163","url":null,"abstract":"The assessment of knee joint gap and severity of Osteoarthritis (OA) is subjective and often inaccurate. The main source of error is due to the judgement of human expert from low resolution images (i.e., X-ray images). To address the problem, we developed an ensemble of Deep Learning (DL) model to objectively score the severity of OA only from the radiometric images. The proposed method consists of two main modules. First, we developed a scale invariant and aspect ratio preserving automatic localization and characterization of the kneecap area. Second, we developed multiple instances of \"hyper parameter optimized\" DL models and fused them using ensemble classification to score the severity of OA. In this implementation, we used three convolutional neural networks to improve the bias-variance trade-off, and boost accuracy and generalization. We tested our modeling framework using a collection of 4,796 X-ray images from Osteoarthritis Initiative (OAI). Our results show a higher performance (~ 2-8%) when compared to the state-of-the-art methods. Finally, this machine learning-based methodology provides a pipeline in decision support system for assessing and quantifying the OA severity.","PeriodicalId":436714,"journal":{"name":"2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115078464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Deep Learning Approach to Distributed Anomaly Detection for Edge Computing
Pub Date: 2019-12-01 | DOI: 10.1109/ICMLA.2019.00169
Okwudili M. Ezeme, Q. Mahmoud, Akramul Azim
One of the multiplier effects of the boom in mobile technologies, ranging from cell phones to computers and wearables like smartwatches, is that public and private common spaces are now dotted with Wi-Fi hotspots. These hotspots provide the convenience of accessing the internet on the go, for either play or work. Also, as our mobile devices automate more of our daily routines via a multitude of applications, our vulnerability to cyber fraud and attacks grows as well. Hence the need for heightened security capable of detecting anomalies on the fly. However, the edge devices connected to the local area network come with diverse capabilities and varying degrees of limitations in compute and energy resources. Therefore, running a process-based anomaly detector is not given a high priority on these devices because: a) the primary function of the applications running on the devices is not security, so each device allocates most of its resources to the primary duty of its applications; and b) the volume and velocity of the data are high. Therefore, in this paper we introduce a multi-node (nodes and devices are used interchangeably in this paper) ad-hoc network that uses a novel offloading scheme to bring online anomaly detection over kernel events to the nodes in the network. We test the framework in a Wi-Fi-based ad-hoc network made up of several devices, and the results confirm our hypothesis that the scheme can reduce latency and increase the throughput of the anomaly detector, thereby making online anomaly detection at the edge possible without sacrificing the accuracy of the deep recurrent neural network.
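The offloading scheme itself is not detailed in the abstract; the sketch below illustrates one plausible reading, where a node forwards each window of kernel events to the least-loaded peer for scoring. The Peer class, the queue-length heuristic, and the placeholder score() (standing in for the deep recurrent detector) are all assumptions.

```python
class Peer:
    """A peer node in the ad-hoc network that can score event windows."""
    def __init__(self, name):
        self.name, self.queue = name, 0

    def score(self, window):
        self.queue += 1                      # track outstanding work
        return sum(window) / len(window)     # placeholder anomaly score

def offload(peers, window):
    """Send a window of kernel events to the least-loaded peer."""
    peer = min(peers, key=lambda p: p.queue)
    return peer.name, peer.score(window)

peers = [Peer("node-a"), Peer("node-b"), Peer("node-c")]
for w in ([1, 2, 3], [9, 9, 9], [0, 1, 0]):
    print(offload(peers, w))
```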
{"title":"A Deep Learning Approach to Distributed Anomaly Detection for Edge Computing","authors":"Okwudili M. Ezeme, Q. Mahmoud, Akramul Azim","doi":"10.1109/ICMLA.2019.00169","DOIUrl":"https://doi.org/10.1109/ICMLA.2019.00169","url":null,"abstract":"One of the multiplier effects of the boom in mobile technologies ranging from cell phones to computers and wearables like smart watches is that every public and private common spaces are now dotted with Wi-Fi hotspots. These hotspots provide the convenience of accessing the internet on-the-go for either play or work. Also, with the increased automation of our daily routines by our mobile devices via a multitude of applications, our vulnerability to cyber fraud or attacks becomes higher too. Hence, the need for heightened security that is capable of detecting anomalies on-the-fly. However, these edge devices connected to the local area network come with diverse capabilities with varying degrees of limitations in compute and energy resources. Therefore, running a process-based anomaly detector is not given a high priority in these devices because; a) the primary functions of the applications running on the devices is not security; therefore, the device allocates much of its resources into satisfying the primary duty of the applications. b) the volume and velocity of the data are high. Therefore, in this paper, we introduce a multi-node (nodes and devices are used interchangeably in the paper) ad-hoc network that uses a novel offloading scheme to bring an online anomaly detection capability on the kernel events to the nodes in the network. We test the framework in a Wi-Fi-based ad-hoc network made up of several devices, and the results confirm our hypothesis that the scheme can reduce latency and increase the throughput of the anomaly detector, thereby making online anomaly detection in the edge possible without sacrificing the accuracy of the deep recurrent neural network.","PeriodicalId":436714,"journal":{"name":"2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115715692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IoT Environmental Analyzer using Sensors and Machine Learning for Migraine Occurrence Prevention
Pub Date: 2019-12-01 | DOI: 10.1109/ICMLA.2019.00239
Rosemarie J. Day, H. Salehi, Mahsa Javadi
Migraines affect more than one billion people worldwide. This headache disorder is classified as the sixth most disabling disease in the world. Migraines are just one chronic illness affected by environmental triggers due to changes that occur inside the home; they share this characteristic with sinus headaches and are thus often misdiagnosed. In this research work, an iOS-based environmental analyzer for migraine sufferers was designed, implemented, and evaluated with the use of sensors. After data collection and cleaning, five machine learning models were used to estimate the accuracy of predicting migraines from environmental conditions. The data was evaluated against the models using k-fold cross-validation. The algorithm accuracy comparison showed that Linear Discriminant Analysis (LDA) produced the highest accuracy on the testing data, with a mean of 0.938. Preliminary results demonstrate the feasibility of using machine learning algorithms to perform automated recognition of migraine trigger areas in the environment.
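A reproducible sketch of the evaluation protocol: k-fold cross-validation of an LDA classifier with scikit-learn. Only LDA (the winning model) is shown, and the sensor features are synthetic stand-ins, since the study's dataset is not available here.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))           # e.g., light, noise, temp, humidity
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(0, 0.5, 300) > 0).astype(int)

cv = KFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=cv)
print(f"mean accuracy: {scores.mean():.3f}")
```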
{"title":"IoT Environmental Analyzer using Sensors and Machine Learning for Migraine Occurrence Prevention","authors":"Rosemarie J. Day, H. Salehi, Mahsa Javadi","doi":"10.1109/ICMLA.2019.00239","DOIUrl":"https://doi.org/10.1109/ICMLA.2019.00239","url":null,"abstract":"Everyday migraines are affecting more than one billion people worldwide. This headache disorder is classified as the sixth most disabling disease in the world. Migraines are just one chronic illness affected by environmental triggers due to changes that occur inside the home. Migraines share this characteristic with sinus headaches and thus are often misdiagnosed. In this research work, an iOS-based environmental analyzer was designed, implemented and evaluated for migraine sufferers with the use of sensors. After the data collection and cleaning, five machine learning model were used to estimate prediction accuracy of migraines in terms of the environment. The data was evaluated against the models using K-Fold cross validation. The algorithm accuracy comparison showed that Linear Discriminant Analysis (LDA) produced highest accuracy for the testing data at a mean of 0.938. Preliminary results demonstrate the feasibility of using machine learning algorithms to perform the automated recognition of migraine trigger areas in the environment.","PeriodicalId":436714,"journal":{"name":"2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121335751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Widened Learning of Index Tracking Portfolios
Pub Date: 2019-12-01 | DOI: 10.1109/ICMLA.2019.00291
Iuliia Gavriushina, Oliver R. Sampson, M. Berthold, W. Pohlmeier, C. Borgelt
Index investing has an advantage over active investment strategies because less frequent trading results in lower expenses, yielding higher long-term returns. Index tracking is a popular investment strategy that attempts to find a portfolio replicating the performance of a collection of investment vehicles. This paper considers index tracking from the perspective of solution-space exploration. Three search-space heuristics, in combination with three portfolio tracking-error methods, are compared in order to select a tracking portfolio with returns that mimic a benchmark index. Experimental results on real-world datasets show that Widening, a metaheuristic using diverse parallel search paths, finds better solutions than the reference heuristics. Presented here are the first results using Widening on time-series data.
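A hedged sketch of the Widening idea on this task: several candidate asset subsets are refined in parallel, and a diversity check keeps the search paths from collapsing onto the same solution. The swap-one-asset refinement, equal-weight portfolios, and standard-deviation-of-difference tracking error are illustrative assumptions, not the paper's exact operators.

```python
import numpy as np

def tracking_error(R, idx_ret, subset):
    """Std. dev. of return differences between an equal-weight portfolio
    over `subset` and the benchmark index."""
    port = R[:, list(subset)].mean(axis=1)
    return np.std(port - idx_ret)

def widen(R, idx_ret, k=5, size=3, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    n = R.shape[1]
    paths = [set(rng.choice(n, size, replace=False)) for _ in range(k)]
    for _ in range(iters):
        new_paths = []
        for s in paths:
            t = set(s)
            t.remove(rng.choice(list(t)))       # refine: swap one asset
            t.add(int(rng.integers(n)))
            better = tracking_error(R, idx_ret, t) < tracking_error(R, idx_ret, s)
            cand = t if better else s
            # Diversity: reject a candidate already taken by another path
            new_paths.append(cand if all(cand != p for p in new_paths) else s)
        paths = new_paths
    return min(paths, key=lambda s: tracking_error(R, idx_ret, s))

rng = np.random.default_rng(1)
R = rng.normal(0, 0.01, size=(250, 20))         # daily returns of 20 assets
idx_ret = R.mean(axis=1)                        # the benchmark index
print(sorted(widen(R, idx_ret)))
```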
{"title":"Widened Learning of Index Tracking Portfolios","authors":"Iuliia Gavriushina, Oliver R. Sampson, M. Berthold, W. Pohlmeier, C. Borgelt","doi":"10.1109/ICMLA.2019.00291","DOIUrl":"https://doi.org/10.1109/ICMLA.2019.00291","url":null,"abstract":"Index investing has an advantage over active investment strategies, because less frequent trading results in lower expenses, yielding higher long-term returns. Index tracking is a popular investment strategy that attempts to find a portfolio replicating the performance of a collection of investment vehicles. This paper considers index tracking from the perspective of solution space exploration. Three search space heuristics in combination with three portfolio tracking error methods are compared in order to select a tracking portfolio with returns that mimic a benchmark index. Experimental results conducted on real-world datasets show that Widening, a metaheuristic using diverse parallel search paths, finds superior solutions than those found by the reference heuristics. Presented here are the first results using Widening on time-series data.","PeriodicalId":436714,"journal":{"name":"2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125235128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Forcing Interpretability for Deep Neural Networks through Rule-Based Regularization
Pub Date: 2019-12-01 | DOI: 10.1109/ICMLA.2019.00126
Nadia Burkart, Marco F. Huber, Phillip Faller
Remarkable progress in the field of machine learning strongly drives research in many application domains. For some domains, it is mandatory that the output of machine learning algorithms be interpretable. In this paper, we propose a rule-based regularization technique to enforce interpretability for neural networks (NNs). For this purpose, we train a rule-based surrogate model simultaneously with the NN. From the surrogate, a metric quantifying its degree of explainability is derived and fed back into the training of the NN as a regularization term. We evaluate our model on four datasets and compare it to unregularized models as well as a decision tree (DT) baseline. The rule-based regularization approach achieves interpretability and competitive accuracy.
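A speculative sketch of such a training loop, assuming PyTorch and a decision-tree surrogate refit each epoch on the network's own predictions; because the abstract does not define the explainability metric, a fidelity-style penalty pulling the network toward the surrogate's rule-based outputs stands in for it here.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8)).astype(np.float32)
y = (X[:, 0] * X[:, 1] > 0).astype(np.int64)    # synthetic XOR-like task
Xt, yt = torch.from_numpy(X), torch.from_numpy(y)

net = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
lam = 0.3                                        # regularization strength

for epoch in range(30):
    with torch.no_grad():
        hard = net(Xt).argmax(1).numpy()         # NN's current predictions
    tree = DecisionTreeClassifier(max_depth=3).fit(X, hard)  # surrogate rules
    soft = torch.from_numpy(np.eye(2, dtype=np.float32)[tree.predict(X)])
    logits = net(Xt)
    task = nn.functional.cross_entropy(logits, yt)
    fid = nn.functional.mse_loss(logits.softmax(1), soft)    # pull toward rules
    loss = task + lam * fid
    opt.zero_grad(); loss.backward(); opt.step()

print(f"task loss {task.item():.3f}, fidelity penalty {fid.item():.3f}")
```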
{"title":"Forcing Interpretability for Deep Neural Networks through Rule-Based Regularization","authors":"Nadia Burkart, Marco F. Huber, Phillip Faller","doi":"10.1109/ICMLA.2019.00126","DOIUrl":"https://doi.org/10.1109/ICMLA.2019.00126","url":null,"abstract":"Remarkable progress in the field of machine learning strongly drives the research in many application domains. For some domains, it is mandatory that the output of machine learning algorithms needs to be interpretable. In this paper, we propose a rule-based regularization technique to enforce interpretability for neural networks (NN). For this purpose, we train a rule-based surrogate model simultaneously with the NN. From the surrogate, a metric quantifying its degree of explainability is derived and fed back to the training of the NN as a regularization term. We evaluate our model on four datasets and compare it to unregularized models as well as a decision tree (DT) based baseline. The rule-based regularization approach achieves interpretability and competitive accuracy.","PeriodicalId":436714,"journal":{"name":"2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123988394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep Neuronal Based Classifiers for Wireless Multi-hop Network Mobility Models
Pub Date: 2019-12-01 | DOI: 10.1109/ICMLA.2019.00111
Daniel Gutiérrez, S. Toral
Mobility plays an important role in the performance of wireless multi-hop networks. Since communications are established in a multi-hop fashion, the mobility of nodes can cause a significant degradation in performance. Therefore, the analysis of node mobility is relevant to improving the performance of applications implemented over wireless multi-hop networks. This work evaluates two neural network models, a fully connected (multi-layer perceptron) model and a 1D convolutional model, for the classification of up to four widely used mobility models for wireless multi-hop networks. Several architectures are evaluated and parametrized for both models. The results indicate considerably better performance for architectures with 1D convolutional layers: the best 1D convolutional model reaches an accuracy of 0.91, outperforming the best multi-layer perceptron model by 13.9%.
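A minimal sketch of the better-performing family, a 1D convolutional classifier over node-mobility time series, in PyTorch. Layer sizes, trace length, and the (x, y) input encoding are illustrative choices; the paper's reported architecture is not given in the abstract.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv1d(2, 16, kernel_size=5), nn.ReLU(), nn.MaxPool1d(2),
    nn.Conv1d(16, 32, kernel_size=5), nn.ReLU(), nn.MaxPool1d(2),
    nn.Flatten(),
    nn.LazyLinear(64), nn.ReLU(),
    nn.Linear(64, 4),                 # four mobility-model classes
)

# A batch of 8 traces, each 128 time steps of (x, y) node coordinates
x = torch.randn(8, 2, 128)
print(model(x).shape)                 # torch.Size([8, 4])
```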
{"title":"Deep Neuronal Based Classifiers for Wireless Multi-hop Network Mobility Models","authors":"Daniel Gutiérrez, S. Toral","doi":"10.1109/ICMLA.2019.00111","DOIUrl":"https://doi.org/10.1109/ICMLA.2019.00111","url":null,"abstract":"Mobility plays an important role in the performance of wireless multi-hop networks. Since communications are established in a multi-hop fashion, the mobility of nodes can cause a significant degradation of the performance. Therefore, the analysis of nodes' mobility is relevant to improve the performance of the applications implemented over wireless multi-hop networks. This work evaluates two neuronal network models, such as fully connected or multi-layer perceptron and 1D convolutional models, for the classification of up to four widely used mobility models for wireless multi-hop networks. Several architectures are evaluated and parametrized for both models. The results indicate a considerable better performance of an architecture with 1D convolutional layers. The test results show that the best convolutional 1D model is able to reach an accuracy level of 0.91, outperforming the best multi-layer perceptron model in 13,9 %.","PeriodicalId":436714,"journal":{"name":"2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122261454","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Low-Bit Quantization and Quantization-Aware Training for Small-Footprint Keyword Spotting
Pub Date: 2019-12-01 | DOI: 10.1109/ICMLA.2019.00127
Yuriy Mishchenko, Yusuf Goren, Ming Sun, Chris Beauchene, Spyros Matsoukas, Oleg Rybakov, S. Vitaladevuni
In this paper, we investigate novel quantization approaches to reduce the memory and computational footprint of deep neural network (DNN) based keyword spotters (KWS). We propose a new method for offline and online KWS quantization, which we call dynamic quantization: DNN weight matrices are quantized column-wise, using each column's exact individual min-max range, and the DNN layers' inputs and outputs are quantized for every input audio frame individually, using the exact min-max range of each input and output vector. We further apply a new quantization-aware training approach that allows us to incorporate quantization errors into the KWS model during training. Together, these approaches allow us to significantly improve the performance of KWS at 4-bit and 8-bit quantized precision, achieving end-to-end accuracy close to that of full-precision models while reducing the models' on-device memory footprint by up to 80%.
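A minimal sketch of the column-wise weight quantization described above: each column of a weight matrix gets its own exact min-max range, and per-frame activation quantization would apply the same recipe to each input vector. The affine (scale and offset) scheme is a standard choice assumed here, not necessarily the authors' exact encoding.

```python
import numpy as np

def quantize_columns(W, bits=8):
    """Affine quantization with one exact min-max range per column of W."""
    qmax = 2**bits - 1
    lo, hi = W.min(axis=0), W.max(axis=0)          # per-column ranges
    scale = np.where(hi > lo, (hi - lo) / qmax, 1.0)
    q = np.round((W - lo) / scale).astype(np.uint8)
    return q, scale, lo                            # integers + per-column params

def dequantize(q, scale, lo):
    return q.astype(np.float32) * scale + lo

W = np.random.default_rng(0).normal(size=(64, 32)).astype(np.float32)
q, s, z = quantize_columns(W, bits=8)
err = np.abs(dequantize(q, s, z) - W).max()
print(f"max abs reconstruction error: {err:.4f}")
```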
{"title":"Low-Bit Quantization and Quantization-Aware Training for Small-Footprint Keyword Spotting","authors":"Yuriy Mishchenko, Yusuf Goren, Ming Sun, Chris Beauchene, Spyros Matsoukas, Oleg Rybakov, S. Vitaladevuni","doi":"10.1109/ICMLA.2019.00127","DOIUrl":"https://doi.org/10.1109/ICMLA.2019.00127","url":null,"abstract":"In this paper, we investigate novel quantization approaches to reduce memory and computational footprint of deep neural network (DNN) based keyword spotters (KWS). We propose a new method for KWS offline and online quantization, which we call dynamic quantization, where we quantize DNN weight matrices column-wise, using each column's exact individual min-max range, and the DNN layers' inputs and outputs are quantized for every input audio frame individually, using the exact min-max range of each input and output vector. We further apply a new quantization-aware training approach that allows us to incorporate quantization errors into KWS model during training. Together, these approaches allow us to significantly improve the performance of KWS in 4-bit and 8-bit quantized precision, achieving the end-to-end accuracy close to that of full precision models while reducing the models' on-device memory footprint by up to 80%.","PeriodicalId":436714,"journal":{"name":"2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128708618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}