Harnessing GAN with Metric Learning for One-Shot Generation on a Fine-Grained Category
Pub Date: 2019-11-01 | DOI: 10.1109/ICTAI.2019.00130
Yusuke Ohtsubo, Tetsu Matsukawa, Einoshin Suzuki
We propose a GAN-based one-shot generation method for a fine-grained category, i.e., a subclass of a category that typically contains diverse examples. One-shot generation is the task of taking an image belonging to a class not used in the training phase and generating a set of new images belonging to the same class. The Generative Adversarial Network (GAN), a type of deep neural network with a competing generator and discriminator, has proven useful for generating realistic images. In particular, DAGAN, which maps the input image to a low-dimensional space via an encoder and then back to the example space via a decoder, has been quite effective on datasets such as handwritten characters. However, when the class corresponds to a fine-grained category, DAGAN occasionally generates images that are regarded as belonging to other classes, due to the rich variety of examples within the class and the low dissimilarity of examples among classes. For example, it accidentally generates facial images of different persons when the class corresponds to a specific person. To circumvent this problem, we introduce metric learning with a triplet loss at the bottleneck layer of DAGAN to penalize such generations. We also extend the optimization algorithm of DAGAN to an alternating procedure over the two types of loss functions. Our proposed method outperforms DAGAN in the GAN-test task on the VGG-Face and CompCars datasets by 5.6% and 4.8% in accuracy, respectively. We also conducted experiments on the data augmentation task and observed 4.5% higher accuracy for our proposed method over DAGAN on the VGG-Face dataset.
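As a rough illustration of the idea of penalizing off-class generations, the sketch below applies a standard triplet margin loss to bottleneck embeddings; the encoder, margin value, and input size are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

# Minimal sketch (assumed setup): `encoder` maps images to a DAGAN-style
# bottleneck embedding; anchor/positive share a class, negative comes from
# a different class. Input size 3x64x64 and margin 1.0 are arbitrary choices.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64 * 3, 128))
triplet_loss = nn.TripletMarginLoss(margin=1.0)

def metric_loss(anchor_img, positive_img, negative_img):
    """Triplet loss on bottleneck embeddings, pulling same-class pairs together
    and pushing different-class pairs apart; added alongside the GAN losses."""
    z_a = encoder(anchor_img)
    z_p = encoder(positive_img)
    z_n = encoder(negative_img)
    return triplet_loss(z_a, z_p, z_n)

# Alternating optimization (as described in the abstract): one step on the
# adversarial losses, one step on the metric-learning loss, per mini-batch.
```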
{"title":"Harnessing GAN with Metric Learning for One-Shot Generation on a Fine-Grained Category","authors":"Yusuke Ohtsubo, Tetsu Matsukawa, Einoshin Suzuki","doi":"10.1109/ICTAI.2019.00130","DOIUrl":"https://doi.org/10.1109/ICTAI.2019.00130","url":null,"abstract":"We propose a GAN-based one-shot generation method on a fine-grained category, which represents a subclass of a category, typically with diverse examples. One-shot generation refers to a task of taking an image which belongs to a class not used in the training phase and then generating a set of new images belonging to the same class. Generative Adversarial Network (GAN), which represents a type of deep neural networks with competing generator and discriminator, has proven to be useful in generating realistic images. Especially DAGAN, which maps the input image to a low-dimensional space via an encoder and then back to the example space via a decoder, has been quite effective with datasets such as handwritten character datasets. However, when the class corresponds to a fine-grained category, DAGAN occasionally generates images which are regarded as belonging to other classes due to the rich variety of the examples in the class and the low dissimilarities of the examples among the classes. For example, it accidentally generates facial images of different persons when the class corresponds to a specific person. To circumvent this problem, we introduce a metric learning with a triplet loss to the bottleneck layer of DAGAN to penalize such a generation. We also extend the optimization algorithm of DAGAN to an alternating procedure for two types of loss functions. Our proposed method outperforms DAGAN in the GAN-test task for VGG-Face dataset and CompCars dataset by 5.6% and 4.8% in accuracy, respectively. We also conducted experiments for the data augmentation task and observed 4.5% higher accuracy for our proposed method over DAGAN for VGG-Face dataset.","PeriodicalId":346657,"journal":{"name":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115619919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optimal Multiple Stopping Rule for Warm-Starting Sequential Selection
Pub Date: 2019-11-01 | DOI: 10.1109/ICTAI.2019.00202
Mathilde Fekom, N. Vayatis, Argyris Kalogeratos
In this paper we present the Warm-starting Dynamic Thresholding algorithm, developed using dynamic programming, for a variant of the standard online selection problem. In this variant, job positions may be either free or already occupied at the beginning of the process. Throughout the selection process, the decision maker interviews the new candidates one after the other, and each interview reveals a quality score. Based on that information, she can (re)assign each job at most once, taking immediate and irrevocable decisions. We relax the usual hard requirement of dynamic programming algorithms to know perfectly the distribution from which candidate scores are drawn, by presenting extensions for the partial- and no-information cases, in which the decision maker learns the underlying score distribution sequentially while interviewing candidates.
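For intuition, here is a minimal single-position special case of threshold-based sequential selection with a fully known score distribution: backward dynamic programming yields one acceptance threshold per remaining step. The warm-starting, multi-position algorithm in the paper generalizes this; the uniform distribution and Monte Carlo estimate below are illustrative assumptions.

```python
import numpy as np

def acceptance_thresholds(n_candidates, sample_scores, n_mc=100_000):
    """Backward DP for the classic single-position stopping problem.

    V[t] is the expected value of proceeding optimally with t candidates still
    to come; the optimal rule accepts the current score s iff s beats the value
    of continuing. `sample_scores(m)` draws m i.i.d. scores from the (known)
    distribution and is used to estimate the expectations.
    """
    V = np.zeros(n_candidates + 1)        # V[0] = 0: no candidates left
    draws = sample_scores(n_mc)
    for t in range(1, n_candidates + 1):
        # Value of one more interview: accept if the score beats V[t-1].
        V[t] = np.mean(np.maximum(draws, V[t - 1]))
    # Threshold at a step with t candidates remaining is V[t-1] (value of continuing).
    return V[:-1][::-1]                    # thresholds for steps 1..n (first..last)

# Example with Uniform(0, 1) scores: thresholds decrease as fewer candidates remain.
thresholds = acceptance_thresholds(5, lambda m: np.random.rand(m))
print(np.round(thresholds, 3))
```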
{"title":"Optimal Multiple Stopping Rule for Warm-Starting Sequential Selection","authors":"Mathilde Fekom, N. Vayatis, Argyris Kalogeratos","doi":"10.1109/ICTAI.2019.00202","DOIUrl":"https://doi.org/10.1109/ICTAI.2019.00202","url":null,"abstract":"In this paper we present the Warm-starting Dynamic Thresholding algorithm, developed using dynamic programming, for a variant of the standard online selection problem. The problem allows job positions to be either free or already occupied at the beginning of the process. Throughout the selection process, the decision maker interviews one after the other the new candidates and reveals a quality score for each of them. Based on that information, she can (re) assign each job at most once by taking immediate and irrevocable decisions. We relax the hard requirement of the class of dynamic programming algorithms to perfectly know the distribution from which the scores of candidates are drawn, by presenting extensions for the partial and no-information cases, in which the decision maker can learn the underlying score distribution sequentially while interviewing candidates.","PeriodicalId":346657,"journal":{"name":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114281590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automated Mechanism Design: Compact and Decomposition Linear Programming Models
Pub Date: 2019-11-01 | DOI: 10.1109/ICTAI.2019.00031
B. Jaumard, Kia Babashahi Ashtiani, Nicolas Huin
In the context of multi-agent systems, Automated Mechanism Design (AMD) is the computer-based design of the rules of a mechanism that reaches an equilibrium even though agents can be selfish and lie about their preferences. Although it has been shown that AMD can be modelled as a linear program, that formulation has an exponential number of variables and, consequently, no efficient algorithm for it is known. We revisit the linear programming model previously proposed for the AMD problem and introduce a new one with a polynomial number of variables. We show that the exponential-size model corresponds to a Dantzig-Wolfe decomposition of the new compact one, and we design efficient polynomial-time solution schemes for both models. Numerical experiments compare the solution efficiency of the two models and show that we can solve significantly larger instances than before, up to 2,000 agents or 2,000 resources in about 35 seconds.
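As background for the decomposition model, the skeleton below shows the generic column-generation loop with which Dantzig-Wolfe reformulations are typically solved. The `pricing_oracle` callback is a placeholder for the problem-specific subproblem (for AMD it would be derived from the compact model); the SciPy solver, sign conventions, and initial feasible columns are assumptions of this sketch.

```python
import numpy as np
from scipy.optimize import linprog

def column_generation(b, init_columns, init_costs, pricing_oracle, max_iter=50):
    """Generic column-generation loop for  min c^T x  s.t.  A x = b, x >= 0,
    where the columns of A are generated lazily.

    `pricing_oracle(duals)` must return (column, cost) with negative reduced
    cost, or None when no improving column exists; `init_columns` must make
    the restricted master feasible. Both are problem-specific assumptions.
    """
    columns = [np.asarray(c, float) for c in init_columns]
    costs = list(init_costs)
    res = None
    for _ in range(max_iter):
        res = linprog(costs, A_eq=np.column_stack(columns), b_eq=b,
                      bounds=(0, None), method="highs")
        duals = res.eqlin.marginals          # dual prices of the equality rows
        priced = pricing_oracle(duals)
        if priced is None:                   # no negative reduced cost: optimal
            break
        col, cost = priced
        columns.append(np.asarray(col, float))
        costs.append(cost)
    return res
```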
{"title":"Automated Mechanism Design: Compact and Decomposition Linear Programming Models","authors":"B. Jaumard, Kia Babashahi Ashtiani, Nicolas Huin","doi":"10.1109/ICTAI.2019.00031","DOIUrl":"https://doi.org/10.1109/ICTAI.2019.00031","url":null,"abstract":"In the context of multi-agent systems, Automated Mechanism Design (AMD) is the computer-based design of the rules of a mechanism, which reaches an equilibrium despite the fact that agents can be selfish and lie about their preferences. Although it has been shown that AMD can be modelled as a linear program, it is with an exponential number of variables and consequently, there is no known efficient algorithm. We revisit the latter linear program model proposed for the AMD problem and introduce a new one with a polynomial number of variables. We show that the latter model corresponds to a Dantzig-Wolfe decomposition of the second one and design efficient solution schemes in polynomial time for both two models. Numerical experiments compare the solution efficiency of both models and show that we can solve very significantly larger data instances than before, up to 2,000 agents or 2,000 resources in about 35 seconds.","PeriodicalId":346657,"journal":{"name":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114854129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Exploring Numerical Calculations with CalcNet
Pub Date: 2019-11-01 | DOI: 10.1109/ICTAI.2019.00192
Ashish Rana, A. Malhi, Kary Främling
Neural networks are not great generalizers outside their training range, i.e., they are good at capturing bias but might miss the overall concept. An important issue with neural networks is that when test data goes outside the training range, they fail to predict accurate results and thus lose the ability to generalize a concept. For systematic numeric exploration, the neural accumulator (NAC) and the neural arithmetic logic unit (NALU) have been proposed, and they perform excellently on simple arithmetic operations. However, a major limitation of these units is that they cannot handle complex compositional mathematical operations and equations. For example, NALU can predict accurate results for multiplication but not for the factorial function, which is essentially a composition of multiplications; it is unable to grasp the pattern behind an expression when compositions of operations are involved. We therefore propose a new neural network structure that takes in complex compositional mathematical expressions and produces the best possible results, using small NALU-based neural networks as pluggable modules that evaluate the expression at the unit level in a bottom-up manner. We call this network CalcNet, as it helps in predicting accurate calculations for complex numerical expressions, even for values that are out of the training range. As part of our study, we applied this network to numerically approximating complex equations, evaluated biquadratic equations, and tested the reusability of the modules. We obtained far better generalization on complex arithmetic extrapolation tasks compared to both NALU-layer-only neural networks and simple feed-forward neural networks. We also achieved even better results with our golden-ratio-based modified NAC and NALU structures on both interpolation and extrapolation tasks in all evaluation experiments. Finally, from a reusability standpoint, the model demonstrates strong invariance when making predictions on different tasks.
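For reference, here is a compact PyTorch rendition of the NAC and NALU cells (following Trask et al.'s formulation) that a CalcNet-style composition would plug together; the initialization scale and epsilon are illustrative choices, and the golden-ratio modification mentioned above is not reproduced here.

```python
import torch
import torch.nn as nn

class NAC(nn.Module):
    """Neural accumulator: weights constrained towards {-1, 0, 1} so the layer
    learns additions/subtractions that extrapolate outside the training range."""
    def __init__(self, n_in, n_out):
        super().__init__()
        self.W_hat = nn.Parameter(torch.randn(n_out, n_in) * 0.1)
        self.M_hat = nn.Parameter(torch.randn(n_out, n_in) * 0.1)

    def forward(self, x):
        W = torch.tanh(self.W_hat) * torch.sigmoid(self.M_hat)
        return x @ W.t()

class NALU(nn.Module):
    """Neural arithmetic logic unit: gates between an additive NAC path and a
    multiplicative path that runs the NAC in log-space."""
    def __init__(self, n_in, n_out, eps=1e-8):
        super().__init__()
        self.nac = NAC(n_in, n_out)
        self.G = nn.Parameter(torch.randn(n_out, n_in) * 0.1)
        self.eps = eps

    def forward(self, x):
        a = self.nac(x)                                               # add / subtract path
        m = torch.exp(self.nac(torch.log(torch.abs(x) + self.eps)))  # multiply / divide path
        g = torch.sigmoid(x @ self.G.t())                             # learned gate
        return g * a + (1 - g) * m
```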
{"title":"Exploring Numerical Calculations with CalcNet","authors":"Ashish Rana, A. Malhi, Kary Främling","doi":"10.1109/ICTAI.2019.00192","DOIUrl":"https://doi.org/10.1109/ICTAI.2019.00192","url":null,"abstract":"Neural networks are not great generalizers outside their training range i.e. they are good at capturing bias but might miss the overall concept. Important issues with neural networks is that when testing data goes outside training range they fail to predict accurate results. Hence, they loose the ability to generalize a concept. For systematic numeric exploration neural accumulators (NAC) and neural arithmetic logic unit(NALU) are proposed which performs excellent for simple arithmetic operations. But, major limitation with these units is that they can't handle complex mathematical operations & equations. For example, NALU can predict accurate results for multiplication operation but not for factorial function which is essentially composition of multiplication operations only. It is unable to comprehend pattern behind an expression when composition of operations are involved. Hence, we propose a new neural network structure effectively which takes in complex compositional mathematical operations and produces best possible results with small NALU based neural networks as its pluggable modules which evaluates these expression at unitary level in a bottom-up manner. We call this effective neural network as CalcNet, as it helps in predicting accurate calculations for complex numerical expressions even for values that are out of training range. As part of our study we applied this network on numerically approximating complex equations, evaluating biquadratic equations and tested reusability of these modules. We arrived at far better generalizations for complex arithmetic extrapolation tasks as compare to both only NALU layer based neural networks and simple feed forward neural networks. Also, we achieved even better results for our golden ratio based modified NAC and NALU structures for both interpolating and extrapolating tasks in all evaluation experiments. Finally, from reusability standpoint this model demonstrate strong invariance for making predictions on different tasks.","PeriodicalId":346657,"journal":{"name":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117279029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optimizing Training using Information Theory-Based Curriculum Learning Factory
Pub Date: 2019-11-01 | DOI: 10.1109/ICTAI.2019.00218
Henok Ghebrechristos, G. Alaghband
We present a new system that can automatically generate input paths (a syllabus) for a convolutional neural network to follow, through curriculum learning, to improve training performance. Our system uses information-theoretic content measures of the training samples to form the syllabus at training time. We treat every sample as a 2D random variable in which each data point contained in the sample (such as a pixel) is modelled as an independent and identically distributed (i.i.d.) realization. We use several information-theory methods to rank samples and determine when a sample is fed to the network, by measuring its pixel composition and its relationship to other samples in the training set. Comparative evaluation of multiple state-of-the-art networks, including GoogLeNet and VGG, on benchmark datasets demonstrates that a syllabus ranking samples by measures such as the joint entropy between adjacent samples can improve learning and significantly reduce the number of training steps required to reach a desirable training accuracy. We present results indicating that our approach can reduce training loss by as much as a factor of 9 compared to conventional training.
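A minimal sketch of the kind of measure described: the joint entropy of two grayscale images estimated from their joint pixel-intensity histogram, used to order training samples into a syllabus. The binning, the pairing of adjacent samples, and the ascending sort direction are assumptions for illustration only.

```python
import numpy as np

def joint_entropy(img_a, img_b, bins=32):
    """Joint entropy (in bits) of two equally sized grayscale images, estimated
    from their joint pixel-intensity histogram."""
    hist, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(),
                                bins=bins, range=[[0, 256], [0, 256]])
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def build_syllabus(images):
    """Score each sample by its joint entropy with the preceding sample in the
    current order and return indices sorted by that score (one possible syllabus)."""
    scores = [joint_entropy(images[i - 1], images[i]) for i in range(1, len(images))]
    order = np.argsort(scores)              # ascending: low joint entropy first
    return [0] + [i + 1 for i in order]     # keep the first sample as the anchor

# Usage: `images` is a list of uint8 arrays of identical shape.
```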
{"title":"Optimizing Training using Information Theory-Based Curriculum Learning Factory","authors":"Henok Ghebrechristos, G. Alaghband","doi":"10.1109/ICTAI.2019.00218","DOIUrl":"https://doi.org/10.1109/ICTAI.2019.00218","url":null,"abstract":"We present a new system that can automatically generate input paths (syllabus) for a convolutional neural network to follow through a curriculum learning to improve training performance. Our system utilizes information-theoretic content measures of training samples to form syllabus at training time. We treat every sample as 2D random variable where a data point contained in the sample (such as a pixel) is modelled as an independent and identically distributed random variable (i.i.d) realization. We use several information theory methods to rank and determine when a sample is fed to a network by measuring its pixel composition and its relationship to other samples in the training set. Comparative evaluation of multiple state-of-the-art networks, including, GoogleNet, and VGG, on benchmark datasets demonstrate a syllabus that ranks samples using measures such as Joint Entropy between adjacent samples, can improve learning and significantly reduce the amount of training steps required to achieve desirable training accuracy. We present results that indicate our approach can reduce training loss by as much as a factor of 9 compared to conventional training.","PeriodicalId":346657,"journal":{"name":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115389060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Evaluating Different Metric Configurations of an Evolutionary Wrapper for Attack Detection
Pub Date: 2019-11-01 | DOI: 10.1109/ICTAI.2019.00187
Javier Maldonado, M. Riff
Detecting various types of attacks is a major problem in cybersecurity. In this paper, we show different configurations of an evolutionary wrapper algorithm for selecting features to classify attacks using a decision tree. We use two metrics for the evaluation function and evolutionary operator acceptance criteria. As part of our experiments, we interchange them and test the effect on the classification quality. Results show that the algorithm is able to guide the classification to accomplish different goals.
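To make the wrapper setup concrete, here is a minimal evolutionary feature-selection loop around a decision tree, using cross-validated accuracy as the single evaluation and acceptance metric; the population size, mutation rate, and scikit-learn components are illustrative assumptions rather than the configurations compared in the paper.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def fitness(mask, X, y):
    """Evaluate a feature subset (boolean mask) by cross-validated accuracy of a
    decision tree; this is the metric a wrapper configuration would swap out."""
    if not mask.any():
        return 0.0
    clf = DecisionTreeClassifier(random_state=0)
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

def evolve_features(X, y, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Simple evolutionary loop: bit-flip mutation, keep the better of parent/child."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    pop = rng.random((pop_size, n)) < 0.5
    scores = np.array([fitness(m, X, y) for m in pop])
    for _ in range(generations):
        for i in range(pop_size):
            child = pop[i] ^ (rng.random(n) < p_mut)    # bit-flip mutation
            s = fitness(child, X, y)
            if s >= scores[i]:                           # acceptance criterion
                pop[i], scores[i] = child, s
    best = scores.argmax()
    return pop[best], scores[best]
```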
{"title":"Evaluating Different Metric Configurations of an Evolutionary Wrapper for Attack Detection","authors":"Javier Maldonado, M. Riff","doi":"10.1109/ICTAI.2019.00187","DOIUrl":"https://doi.org/10.1109/ICTAI.2019.00187","url":null,"abstract":"Detecting various types of attacks is a major problem in cybersecurity. In this paper, we show different configurations of an evolutionary wrapper algorithm for selecting features to classify attacks using a decision tree. We use two metrics for the evaluation function and evolutionary operator acceptance criteria. As part of our experiments, we interchange them and test the effect on the classification quality. Results show that the algorithm is able to guide the classification to accomplish different goals.","PeriodicalId":346657,"journal":{"name":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123168141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Benchmarking Symbolic Execution Using Constraint Problems - Initial Results
Pub Date: 2019-11-01 | DOI: 10.1109/ICTAI.2019.00010
Sahil Verma, R. Yap
Symbolic execution is a powerful technique for bug finding and program testing, and it has been successful in finding bugs in real-world code. Its core reasoning techniques are constraint solving, path exploration, and search, which are also the techniques used for solving combinatorial problems such as finite-domain constraint satisfaction problems (CSPs). We propose CSP instances as more challenging benchmarks for evaluating the effectiveness of the core techniques in symbolic execution. We transform CSP benchmarks into C programs suitable for testing the reasoning capabilities of symbolic execution tools; from a single CSP P, we generate different C programs depending on the transformation choice. Preliminary testing with the KLEE, Tracer-X, and LLBMC tools shows substantial runtime differences across transformation and solver choices. Our C benchmarks are effective in exposing the limitations of existing symbolic execution tools. The motivation for this work is our belief that benchmarks of this form can spur the development and engineering of improved core reasoning in symbolic execution engines.
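As a sketch of one possible transformation, the helper below emits a C program from a small binary CSP in which every variable becomes a symbolic integer, domains are enforced with klee_assume, and an assert fails exactly when all constraints are satisfied, so any "bug" a symbolic executor reports corresponds to a CSP solution. The encodings and options actually used in the paper differ and are not reproduced here.

```python
def csp_to_c(domains, neq_constraints):
    """Emit a KLEE-ready C program for a CSP with integer domains and binary
    not-equal constraints (graph-colouring style instances).

    domains: {"x": (lo, hi), ...}; neq_constraints: [("x", "y"), ...].
    This is one encoding among several possible; it is illustrative only.
    """
    lines = ['#include <klee/klee.h>', '#include <assert.h>', '',
             'int main(void) {']
    for v, (lo, hi) in domains.items():
        lines += [f'  int {v};',
                  f'  klee_make_symbolic(&{v}, sizeof({v}), "{v}");',
                  f'  klee_assume({v} >= {lo});',
                  f'  klee_assume({v} <= {hi});']
    cond = ' && '.join(f'{a} != {b}' for a, b in neq_constraints) or '1'
    lines += [f'  if ({cond}) assert(0);  /* reached iff all constraints hold */',
              '  return 0;', '}']
    return '\n'.join(lines)

# Tiny 3-colouring-style example with three mutually different variables.
print(csp_to_c({"x": (0, 2), "y": (0, 2), "z": (0, 2)},
               [("x", "y"), ("y", "z"), ("x", "z")]))
```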
{"title":"Benchmarking Symbolic Execution Using Constraint Problems - Initial Results","authors":"Sahil Verma, R. Yap","doi":"10.1109/ICTAI.2019.00010","DOIUrl":"https://doi.org/10.1109/ICTAI.2019.00010","url":null,"abstract":"Symbolic execution is a powerful technique for bug finding and program testing. It is successful in finding bugs in real-world code. The core reasoning techniques use constraint solving, path exploration, and search, which are also the same techniques used in solving combinatorial problems, e.g., finite-domain constraint satisfaction problems (CSPs). We propose CSP instances as more challenging benchmarks to evaluate the effectiveness of the core techniques in symbolic execution. We transform CSP benchmarks into C programs suitable for testing the reasoning capabilities of symbolic execution tools. From a single CSP P, we transform P depending on transformation choice into different C programs. Preliminary testing with the KLEE, Tracer-X, and LLBMC tools show substantial runtime differences from transformation and solver choice. Our C benchmarks are effective in showing the limitations of existing symbolic execution tools. The motivation for this work is we believe that benchmarks of this form can spur the development and engineering of improved core reasoning in symbolic execution engines.","PeriodicalId":346657,"journal":{"name":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123310765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adversarial Attack Against DoS Intrusion Detection: An Improved Boundary-Based Method
Pub Date: 2019-11-01 | DOI: 10.1109/ICTAI.2019.00179
Xiao Peng, Wei-qing Huang, Zhixin Shi
Denial of Service (DoS) attacks pose serious threats to network security. With the rapid development of machine learning technologies, artificial neural networks (ANNs) have been used to classify DoS attacks. However, ANN models are vulnerable to adversarial samples: inputs that are specially crafted to yield incorrect outputs. In this work, we explore a class of adversarial DoS attacks that aim to bypass ANN-based DoS intrusion detection systems. By analyzing the features of DoS samples, we propose an improved boundary-based method for crafting adversarial DoS samples. The key idea is to optimize a Mahalanobis distance by perturbing the continuous and discrete features of DoS samples respectively. We experimentally study the effectiveness of our method on two trained ANN classifiers using the KDDcup99 and CICIDS2017 datasets. Results show that our method can craft adversarial DoS samples with a limited number of queries.
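For intuition, here is a toy version of the kind of boundary search described: perturb only the designated continuous features of a DoS sample in the direction that decreases its Mahalanobis distance to the benign-traffic distribution, stopping as soon as the target classifier flips its decision. The feature split, step size, and query budget are illustrative assumptions, not the paper's exact procedure (which also handles discrete features).

```python
import numpy as np

def craft_adversarial_dos(x, benign_X, continuous_idx, classifier,
                          step=0.05, max_queries=100):
    """Greedy boundary search on continuous features only.

    Moves x towards the benign distribution along the negative gradient of the
    squared Mahalanobis distance, querying `classifier` (returns 1 for 'attack')
    after each step; discrete features are left untouched in this toy version.
    """
    mu = benign_X.mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(benign_X, rowvar=False))
    x_adv = x.astype(float).copy()
    for _ in range(max_queries):
        if classifier(x_adv) == 0:            # classified benign: attack succeeded
            return x_adv
        grad = cov_inv @ (x_adv - mu)          # gradient of 0.5 * Mahalanobis^2
        x_adv[continuous_idx] -= step * grad[continuous_idx]
    return None                                # query budget exhausted
```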
{"title":"Adversarial Attack Against DoS Intrusion Detection: An Improved Boundary-Based Method","authors":"Xiao Peng, Wei-qing Huang, Zhixin Shi","doi":"10.1109/ICTAI.2019.00179","DOIUrl":"https://doi.org/10.1109/ICTAI.2019.00179","url":null,"abstract":"Denial of Service (DoS) attacks pose serious threats to network security. With the rapid development of machine learning technologies, artificial neural network (ANN) has been used to classify DoS attacks. However, ANN models are vulnerable to adversarial samples: inputs that are specially crafted to yield incorrect outputs. In this work, we explore a kind of DoS adversarial attacks which aim to bypass ANN-based DoS intrusion detection systems. By analyzing features of DoS samples, we propose an improved boundary-based method to craft adversarial DoS samples. The key idea is to optimize a Mahalanobis distance by perturbing continuous features and discrete features of DoS samples respectively. We experimentally study the effectiveness of our method in two trained ANN classifiers on KDDcup99 dataset and CICIDS2017 dataset. Results show that our method can craft adversarial DoS samples with limited queries.","PeriodicalId":346657,"journal":{"name":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123420906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ECPNet: An Efficient Attention-Based Convolution Network with Pseudo-3D Block for Human Action Recognition
Pub Date: 2019-11-01 | DOI: 10.1109/ICTAI.2019.00089
Xiuping Bao, Jiabin Yuan, Bei Chen
Human action recognition has become an important task in computer vision and has received a significant amount of research interest in recent years. Convolutional Neural Networks (CNNs) have shown their power in image recognition, but video recognition remains a challenging problem. In this paper, we introduce a highly efficient attention-based convolutional network named ECPNet for video understanding. ECPNet adopts a framework that is a consecutive connection of a 2D CNN and a pseudo-3D CNN; pseudo-3D means that we replace the traditional 3 × 3 × 3 kernel with two 3D convolutional filters shaped 1 × 3 × 3 and 3 × 1 × 1. ECPNet combines the advantages of both 2D and 3D CNNs: (1) it is an end-to-end network that can learn appearance information from images and motion information between frames; (2) it requires fewer computing resources and less memory than many state-of-the-art models; (3) it is easy to scale to different requirements on runtime and classification accuracy. We evaluate the proposed model on three popular video benchmarks for human action recognition: Kinetics-mini (a split of the full Kinetics), UCF101, and HMDB51. ECPNet achieves excellent performance on these datasets at a lower time cost.
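To illustrate the factorization, here is a minimal PyTorch pseudo-3D block that replaces a 3 × 3 × 3 convolution with a 1 × 3 × 3 spatial convolution followed by a 3 × 1 × 1 temporal one; the channel counts, normalization, and how such blocks connect to the 2D backbone and attention modules are assumptions of this sketch, not the ECPNet architecture itself.

```python
import torch
import torch.nn as nn

class Pseudo3DBlock(nn.Module):
    """Factorized 3D convolution: a 1x3x3 spatial conv followed by a 3x1x1
    temporal conv, a common replacement for a full 3x3x3 kernel."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.spatial = nn.Conv3d(in_ch, out_ch, kernel_size=(1, 3, 3),
                                 padding=(0, 1, 1), bias=False)
        self.temporal = nn.Conv3d(out_ch, out_ch, kernel_size=(3, 1, 1),
                                  padding=(1, 0, 0), bias=False)
        self.bn = nn.BatchNorm3d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):                      # x: (batch, channels, time, H, W)
        return self.act(self.bn(self.temporal(self.spatial(x))))

# Sanity check on a clip of 8 frames at 56x56 with 64 channels.
clip = torch.randn(2, 64, 8, 56, 56)
print(Pseudo3DBlock(64, 128)(clip).shape)      # -> torch.Size([2, 128, 8, 56, 56])
```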
{"title":"ECPNet: An Efficient Attention-Based Convolution Network with Pseudo-3D Block for Human Action Recognition","authors":"Xiuping Bao, Jiabin Yuan, Bei Chen","doi":"10.1109/ICTAI.2019.00089","DOIUrl":"https://doi.org/10.1109/ICTAI.2019.00089","url":null,"abstract":"Human action recognition has became an important task in computer vision and has received a significant amount of research interests in recent years. Convolutional Neural Network (CNN) has shown its power in image recognition task. While in the field of video recognition, it is still a challenge problem. In this paper, we introduce a high-efficient attention-based convolutional network named ECPNet for video understanding. ECPNet adopts the framework that is a consecutive connection of 2D CNN and pseudo-3D CNN. The pseudo-3D means we replace the traditional 3 × 3 × 3 kernel with two 3D convolutional filters shaped 1 × 3 × 3 and 3 × 1 × 1. Our ECPNet combines the advantages of both 2D and 3D CNNs: (1) ECPNet is an end-to-end network and can learn information of appearance from images and motion between frames. (2) ECPNet requires less computing resource and lower memory consumption than many state-of-art models. (3) ECPNet is easy to expand for different demands of runtime and classification accuracy. We evaluate the proposed model on three popular video benchmarks in human action recognition task: Kinetics-mini (split of full Kinetics), UCF101 and HMDB51. Our ECPNet achieves the excellent performance on above datasets with less time cost.","PeriodicalId":346657,"journal":{"name":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121404601","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}