
2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA): Latest Papers

Neural Network Conditional Random Fields for Self-Paced Brain Computer Interfaces
H. Bashashati, R. Ward, A. Bashashati, Amr M. Mohamed
Classifying EEG signals for self-paced Brain Computer Interface (BCI) applications is extremely challenging. The difficulty stems from the fact that the system has no information about the start time of a control task, and the data contain many periods during which the user has no intention of controlling the BCI. To improve BCI performance, it is therefore imperative to exploit the characteristics of the EEG data as fully as possible. In motor imagery based self-paced BCIs, the EEG signal of each subject goes through several internal state changes during a motor imagery task. Classifiers that exploit the temporal correlation in EEG data can therefore enhance the performance of the BCI. In this paper, we propose an algorithm that captures the temporal correlation of the EEG signal. We compare our algorithm, which is based on neural network conditional random fields, with two well-known dynamic classifiers, Hidden Markov Models and Conditional Random Fields, and with a static classifier, Support Vector Machines. Using data from the SM2 dataset, we show that our algorithm yields considerably better results than the other approaches in terms of the Area Under the Curve (AUC) of the BCI system.
DOI: https://doi.org/10.1109/ICMLA.2016.0169 (published 2016-12-01)
Citations: 1
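As a rough illustration of the AUC criterion named in the abstract above, the following Python sketch computes the area under the ROC curve directly from classifier scores. It assumes binary labels (1 for an intended control window, 0 for a no-control window); the function name `auc` and the sample scores are hypothetical and not taken from the paper.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative one
    (ties count as half).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for four EEG windows (labels: 0 = no control, 1 = control).
print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # prints 0.75
```

The pairwise form is quadratic in the number of samples, which is acceptable for a sketch; a sort-based rank formulation would scale better on long recordings.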
Faster Gated Recurrent Units via Conditional Computation
Andrew S. Davis, I. Arel
In this work, we apply the idea of conditional computation to the gated recurrent unit (GRU), a type of recurrent activation function. With slight modifications to the GRU, the number of floating point operations required to calculate the feed-forward pass through the network may be significantly reduced. This allows for more rapid computation, enabling a trade-off between model accuracy and model speed. Such a trade-off may be useful in a scenario where real-time performance is required, allowing for powerful recurrent models to be deployed on compute-limited devices.
DOI: https://doi.org/10.1109/ICMLA.2016.0165 (published 2016-12-01)
Citations: 3
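The abstract above does not say which modification of the GRU is used, so the following Python sketch only illustrates the general idea of conditional computation in a GRU step. It assumes, purely for illustration, that a unit whose update gate falls below a threshold keeps its previous state, so the candidate activation for that unit is never evaluated. The gating rule, the `skip_below` parameter, and the toy weights are all hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def dot(row, vec):
    return sum(a * b for a, b in zip(row, vec))


def gru_step(x, h, params, skip_below=0.0):
    """One GRU step; returns (new_h, number_of_skipped_units).

    Convention: h_new = (1 - z) * h + z * h_candidate.
    ASSUMPTION (illustrative gating rule, not the paper's): a unit whose
    update gate z is below `skip_below` keeps its old state, so its
    candidate row is never evaluated.
    """
    Wz, Uz, Wr, Ur, Wh, Uh = params
    n = len(h)
    z = [sigmoid(dot(Wz[i], x) + dot(Uz[i], h)) for i in range(n)]
    r = [sigmoid(dot(Wr[i], x) + dot(Ur[i], h)) for i in range(n)]
    gated_h = [r[i] * h[i] for i in range(n)]
    new_h, skipped = [], 0
    for i in range(n):
        if z[i] < skip_below:
            new_h.append(h[i])  # skip the candidate computation for unit i
            skipped += 1
        else:
            cand = math.tanh(dot(Wh[i], x) + dot(Uh[i], gated_h))
            new_h.append((1.0 - z[i]) * h[i] + z[i] * cand)
    return new_h, skipped


# Toy weights for a 2-unit GRU with a 2-dimensional input (hypothetical values).
params = (
    [[0.5, -0.2], [0.1, 0.3]], [[0.2, 0.0], [0.0, 0.2]],   # Wz, Uz
    [[0.3, 0.1], [-0.4, 0.2]], [[0.1, 0.1], [0.0, -0.1]],  # Wr, Ur
    [[0.7, -0.5], [0.2, 0.6]], [[0.3, -0.2], [0.1, 0.4]],  # Wh, Uh
)
full, _ = gru_step([1.0, -1.0], [0.2, -0.1], params)
fast, skipped = gru_step([1.0, -1.0], [0.2, -0.1], params, skip_below=0.6)
print(full, fast, skipped)
```

Raising `skip_below` skips more candidate rows and so saves more multiply-adds, at the price of a larger deviation from the full update. This is the accuracy versus speed trade-off the abstract refers to, shown here under the stated assumption only.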
L1-Norm Principal-Component Analysis via Bit Flipping
Panos P. Markopoulos, S. Kundu, Shubham Chamadia, D. Pados
The $K$ L1-norm Principal Components (L1-PCs) of a data matrix $X \in \mathbb{R}^{D \times N}$ can be found optimally with cost $O(2^{NK})$ in the general case, and $O(N^{\mathrm{rank}(X)K - K + 1})$ when $\mathrm{rank}(X)$ is a constant with respect to $N$ [1],[2]. Certainly, in real-world applications where $N$ is large, even the latter polynomial cost is prohibitive. In this work, we present L1-BF: a novel, near-optimal algorithm that calculates the $K$ L1-PCs of $X$ with cost $O(ND\min\{N, D\} + N^2(K^4 + DK^2) + DNK^3)$, comparable to that of standard (L2-norm) Principal-Component Analysis. Our numerical studies illustrate that the proposed algorithm attains optimality with very high frequency while, at the same time, it outperforms on the L1-PCA metric any counterpart of comparable computational cost. The outlier resistance of the L1-PCs calculated by L1-BF is documented with experiments on dimensionality reduction and genomic data classification for disease diagnosis.
DOI: https://doi.org/10.1109/ICMLA.2016.0060 (published 2016-12-01)
Citations: 16
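To make the bit-flipping idea in the abstract above concrete, here is a minimal Python sketch for the single-component case ($K = 1$) only. It relies on the known equivalence $\max_{\|q\|_2 = 1} \|X^\top q\|_1 = \max_{b \in \{\pm 1\}^N} \|Xb\|_2$ and greedily flips one bit of $b$ at a time while $\|Xb\|_2$ grows. The all-ones starting vector, the function names, and the toy data are assumptions for illustration, and this is not the authors' full L1-BF procedure for general $K$.

```python
import math


def l1_pc_bit_flipping(points):
    """Approximate the first L1-norm principal component of `points`.

    `points` is a list of N data vectors of dimension D (the columns of X).
    Greedily flips single bits of b in {-1, +1}^N while ||X b||_2 increases.
    ASSUMPTION: b starts as all +1; other initialisations are possible.
    """
    n, d = len(points), len(points[0])
    b = [1] * n
    v = [sum(p[k] for p in points) for k in range(d)]  # v = X b
    sq_norms = [sum(c * c for c in p) for p in points]
    while True:
        # Flipping bit i changes ||X b||^2 by 4 * (||x_i||^2 - b_i * x_i . v).
        gains = [
            4.0 * (sq_norms[i] - b[i] * sum(points[i][k] * v[k] for k in range(d)))
            for i in range(n)
        ]
        best = max(range(n), key=gains.__getitem__)
        if gains[best] <= 1e-12:
            break  # no single flip improves the objective
        for k in range(d):
            v[k] -= 2.0 * b[best] * points[best][k]
        b[best] = -b[best]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v], b


def l1_metric(points, q):
    """The L1-PCA objective ||X^T q||_1."""
    return sum(abs(sum(p[k] * q[k] for k in range(len(q)))) for p in points)


# Toy data set: N = 4 points in D = 2 dimensions (hypothetical values).
data = [[2.0, 0.1], [1.5, -0.2], [-1.8, 0.3], [0.1, 6.0]]
q, b = l1_pc_bit_flipping(data)
print(q, b, l1_metric(data, q))
```

Each pass costs $O(ND)$ because the product $Xb$ is updated incrementally rather than recomputed. The loop stops at a point where no single flip helps, which is a local optimum and need not be the global one.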
Correlating Filter Diversity with Convolutional Neural Network Accuracy
Casey A. Graff, Jeffrey S. Ellen
This paper describes three metrics used to assess the filter diversity learned by convolutional neural networks during supervised classification. As our testbed we use four different data sets, including two subsets of ImageNet and two planktonic data sets collected by scientific instruments. We investigate the correlation between our devised metrics and accuracy, using normalization and regularization to alter filter diversity. We propose that these metrics could be used to improve the training of CNNs. Three potential applications are determining the best preprocessing method for non-standard data sets, diagnosing training efficacy, and predicting performance in cases where validation data is expensive or impossible to collect.
DOI: https://doi.org/10.1109/ICMLA.2016.0021 (published 2016-12-01)
Citations: 6
Feature Fusion for Denoising and Sparse Autoencoders: Application to Neuroimaging Data
Arezou Moussavi Khalkhali, M. Jamshidi, Subhashie Wijemanne
Although there is no cure to date, detecting Alzheimer's disease in its early stages has a significant impact on the patient's life: it affects cost and disease progression, helps the patient plan in advance for appropriate healthcare, and provides clinical etiologies for further research. This paper discusses a feature fusion method that uses sparse and denoising autoencoders to reveal the stage of Alzheimer's disease. Four cohorts, consisting of individuals with Alzheimer's disease, late mild cognitive impairment, early mild cognitive impairment, and normal controls, are classified using multinomial logistic regression driven by the fusion of high-level and low-level features. The high-level features are extracted from the stacked autoencoders. The results show that feature fusion enhances the performance of typical autoencoders. Of the two variants, feature fusion using denoising autoencoders outperforms the sparse training of autoencoders in terms of overall accuracy, precision, and recall.
DOI: https://doi.org/10.1109/ICMLA.2016.0106 (published 2016-12-01)
Citations: 5
Semi-Supervised Learning with Bidirectional Adaptive Pairwise Encoding
Jiangbo Yuan, Jie Yu
In contrast to classic supervised learning methods that demand pre-defined class labels, pairwise encoding (or side-information encoding) requires only pairwise similarity information to drive feature learning, which makes it very appealing for many fundamental tasks such as dimensionality reduction and semi-supervised learning. In this paper, we present a novel bimarginal pairwise encoding model which, together with a deep autoencoder, learns nonlinear embeddings for the aforementioned tasks. The new method learns powerful features that preserve critical pairwise information in a semi-supervised manner. On MNIST, a well-known benchmark on which improvements are hard to achieve, it performs better than other methods in the same category, namely Autoencoder [4], Invariant Mapping for Dimensionality Reduction [1], Neighborhood Component Analysis [3], and Fixed Bi-Margin Pairwise Encoding [11].
DOI: https://doi.org/10.1109/ICMLA.2016.0119 (published 2016-12-01)
Citations: 3
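The abstract above does not state the loss function, so the following Python sketch shows only one common way in which pairwise similarity labels can drive feature learning with two margins. Similar pairs are penalised when their embeddings are farther apart than an inner margin, and dissimilar pairs when they are closer than an outer margin. The function name and both margin values are hypothetical, and the margins are fixed here, whereas the paper's title suggests adaptive ones.

```python
import math


def pairwise_margin_loss(z_a, z_b, similar, m_sim=0.5, m_dis=2.0):
    """Two-margin loss on the distance between two embeddings.

    Similar pairs are penalised only when farther apart than `m_sim`;
    dissimilar pairs only when closer than `m_dis`.
    ASSUMPTION: both margins are fixed, hypothetical constants.
    """
    dist = math.dist(z_a, z_b)
    if similar:
        return max(0.0, dist - m_sim) ** 2
    return max(0.0, m_dis - dist) ** 2


# A similar pair that is too far apart, and a dissimilar pair that is too close.
print(pairwise_margin_loss([0.0, 0.0], [3.0, 4.0], similar=True))
print(pairwise_margin_loss([0.0, 0.0], [0.0, 1.0], similar=False))
```

Pairs whose distance already lies on the correct side of the relevant margin contribute zero loss, so only violating pairs would produce gradients in a trained model.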
A Machine Learning Approach for Fault Detection in Vehicular Cyber-Physical Systems
A. Sargolzaei, C. Crane, Alireza Abbaspour, S. Noei
A network of vehicular cyber-physical systems (VCPSs) can use wireless communications to interact with each other and the surrounding environment to improve transportation safety, mobility, and sustainability. However, cloud-oriented architectures are vulnerable to cyber attacks, which may endanger passenger and pedestrian safety and privacy, and cause severe property damage. For instance, a hacker can use a message falsification attack to affect the functionality of a particular application in a platoon of VCPSs. In this paper, a neural network-based fault detection technique is applied to detect and track fault data injection attacks on the cooperative adaptive cruise control layer of a platoon of connected vehicles in real time. A decision support system was developed to reduce the probability and severity of any consequent accident. A case study with its design specifications is demonstrated in detail. The simulation results show that the proposed method can improve system reliability, robustness, and safety.
DOI: https://doi.org/10.1109/ICMLA.2016.0112 (published 2016-12-01)
Citations: 44
Semantic Clone Detection Using Machine Learning
Abdullah M. Sheneamer, J. Kalita
If two fragments of source code are identical to each other, they are called code clones. Code clones introduce difficulties in software maintenance and cause bug propagation. In this paper, we present a machine learning framework that automatically detects clones in software, including Type-3 clones and the most complicated kind, Type-4 clones. Traditional features used previously are often weak at detecting semantic clones. The novel aspects of our approach are the extraction of features from abstract syntax trees (AST) and program dependency graphs (PDG), the representation of a pair of code fragments as a vector, and the use of classification algorithms. The key benefit of this approach is that it can find both syntactic and semantic clones extremely well. Our evaluation indicates that using our new AST and PDG features is a viable methodology, since they improve clone detection on the IJaDataset 2.0.
DOI: https://doi.org/10.1109/ICMLA.2016.0185 (published 2016-12-01)
Citations: 48
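The step of representing a pair of code fragments as a vector, mentioned in the abstract above, can be sketched as follows. This is only an illustration. It uses Python's standard `ast` module as a stand-in for the Java tooling an IJaDataset study would need, it covers AST node counts only (no PDG features), and the choice of absolute count differences as the pair representation is an assumption.

```python
import ast
from collections import Counter


def ast_features(source):
    """Count AST node types in a code fragment (identifier names are ignored)."""
    return Counter(type(node).__name__ for node in ast.walk(ast.parse(source)))


def pair_vector(fragment_a, fragment_b):
    """Represent a pair of fragments as one vector of per-feature count differences."""
    fa, fb = ast_features(fragment_a), ast_features(fragment_b)
    vocabulary = sorted(set(fa) | set(fb))
    return vocabulary, [abs(fa[name] - fb[name]) for name in vocabulary]


# Two fragments that differ only in identifier names, and one unrelated fragment.
frag_a = "def f(a):\n    return a + 1\n"
frag_b = "def g(x):\n    return x + 1\n"
frag_c = "def h(x):\n    for i in range(x):\n        print(i)\n"

print(pair_vector(frag_a, frag_b))
print(pair_vector(frag_a, frag_c))
```

A classifier would then be trained on such pair vectors labelled as clone or non-clone. Renamed copies map to an all-zero vector here, while semantic (Type-4) clones would need richer features than node counts alone.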
Enhanced Approach to Detection of SQL Injection Attack
Raja Prasad Karuparthi, Bing Zhou
In recent years, many financial sectors have been evolving with huge numbers of web applications, which play a crucial role in helping organizations make important decisions. The data therefore has to be secured against attacks that could lead to huge losses. One of the foremost attacks on databases is the SQL injection attack, which injects a malicious query into the database and poses a serious threat. This paper proposes an enhanced approach to the dynamic query matching technique that adds a sanitizer for quick and easy detection of attacks.
DOI: https://doi.org/10.1109/ICMLA.2016.0082 (published 2016-12-01)
Citations: 14
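One simple reading of dynamic query matching, as named in the abstract above, is to compare the structure of the query built at run time with the structure of the query the developer intended, after literals have been masked. The Python sketch below follows that reading. The masking step stands in for the sanitizer, and the function names, regular expressions, and sample queries are assumptions rather than the paper's design.

```python
import re

# Single-quoted SQL string literal; a doubled quote ('') is an escaped quote.
_STRING = re.compile(r"'(?:[^']|'')*'")
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")


def skeleton(query):
    """Mask literals and normalise whitespace and case, keeping only the structure."""
    masked = _STRING.sub("?", query)
    masked = _NUMBER.sub("?", masked)
    return re.sub(r"\s+", " ", masked).strip().lower()


def is_injection(intended_query, runtime_query):
    """Flag a runtime query whose structure differs from the intended one."""
    return skeleton(intended_query) != skeleton(runtime_query)


intended = "SELECT * FROM users WHERE name = 'placeholder'"
benign = "SELECT * FROM users WHERE name = 'alice'"
attack = "SELECT * FROM users WHERE name = '' OR '1'='1'"

print(is_injection(intended, benign))  # prints False
print(is_injection(intended, attack))  # prints True
```

Structural comparison of this kind is a detection aid only. Parameterised queries remain the standard way to prevent injection in the first place.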
Identifying IT Purchases Anomalies in the Brazilian Government Procurement System Using Deep Learning
Silvio L. Domingos, Rommel N. Carvalho, Ricardo Silva Carvalho, G. N. Ramos
The Department of Research and Strategic Information (DIE), from the Brazilian Office of the Comptroller General (CGU), is responsible for investigating potential problems related to federal expenditures. To pursue this goal, DIE regularly has to analyze large volumes of data to search for anomalies that can reveal suspicious activities. With the growing demand from citizens for transparency and corruption prevention, DIE is constantly looking for new methods to automate these processes. In this work, we investigate anomalies in IT purchases in the Federal Government Procurement System by using a deep learning algorithm to generate a predictive model. This model will be used to prioritize actions carried out by the office in its pursuit of problems related to this kind of purchase. The data mining process followed the CRISP-DM methodology, and the modeling phase tested the parallel resources of the H2O tool. We evaluated the performance of twelve deep learning auto-encoder models, each generated under a different set of parameters, in order to find the best input data reconstruction model. The best model achieved a mean squared error (MSE) of 0.0012775 and was used to predict the anomalies over the test file samples.
DOI: https://doi.org/10.1109/ICMLA.2016.0129 (published 2016-12-01)
Citations: 16
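The way an autoencoder's reconstruction error can be turned into an anomaly ranking, as in the abstract above, is sketched below in plain Python. The sketch does not use the H2O tool. The `toy_reconstruct` function is a hypothetical stand-in for a trained autoencoder, and the records are invented values chosen only to make the arithmetic easy to follow.

```python
def reconstruction_mse(record, reconstruction):
    """Mean squared error between an input record and its autoencoder output."""
    return sum((a - b) ** 2 for a, b in zip(record, reconstruction)) / len(record)


def rank_anomalies(records, reconstruct, top_k=1):
    """Return indices of the `top_k` records with the largest reconstruction error."""
    errors = [reconstruction_mse(r, reconstruct(r)) for r in records]
    order = sorted(range(len(records)), key=lambda i: errors[i], reverse=True)
    return order[:top_k], errors


# Stand-in for a trained autoencoder (hypothetical): it reproduces typical
# records well by pulling every feature toward 0.5.
def toy_reconstruct(record):
    return [0.5 * value + 0.25 for value in record]


# Three normalised purchase records (hypothetical); the last one is unusual.
purchases = [[0.5, 0.5, 0.5], [0.4, 0.6, 0.5], [0.5, 0.5, 4.5]]
print(rank_anomalies(purchases, toy_reconstruct))
```

Records that the model reconstructs poorly receive the highest error and are listed first, which matches the prioritisation use described in the abstract.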