2021 13th International Conference on Machine Learning and Computing: Latest Articles

An Enhanced Adaptive Large Neighborhood Search Algorithm for the Capacitated Vehicle Routing Problem

2021 13th International Conference on Machine Learning and Computing

Pub Date : 2021-02-26 DOI: 10.1145/3457682.3457694

Haiping Zhang, Wang Yang

The Capacitated Vehicle Routing Problem (CVRP) is a representative variant of the Vehicle Routing Problem (VRP) and is NP-hard. As the problem scale grows, existing methods tend to get trapped in local optima and take too long to solve. To overcome these problems, we propose an Enhanced Adaptive Large Neighborhood Search algorithm (EALNS). EALNS adds a new linear removal strategy that selects several adjacent nodes on a route for removal, so that the vehicle can serve more customers. In the ALNS decision-making stage, an adaptive mechanism that accounts for the time factor is added, so that each strategy combination can adjust its weight according to the solution time. Experiments are performed on three internationally published benchmarks. The results show that EALNS is competitive and obtains satisfactory results on most instances. Compared with the best results collectively reported in the literature, EALNS improves average accuracy by 2.30% and significantly reduces the average solution time.
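The time-weighted adaptive mechanism described above can be sketched as follows. This is a minimal illustration, not the paper's formula: the abstract does not give the update rule, so the class `OperatorWeights`, the `reaction` parameter, the discount `score / (1 + elapsed_seconds)`, and every operator name other than linear removal are assumptions made for the example.

```python
import random


class OperatorWeights:
    """Adaptive weights for ALNS strategy combinations (hypothetical sketch)."""

    def __init__(self, operators, reaction=0.5):
        # Every strategy combination starts with the same weight.
        self.weights = {op: 1.0 for op in operators}
        # reaction in [0, 1]: how quickly weights follow recent performance.
        self.reaction = reaction

    def select(self, rng):
        """Roulette-wheel selection: probability proportional to weight."""
        ops = list(self.weights)
        return rng.choices(ops, weights=[self.weights[o] for o in ops], k=1)[0]

    def update(self, op, score, elapsed_seconds):
        """Blend the old weight with a time-discounted score.

        The discount is an assumed form: a combination that earns the same
        score in less time receives a larger weight.
        """
        efficiency = score / (1.0 + elapsed_seconds)
        r = self.reaction
        self.weights[op] = (1.0 - r) * self.weights[op] + r * efficiency


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical (removal, insertion) combinations.
    table = OperatorWeights([
        ("linear_removal", "greedy_insertion"),
        ("random_removal", "greedy_insertion"),
    ])
    chosen = table.select(rng)
    table.update(chosen, score=4.0, elapsed_seconds=1.0)
    print(chosen, table.weights[chosen])
```

Dividing the score by the elapsed time is only one way to weigh the time factor; any monotone discount would fit the description in the abstract equally well.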

Citations: 0

Semantic Auto-Encoder with L2-norm Constraint for Zero-Shot Learning

2021 13th International Conference on Machine Learning and Computing

Pub Date : 2021-02-26 DOI: 10.1145/3457682.3457699

Yuhao Wu, Weipeng Cao, Ye Liu, Zhong Ming, Jian-qiang Li, Bo Lu

Zero-Shot Learning (ZSL) is an effective paradigm for label prediction when some classes have no training samples. In recent years, many ZSL algorithms have been proposed. Among them, the semantic autoencoder (SAE) is widely used because of its simplicity and good generalization ability. However, our research found that most existing SAE-based methods use implicit constraints to guarantee the mapping quality between the feature space and the semantic space. In fact, implicit constraints are insufficient for minimizing the structural risk of the model and easily cause over-fitting. To solve this problem, we propose a novel SAE algorithm with an L2-norm constraint (SAE-L2). SAE-L2 adds an L2 regularization constraint on the mapping parameters to its optimization objective, which explicitly guarantees the structural risk minimization of the model. Extensive experiments on four benchmark datasets show that the proposed SAE-L2 achieves better performance than the original SAE model and other ZSL algorithms.
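The abstract does not state the objective itself. As a hedged sketch, assume the commonly used SAE formulation with a feature matrix $X \in \mathbb{R}^{d \times N}$, a semantic matrix $S \in \mathbb{R}^{k \times N}$, a projection $W \in \mathbb{R}^{k \times d}$, and a trade-off $\lambda$. Adding an L2 penalty with an assumed weight $\beta$ (a symbol introduced here, not taken from the paper) would then give:

```latex
\min_{W}\; \lVert X - W^{\top} S \rVert_F^2
         + \lambda \lVert W X - S \rVert_F^2
         + \beta \lVert W \rVert_F^2

% Setting the gradient with respect to W to zero:
%   2 (S S^{\top} W - S X^{\top}) + 2 \lambda (W X X^{\top} - S X^{\top}) + 2 \beta W = 0
% which rearranges to a Sylvester equation A W + W B = C with
A = S S^{\top} + \beta I, \qquad
B = \lambda X X^{\top}, \qquad
C = (1 + \lambda)\, S X^{\top}
```

Under these assumptions the penalty only shifts $A$ by $\beta I$, so the closed-form Sylvester solution of the unregularized model would carry over unchanged.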

Citations: 3

Towards Explainable Image Classifier: An Analogy to Multiple Choice Question Using Patch-level Similarity Measure

2021 13th International Conference on Machine Learning and Computing

Pub Date : 2021-02-26 DOI: 10.1145/3457682.3457730

Yian Seo, K. Shin

With increased interest in Explainable Artificial Intelligence (XAI), many studies seek ways to explain machine learning algorithms and their predictions. We propose the Multiple Choice Questioned Convolutional Neural Network (MCQ-CNN) to better understand the predictions of an image classifier by treating multi-class classification as a multiple-choice question. MCQ-CNN not only classifies the query image but also explains the classification result by demonstrating the elimination process of multi-class labels at the patch level. The proposed model consists of two modules. The classification module predicts the class label of the query. The elimination module performs patch-level similarity measurement to determine whether a part of the target object shares the features of a certain class label. The classification module is trained using ResNet, with high classification accuracy. The elimination module measures similarity by distance metric learning based on Large Margin Nearest Neighbor (LMNN). Our experiments show notable performance in both the classification and elimination modules.

Citations: 0

Do You Want to Foresee Your Future? The Best Model Predicting the Success of Kickstarter Campaigns

2021 13th International Conference on Machine Learning and Computing

Pub Date : 2021-02-26 DOI: 10.1145/3457682.3457716

Jiayu Tian

Many creators find crowdfunding websites one of the best ways to get support for their campaigns. Kickstarter, a representative crowdfunding website, provides a great platform for their brightest dreams. However, not everyone reaches their funding goal. In this paper, we investigate which machine learning model and which factors best predict the success probability of a Kickstarter campaign. Comparing six machine learning models, we find that the best-performing model is the Random Forest, with a robust forecast accuracy of 87.85%, which is 10% higher than existing studies. Factor importance analysis indicates that the number of backers, whether the campaign is picked up by editors, and the edit time of the campaign are the three most important factors in determining the success rate of crowdfunding projects. This suggests that, to launch a successful project, these three factors should be weighted more heavily than the others. Our research sheds light on both crowdfunding project determinants and downstream machine learning applications.

Citations: 1

An Infrared Small Target Detection Method Using Segmentation Based Region Proposal and CNN

2021 13th International Conference on Machine Learning and Computing

Pub Date : 2021-02-26 DOI: 10.1145/3457682.3457705

Sha Wen, Kai Liu, Shaoqing Tian, Mingming Fan, Lin Yan

Previous infrared small target detection approaches mainly address the detection of small targets against a sky background with strong cloud occlusion. However, these methods can hardly exclude negative objects other than clouds that cause false alarms. To address this problem, we propose an infrared small target detection framework using segmentation-based region proposal and a Convolutional Neural Network (SCNN). Specifically, an improved segmentation algorithm is used to obtain salient regions from the background as proposals. To reduce the high false alarm rate of the proposals, a lightweight CNN classifies these regions and makes the final predictions. Owing to the lack of public infrared small target datasets, a new infrared dataset is proposed in this paper. The experimental results demonstrate that the proposed method performs well in both detection rate and false alarm rate.

Citations: 0

A Deformable Convolutional Neural Network with Oriented Response for Fine-Grained Visual Classification

2021 13th International Conference on Machine Learning and Computing

Pub Date : 2021-02-26 DOI: 10.1145/3457682.3457702

Shangxian Ruan, Jiating Yang, Jianbo Chen

Fine-grained visual classification (FGVC) aims to classify images belonging to the same basic category into more detailed sub-categories. It has been a challenging research topic in computer vision and pattern recognition in recent years. Existing FGVC methods perform the task by considering part detection of the object in the image and its variants, and rarely pay attention to differences in appearance caused by changes in object size, posture, and perspective. As a result, these methods generally face two major difficulties: 1) how to attend effectively to the latent semantic region and reduce the interference caused by changes in pose and perspective; 2) how to extract rich feature information for non-rigid and weakly structured objects. To solve these two problems, this paper proposes a deformable convolutional neural network with oriented response for FGVC. The proposed method has three main steps: first, the local region of latent semantic information is localized using a lightweight CAM network; then, a deformable convolutional ResNet-50 network and a rotation-invariant-coding oriented response network are designed, and the original image and local region are fed into the feature network to learn discriminative rotation-invariant features; finally, the learned features are embedded into a joint loss to optimize the entire network end-to-end. Experiments are carried out on three challenging FGVC datasets: CUB-200-2011, FGVC_Aircraft, and Aircraft_2. The results show that the proposed method is more accurate than the comparison methods on all datasets and can effectively improve the accuracy of weakly supervised FGVC.

Citations: 0

Semantically Enhanced Multi-scale Feature Pyramid Fusion for Pedestrian Detection

2021 13th International Conference on Machine Learning and Computing

Pub Date : 2021-02-26 DOI: 10.1145/3457682.3457747

Jun Wang, Chao Zhu

Detecting multi-scale pedestrians (especially small-scale ones) is one of the most challenging problems in the computer vision community. At present, most pedestrian detectors adopt only a single-scale feature map in their backbone network for detection, which cannot take full advantage of multi-scale feature information and thus results in unsatisfactory multi-scale detection performance. To address this issue, we propose a semantically enhanced multi-scale feature pyramid fusion method that can effectively extract and integrate multi-scale feature maps for multi-scale pedestrian detection. The proposed method consists of two main components: 1) the Trapezoidal Path Augmented Module (TPAM) and 2) the Multi-scale Feature Fusion Module (MFFM). TPAM extracts higher-level semantic features through additional higher-level feature layers; the produced features are enhanced with supplementary higher-level semantic information so that they focus more accurately on the pedestrian area, leading to improved detection performance. MFFM integrates the multi-scale feature maps coming from TPAM to further exploit multi-scale feature information and reduce the computational redundancy caused by multiple detection heads. In extensive experimental evaluations on the popular CityPersons and Caltech benchmarks, the proposed method outperforms the previous state of the art on multi-scale pedestrian detection.

Citations: 1

Unsupervised Feature Selection using Pseudo Label Approximation

2021 13th International Conference on Machine Learning and Computing

Pub Date : 2021-02-26 DOI: 10.1145/3457682.3457758

Ren Deng, Ye Liu, Liyan Luo, Dongjing Chen, Xijie Li

Feature selection is a machine learning technique that selects a representative subset of all available features in order to reduce the time and space needed to process high-dimensional data. Traditional feature selection methods include filter, wrapper, and embedded approaches. However, the performance of many conventional methods is unsatisfactory in many contexts. This paper proposes a new unsupervised feature selection model based on pseudo label approximation. The newly derived model incorporates a projection error, a sparsity regularization, and a manifold regularization term that preserves the manifold structure of the original data. Finally, experiments on five distinct datasets validate the effectiveness of the proposed model.

Citations: 0

DiffNet: Discriminative Feature Fusion Network of Multisurface Skeleton Project Images for Action Recognition

2021 13th International Conference on Machine Learning and Computing

Pub Date : 2021-02-26 DOI: 10.1145/3457682.3457749

Hao-Tian Ren, Hongbo Zhang, Qinghongya Shi, Qing Lei, Jixiang Du

In this work, we discuss feature fusion of multisurface skeleton projection images for action recognition. Multisurface skeleton projection images are generated from human skeleton joint motion trajectories on three surfaces: the horizontal-vertical, horizontal-depth, and vertical-depth surfaces. The visual features of these skeleton projection images contain complementary action information on different surfaces and are generally combined by early fusion or late fusion. To learn and fuse the discriminative features of each surface effectively, this paper proposes a new feature fusion method called the discriminative feature fusion network, which uses a two-task framework to recognize action and surface categories simultaneously. In the proposed network, the features of the three skeleton projection images are computed by the same convolutional network. To retain the complementary action features of these skeleton projection images, a surface classification loss is defined and added to the action classification loss to train the feature learning network. The experimental results show that the proposed feature fusion method performs better than traditional early and late fusion. Compared with skeleton visualization image-based action recognition methods, the proposed method achieves state-of-the-art performance on the NTU RGB+D action dataset.
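The two-task loss described above, in which a surface classification loss is added to the action classification loss, can be sketched in plain Python. This is an illustration under assumptions, not the authors' implementation: both terms are taken to be softmax cross-entropy, and the function names and the `surface_weight` parameter are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability assigned to the target class index."""
    return -math.log(softmax(logits)[target])


def joint_loss(action_logits, action_label,
               surface_logits, surface_label, surface_weight=1.0):
    """Action loss plus a weighted surface loss (surface_weight is assumed)."""
    action_term = cross_entropy(action_logits, action_label)
    surface_term = cross_entropy(surface_logits, surface_label)
    return action_term + surface_weight * surface_term


if __name__ == "__main__":
    # One projection image: 4 hypothetical action classes, 3 surfaces.
    loss = joint_loss([2.0, 0.5, 0.1, -1.0], 0, [0.2, 1.5, 0.3], 1)
    print(loss)
```

Because the same convolutional network processes all three projection images, the surface term pushes the shared features to stay distinguishable per surface, which is how the abstract motivates retaining complementary information.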

Citations: 1

SCG_FBS: A Code Grading Model for Students’ Program in Programming Education

2021 13th International Conference on Machine Learning and Computing

Pub Date : 2021-02-26 DOI: 10.1145/3457682.3457714

Yuze Qin, Guangzhong Sun, Jianfeng Li, Tianyu Hu, Yu He

With the development of computer science and technology, programming has become an essential skill for college students. The increasing number of students makes evaluating their programs a big challenge. To save the human effort of checking code, most work attempts to design software that judges code automatically. However, this work focuses on the best way to extract the semantic and syntactic features of correct programs, ignoring that judging wrong programs is equally important to students. We design a grading model named SCG_FBS (Students’ Code Grading model, based on semantic Features analysis and Black-box testing, with a Select function) that extracts semantic features of the code and evaluates code based on those features and on black-box testing. We standardize the source code and translate it into a vector sequence using a pre-trained instruction embedding. We then extract semantic features with an attention-based neural network and concatenate them with the black-box testing results as the basis for grading. Furthermore, we propose a select function that picks out the significant sentences in each program, which reduces the length of the input sequence and accelerates training. We gather two data sets from the OJ (Online Judge) platform, which is widely used in colleges to test students’ programs as a black box. Our SCG_FBS model achieves 87.92% accuracy on one data set and 84.28% on the other. Meanwhile, it reduces training time by 53.7% compared with the baseline, significantly improving efficiency.

Citations: 2
