
Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence: Latest Publications

A Sparse Deep Linear Discriminative Analysis using Sparse Evolutionary Training
Xuefeng Bai, Lijun Yan
Deep Linear Discriminative Analysis (DeepLDA) is an effective feature learning method that combines LDA with a deep neural network. The core of DeepLDA is to place an LDA-based loss function on top of a deep neural network built from fully-connected layers. Generally speaking, fully-connected layers consume a large amount of computing resources. Moreover, when fully-connected layers are used, the capacity of the network may be too large to fit the training data properly. Thus, the performance of DeepLDA may be improved by increasing the sparsity of the network. In this paper, a sparse training strategy is exploited to train DeepLDA. The dense layers in DeepLDA are first replaced by a sparse topology based on an Erdős-Rényi random graph. Then, the sparse evolutionary training (SET) strategy is employed to train DeepLDA. Preliminary experiments show that DeepLDA trained with the SET strategy outperforms DeepLDA trained with fully-connected layers on the MNIST classification task.
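As an illustration of the sparse-topology idea described in the abstract, the following minimal sketch (plain NumPy, not the authors' code) builds an Erdős-Rényi connection mask for one layer and performs one SET-style prune-and-regrow step; the hyperparameters `epsilon` and `zeta` are illustrative values, not those used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def erdos_renyi_mask(n_in, n_out, epsilon=20):
    """Sparse connectivity mask in the Erdos-Renyi style used by SET:
    each connection exists with probability epsilon*(n_in+n_out)/(n_in*n_out)."""
    p = min(1.0, epsilon * (n_in + n_out) / (n_in * n_out))
    return (rng.random((n_in, n_out)) < p).astype(np.float32)

def set_rewire(weights, mask, zeta=0.3):
    """One SET evolution step: prune the zeta fraction of active connections
    with the smallest magnitude, then regrow the same number at random."""
    active = np.flatnonzero(mask)
    k = int(zeta * active.size)
    order = np.argsort(np.abs(weights.flat[active]))
    pruned = active[order[:k]]
    mask.flat[pruned] = 0.0
    weights.flat[pruned] = 0.0
    inactive = np.flatnonzero(mask == 0)
    regrown = rng.choice(inactive, size=k, replace=False)
    mask.flat[regrown] = 1.0
    weights.flat[regrown] = rng.normal(0.0, 0.01, size=k)
    return weights, mask

# usage: a sparse 784 -> 300 layer for MNIST-sized inputs
mask = erdos_renyi_mask(784, 300)
weights = rng.normal(0.0, 0.01, size=(784, 300)) * mask
weights, mask = set_rewire(weights, mask)   # run once per training epoch
```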
{"title":"A Sparse Deep Linear Discriminative Analysis using Sparse Evolutionary Training","authors":"Xuefeng Bai, Lijun Yan","doi":"10.1145/3446132.3446167","DOIUrl":"https://doi.org/10.1145/3446132.3446167","url":null,"abstract":"Deep Linear Discriminative Analysis (DeepLDA) is an effective feature learning method that combines LDA with deep neural network. The core of DeepLDA is putting a LDA based loss function on the top of deep neural network, which is constructed by fully-connected layers. Generally speaking, fully-connected layers will lead to a large consumption of computing resource. What’s more, capacity of the deep neural network may too large to fit training data properly when fully-connected layers are used. Thus, performance of DeepLDA may be improved by increasing sparsity of the deep neural network. In this paper, a sparse training strategy is exploited to train DeepLDA. Dense layers in DeepLDA are replaced by a Erdös-Rényi random graph based sparse topology first. Then, sparse evolutionary training (SET) strategy is employed to train DeepLDA. Preliminary experiments show that DeepLDA trained with SET strategy outperforms DeepLDA trained with fully-connected layers on MINST classification task.","PeriodicalId":125388,"journal":{"name":"Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130678055","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Comparison of genetic algorithm and dynamic programming solving knapsack problem
Yan Wang, M. Wang, Jia Li, Xiang Xu
In this paper, the genetic algorithm and the dynamic programming algorithm are used to solve the 0-1 knapsack problem, and the principles and implementation process of the two methods are analyzed. For each method, the initial conditions are varied, and the running time, number of iterations, and accuracy of the results under different conditions are compared and analyzed; the reasons for the differences are studied to reveal the distinct characteristics of the two algorithms.
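For reference, a compact dynamic-programming solver for the 0-1 knapsack problem, together with the kind of fitness function a simple genetic algorithm would evaluate over bit-string chromosomes; this is a generic textbook sketch, not the experimental code compared in the paper.

```python
def knapsack_01(values, weights, capacity):
    """Classic O(n * capacity) dynamic program for the 0-1 knapsack problem."""
    dp = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        # iterate capacity downwards so each item is taken at most once
        for c in range(capacity, w - 1, -1):
            dp[c] = max(dp[c], dp[c - w] + v)
    return dp[capacity]

def ga_fitness(bits, values, weights, capacity):
    """Fitness a simple GA would assign to a bit-string chromosome:
    total value of the selected items, zeroed out if the weight limit is exceeded."""
    total_v = sum(b * v for b, v in zip(bits, values))
    total_w = sum(b * w for b, w in zip(bits, weights))
    return total_v if total_w <= capacity else 0

print(knapsack_01([60, 100, 120], [10, 20, 30], 50))            # -> 220
print(ga_fitness([0, 1, 1], [60, 100, 120], [10, 20, 30], 50))  # -> 220
```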
{"title":"Comparison of genetic algorithm and dynamic programming solving knapsack problem","authors":"Yan Wang, M. Wang, Jia Li, Xiang Xu","doi":"10.1145/3446132.3446142","DOIUrl":"https://doi.org/10.1145/3446132.3446142","url":null,"abstract":"In this paper, the genetic algorithm and the dynamic programming algorithm are used to solve the 0-1 knapsack problem, and the principles and implementation process of the two methods are analyzed. For the two methods, the initial condition values are changed respectively, and the running time, the number of iterations and the accuracy of the running results of each algorithm under different conditions are compared and analyzed, with the reasons for the differences are studied to show the characteristics in order to find different features of these algorithms.","PeriodicalId":125388,"journal":{"name":"Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134394333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
HSCKE: A Hybrid Supervised Method for Chinese Keywords Extraction
Shuyu Kong, Ping Zhu, Qian Yang, Zhihua Wei
Automatic keyword extraction refers to extracting words or phrases from a single text or a text collection. Supervised methods outperform unsupervised ones, but they require a large volume of labeled corpus for training. To address this problem, extra knowledge is obtained through labels generated by other tools. Moreover, the preprocessing of Chinese text is more challenging than that of English because of the fragments produced by word segmentation. Hence, named entity recognition is introduced into the preprocessing to enhance accuracy. On the other hand, a text contains several separate parts, and each part conveys information to readers at a different level. Thus, we present a priority-based text weighting method that takes into consideration the importance of different text parts. In this paper, we integrate the three ideas above and propose a novel hybrid method for Chinese keyword extraction (HSCKE). To evaluate the performance of our proposed approach, we compare HSCKE with four of the most commonly used methods on two typical Chinese keyword extraction datasets. The experimental results show that the proposed approach achieves the best performance in terms of precision, recall, and F1 score.
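A minimal sketch of the priority-based text weighting idea: candidate keywords score higher when they occur in higher-priority parts of the document. The part names and weights in `PART_WEIGHTS` are hypothetical placeholders, not the weighting scheme defined by HSCKE.

```python
from collections import Counter

# hypothetical priorities for different parts of a document
PART_WEIGHTS = {"title": 3.0, "abstract": 2.0, "body": 1.0}

def score_candidates(parts, candidates):
    """Weight each candidate keyword by where it occurs: occurrences in
    higher-priority parts (e.g. the title) contribute more to the score."""
    scores = Counter()
    for part, tokens in parts.items():
        w = PART_WEIGHTS.get(part, 1.0)
        for tok in tokens:
            if tok in candidates:
                scores[tok] += w
    return scores.most_common()

doc = {
    "title": ["中文", "关键词", "提取"],
    "body": ["关键词", "提取", "方法", "比较", "提取"],
}
print(score_candidates(doc, {"关键词", "提取", "方法"}))
```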
{"title":"HSCKE: A Hybrid Supervised Method for Chinese Keywords Extraction","authors":"Shuyu Kong, Ping Zhu, Qian Yang, Zhihua Wei","doi":"10.1145/3446132.3446408","DOIUrl":"https://doi.org/10.1145/3446132.3446408","url":null,"abstract":"Automatic keywords extraction refers to extracting words or phrases from a single text or text collection. Supervised methods outperform unsupervised methods, but it requires a large volume of labeled corpus for training. To address the problem, extra knowledge is obtained through labels generated by other tools. Moreover, the preprocessing of Chinese text is more challenging than that in English because of the fragments caused by word segment. Hence the named entity recognition in the preprocessing is introduced to enhance the accuracy. On the other hand, text contains different separate parts, and each part conveys information to readers on different levels. Thus, we present a text weighting method based on priority that takes into consideration the importance of different texture parts. In this paper, we integrate the three ideas above and propose a novel hybrid method for Chinese keywords extraction (HSCKE). To evaluate the performance of our proposed approach, we compare HSCKE with four most commonly used methods on two typical Chinese keywords extraction datasets. The experimental results show that the proposed approach achieves the optimal performance in terms of precision, recall and F1 score.","PeriodicalId":125388,"journal":{"name":"Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129533668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
A GAN-based Method for Generating Finger Vein Dataset
Hanwen Yang, P. Fang, Zhiang Hao
Deep learning is widely used in the field of biometrics, but a large amount of labeled image data is required to obtain a well-performing, complex model. Finger vein recognition has significant advantages over common biometric methods in terms of security and privacy. However, there are very few finger vein datasets. To solve this problem, this paper proposes a GAN-based finger vein dataset generation method, which is the first attempt to generate a finger vein dataset with a GAN. A total of 53,630 images covering 5,363 different finger vein subjects are generated and the synthetic dataset is validated, providing a basis for applying complex deep neural networks in the field of finger vein recognition.
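To make the setup concrete, here is a minimal DCGAN-style generator (PyTorch) that maps noise vectors to 64x64 grayscale images of the kind a finger vein GAN would produce; the layer sizes and output resolution are assumptions for illustration, not the architecture used in the paper.

```python
import torch
import torch.nn as nn

class VeinGenerator(nn.Module):
    """Minimal DCGAN-style generator producing 64x64 single-channel images."""
    def __init__(self, z_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, 256, 4, 1, 0, bias=False),  # -> 4x4
            nn.BatchNorm2d(256), nn.ReLU(True),
            nn.ConvTranspose2d(256, 128, 4, 2, 1, bias=False),    # -> 8x8
            nn.BatchNorm2d(128), nn.ReLU(True),
            nn.ConvTranspose2d(128, 64, 4, 2, 1, bias=False),     # -> 16x16
            nn.BatchNorm2d(64), nn.ReLU(True),
            nn.ConvTranspose2d(64, 32, 4, 2, 1, bias=False),      # -> 32x32
            nn.BatchNorm2d(32), nn.ReLU(True),
            nn.ConvTranspose2d(32, 1, 4, 2, 1, bias=False),       # -> 64x64
            nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z)

z = torch.randn(16, 100, 1, 1)
fake_veins = VeinGenerator()(z)
print(fake_veins.shape)   # torch.Size([16, 1, 64, 64])
```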
{"title":"A GAN-based Method for Generating Finger Vein Dataset","authors":"Hanwen Yang, P. Fang, Zhiang Hao","doi":"10.1145/3446132.3446150","DOIUrl":"https://doi.org/10.1145/3446132.3446150","url":null,"abstract":"Deep learning is widely used in the field of biometrics, but a large amount of labeled image data is required to obtain a well-performing complicated model. Finger vein recognition has huge advantages over common biometric methods in terms of security and privacy. However, there are very few finger vein-related datasets. In order to solve this problem, this paper proposes a GAN-based finger vein dataset generation method, which is the first attempt in the domain of finger vein dataset generation by GAN. This paper generates a total of 53,630 images of 5,363 different subjects of finger veins and validates the synthetic dataset, which provides the basis for applying complex deep neural networks in the field of finger vein recognition.","PeriodicalId":125388,"journal":{"name":"Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence","volume":"111 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131252060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
A YOLO-based Neural Network with VAE for Intelligent Garbage Detection and Classification
Anbang Ye, Bo Pang, Yucheng Jin, Jiahuan Cui
Garbage recycling is becoming an urgent need, as the rapid development of human society produces a colossal amount of waste every year. However, current machine learning models for intelligent garbage detection and classification are highly constrained by their limited processing speeds and large model sizes, which make them difficult to deploy on portable, real-time, and energy-efficient edge-computing devices. Therefore, in this paper, we introduce a novel YOLO-based neural network model with a Variational Autoencoder (VAE) to increase the accuracy of automatic garbage recycling, accelerate calculation, and reduce the model size, making it feasible in real-world garbage recycling scenarios. The model consists of a convolutional feature extractor, a convolutional predictor, and a decoder. After training, the model achieves an accuracy of 69.70% with a total of 32.1 million parameters and a processing speed of 60 frames per second (FPS), surpassing existing models such as YOLO v1 and Fast R-CNN.
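A skeleton of the three-part design the abstract describes: a shared convolutional feature extractor feeding a YOLO-style prediction head and a reconstruction decoder (a plain convolutional decoder stands in for the VAE branch here). All layer sizes, the grid size, and the class count are placeholders, not the paper's configuration.

```python
import torch
import torch.nn as nn

class GarbageDetector(nn.Module):
    """Shared feature extractor + detection head + reconstruction head."""
    def __init__(self, n_classes=6, grid=7, boxes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(grid),
        )
        out_per_cell = boxes * 5 + n_classes          # box coords + objectness + classes
        self.predictor = nn.Conv2d(128, out_per_cell, 1)   # YOLO-style grid predictions
        self.decoder = nn.Sequential(                        # image reconstruction branch
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        f = self.features(x)
        return self.predictor(f), self.decoder(f)

boxes, recon = GarbageDetector()(torch.randn(1, 3, 224, 224))
print(boxes.shape, recon.shape)   # (1, 16, 7, 7) and (1, 3, 28, 28)
```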
{"title":"A YOLO-based Neural Network with VAE for Intelligent Garbage Detection and Classification","authors":"Anbang Ye, Bo Pang, Yucheng Jin, Jiahuan Cui","doi":"10.1145/3446132.3446400","DOIUrl":"https://doi.org/10.1145/3446132.3446400","url":null,"abstract":"Garbage recycling is becoming an urgent need for the people as the rapid development of human society is producing colossal amount of waste every year. However, current machine learning models for intelligent garbage detection and classification are highly constrained by their limited processing speeds and large model sizes, which make them difficult to be deployed on portable, real-time, and energy-efficient edge-computing devices. Therefore, in this paper, we introduce a novel YOLO-based neural network model with Variational Autoencoder (VAE) to increase the accuracy of automatic garbage recycling, accelerate the speed of calculation, and reduce the model size to make it feasible in the real-world garbage recycling scenario. The model is consisted of a convolutional feature extractor, a convolutional predictor, and a decoder. After the training process, this model achieves a correct rate of 69.70% with a total number of 32.1 million parameters and a speed of processing 60 Frames Per Second (FPS), surpassing the performance of other existing models such as YOLO v1 and Fast R-CNN.","PeriodicalId":125388,"journal":{"name":"Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115716623","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
Community Detection Algorithm based on Node Similarity in Signed Networks
Zhi Bie, Lufeng Qian, J. Ren
Hierarchical clustering algorithms based on node similarity have been widely used in community detection, but they are not suitable for signed networks. Typical signed-network community detection algorithms suffer from a low community division rate when starting from different nodes. Based on node similarity, this paper proposes the CDNS algorithm (Community Detection Algorithm based on Node Similarity in Signed Networks). First, the algorithm introduces a node influence measure suitable for signed networks as the basis for selecting the initial node of a community. Second, it computes node similarity based on eigenvector centrality and selects, from the neighbour nodes, the node most similar to the initial node to form the initial community. Finally, according to the community contribution of the neighbour nodes, the algorithm determines whether, and in which order, neighbour nodes join the community. Experiments on real and simulated signed networks show that the CDNS algorithm achieves good accuracy and efficiency.
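A toy illustration of the eigenvector-centrality ingredient: power iteration on the unsigned adjacency gives node centralities, and a centrality-weighted agreement of two nodes' signed links serves as a stand-in similarity. The similarity formula here is illustrative only, not the measure defined by CDNS.

```python
import numpy as np

# toy signed network: +1 friendly edge, -1 hostile edge
A = np.array([
    [ 0,  1,  1, -1],
    [ 1,  0,  1, -1],
    [ 1,  1,  0,  1],
    [-1, -1,  1,  0],
], dtype=float)

def eigenvector_centrality(adj, iters=200):
    """Power iteration on |A|; the dominant eigenvector gives node centralities."""
    m = np.abs(adj)
    x = np.ones(m.shape[0])
    for _ in range(iters):
        x = m @ x
        x /= np.linalg.norm(x)
    return x

def node_similarity(adj, cent, i, j):
    """Illustrative similarity: centrality-weighted agreement of the two nodes'
    signed links to all other nodes (positive where signs agree, negative where they clash)."""
    agreement = adj[i] * adj[j] * cent
    return agreement.sum() / cent.sum()

c = eigenvector_centrality(A)
print(node_similarity(A, c, 0, 1))   # nodes 0 and 1 link to the rest in the same way
```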
{"title":"Community Detection Algorithm based on Node Similarity in Signed Networks","authors":"Zhi Bie, Lufeng Qian, J. Ren","doi":"10.1145/3446132.3446184","DOIUrl":"https://doi.org/10.1145/3446132.3446184","url":null,"abstract":"Hierarchical clustering algorithms based on node similarity have been widely used in community detection, but it is not suitable for signed networks. The typical signed network community detection algorithm has the problem of low community division rate from different nodes. Based on the similarity of nodes, this paper proposes the CDNS algorithm (Community Detection Algorithm based on Node Similarity in Signed Networks). Firstly, the algorithm proposes a node influence measure suitable for signed networks as the basis for selecting the initial node of the community. Secondly, it proposes the calculation of the node similarity based on the eigenvector centrality, and selects the node with the highest similarity from the initial node from the neighbour nodes to form the initial community. Finally, according to the community contribution of neighbour nodes, algorithm determines whether the neighbour nodes are joined in the community and in which order the neighbour nodes are joined in the community. The experiments of real signed network and simulated signed network prove that the CDNS algorithm has good accuracy and efficiency.","PeriodicalId":125388,"journal":{"name":"Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116099579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Label-Attentive Hierarchical Network for Document Classification
Xi Chen, Chongwu Dong, Jinghui Qin, Long Yin, Wushao Wen
Text classification is one of the most fundamental and important tasks in the field of natural language processing, which aims to identify the most relevant label for a given piece of text. Although deep learning-based text classification methods have achieved promising results, most research mainly focuses on the internal context information of the document, ignoring available global information such as document hierarchy and label semantics. To address this problem, we propose a novel Label-Attentive Hierarchical Network (LAHN) for document classification. In particular, we integrate label information into the hierarchical structure of the document by calculating word-label attention at the word level and sentence-label attention at the sentence level. We give full consideration to this global information while encoding the whole document, which makes the final document representation vector more discriminative for classification. Extensive experiments on several benchmark datasets show that our proposed LAHN surpasses several state-of-the-art methods.
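A minimal NumPy sketch of word-label attention: each label attends over the words of a sentence and the attended word vectors are pooled into a label-aware sentence representation. The scaling and pooling choices are generic attention defaults, not necessarily those used by LAHN.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                   # embedding size
words = rng.normal(size=(5, d))         # 5 word vectors in one sentence
labels = rng.normal(size=(3, d))        # 3 class-label embeddings

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# word-label attention: how strongly each word responds to each label
scores = words @ labels.T / np.sqrt(d)   # shape (5 words, 3 labels)
attn = softmax(scores, axis=0)           # normalise over words, per label

# label-aware sentence vector: attention-weighted word sums, pooled across labels
sentence = (attn.T @ words).mean(axis=0)
print(sentence.shape)                    # (8,)
```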
{"title":"Label-Attentive Hierarchical Network for Document Classification","authors":"Xi Chen, Chongwu Dong, Jinghui Qin, Long Yin, Wushao Wen","doi":"10.1145/3446132.3446163","DOIUrl":"https://doi.org/10.1145/3446132.3446163","url":null,"abstract":"Text classification is one of the most fundamental and important tasks in the field of natural language processing, which aims to identify the most relevant label for a given piece of text. Although deep learning-based text classification methods have achieved promising results, most researches mainly focus on the internal context information of the document, ignoring the available global information such as document hierarchy and label semantics. To address this problem, we propose a novel Label-Attentive Hierarchical Network (LAHN) for document classification. In particular, we integrate label information into the hierarchical structure of the document by calculating the word-label attention at word level and the sentence-label attention at sentence level respectively. We give full consideration to the global information during encoding the whole document, which makes the final document representation vector more discriminative for classification. Extensive experiments on several benchmark datasets show that our proposed LAHN surpasses several state-of-the-art methods.","PeriodicalId":125388,"journal":{"name":"Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116511528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Fast Confidence Prediction of Uncertainty based on Knowledge Graph Embedding
Shihan Yang, Weiya Zhang, R. Tang
Uncertainty is an inherent feature of a Knowledge Graph (KG) and is often modelled as confidence scores of relation facts. Although Knowledge Graph Embedding (KGE) has been a great success recently, it is still a big challenge to predict the confidence of unseen facts in a KG in the continuous vector space. There are several reasons for this. First, current KGE methods are often concerned with deterministic knowledge, in which the confidence of unseen facts is treated as zero and that of observed facts as one. Second, uncertainty features are not well preserved in the embedding space. Third, approximate reasoning in embedding spaces is often unexplainable and not intuitive. Furthermore, the time and space cost of obtaining embedding spaces that preserve uncertainty is usually very high. To address these issues for Uncertain Knowledge Graphs (UKGs), we propose a fast and effective embedding method, UKGsE, in which approximate reasoning and calculation can be performed quickly after an Uncertain Knowledge Graph Embedding (UKGE) space is generated at high speed and with reasonable accuracy. The idea is that treating relation facts as short sentences, together with pre-handling, benefits the learning and training of their confidence scores. Experiments show that the method is suitable for the downstream task of predicting the confidence of relation facts, whether or not they are seen in the UKG, and that it achieves the best tradeoff between the efficiency and accuracy of predicting the uncertain confidence of knowledge. Further, we found that the model outperforms state-of-the-art uncertain link prediction baselines on the CN15k dataset.
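A rough sketch of the "relation facts as short sentences" idea, assuming gensim (>= 4) and scikit-learn: triples are fed to Word2Vec as 3-token sentences, and a simple regressor maps the concatenated embeddings to a confidence score. The toy facts, model sizes, and the linear regressor are stand-ins for illustration, not the UKGsE model itself.

```python
import numpy as np
from gensim.models import Word2Vec
from sklearn.linear_model import LinearRegression

# relation facts treated as 3-token "sentences", each with a confidence score
facts = [
    (("lion", "isA", "animal"), 0.98),
    (("lion", "livesIn", "city"), 0.05),
    (("cat", "isA", "animal"), 0.97),
    (("cat", "livesIn", "house"), 0.80),
]
sentences = [list(triple) for triple, _ in facts]
w2v = Word2Vec(sentences=sentences, vector_size=16, window=2,
               min_count=1, epochs=200, seed=1)

# concatenated (head, relation, tail) embeddings as features, confidence as target
X = np.array([np.concatenate([w2v.wv[h], w2v.wv[r], w2v.wv[t]])
              for (h, r, t), _ in facts])
y = np.array([conf for _, conf in facts])
reg = LinearRegression().fit(X, y)

# score an unseen fact, clipped to [0, 1]
x_new = np.concatenate([w2v.wv["lion"], w2v.wv["livesIn"], w2v.wv["house"]]).reshape(1, -1)
print(np.clip(reg.predict(x_new), 0, 1).item())
```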
{"title":"Fast Confidence Prediction of Uncertainty based on Knowledge Graph Embedding","authors":"Shihan Yang, Weiya Zhang, R. Tang","doi":"10.1145/3446132.3446186","DOIUrl":"https://doi.org/10.1145/3446132.3446186","url":null,"abstract":"The uncertainty is an inherent feature of Knowledge Graph (KG), which is often modelled as confidence scores of relation facts. Although Knowledge Graph Embedding (KGE) has been a great success recently, it is still a big challenge to predict confidence of unseen facts in KG in the continuous vector space. There are several reasons for this situation. First, the current KGE is often concerned with the deterministic knowledge, in which unseen facts’ confidence are treated as zero, otherwise as one. Second, in the embedding space, uncertainty features are not well preserved. Third, approximate reasoning in embedding spaces is often unexplainable and not intuitive. Furthermore, the time and space cost of obtaining embedding spaces with uncertainty preserved are always very high. To address these issues, considering Uncertain Knowledge Graph (UKG), we propose a fast and effective embedding method, UKGsE, in which approximate reasoning and calculation can be quickly performed after generating an Uncertain Knowledge Graph Embedding (UKGE) space in a high speed and reasonable accuracy. The idea is that treating relation facts as short sentences and pre-handling are benefit to the learning and training confidence scores of them. The experiment shows that the method is suitable for the downstream task, confidence prediction of relation facts, whether they are seen in UKG or not. It achieves the best tradeoff between efficiency and accuracy of predicting uncertain confidence of knowledge. Further, we found that the model outperforms state-of-the-art uncertain link prediction baselines on CN15k dataset.","PeriodicalId":125388,"journal":{"name":"Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122930858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Drill Pipe Counting Method Based on Scale Space and Siamese Network
Lihong Dong, Xinyi Wu, Jiehui Zhang
To address the problem that traditional video-based drill pipe counting methods have low accuracy and are vulnerable to interference when positioning and tracking targets, a drill pipe counting method based on scale space and a Siamese network is proposed. The shape features of the drilling machine video image are computed with an improved scale-space algorithm, and the initial position of the drilling machine chuck is determined by feature matching. The chuck is then tracked in real time with an improved Siamese network algorithm and its movement trajectory is recorded. Finally, the number of drill pipes is calculated by applying counting rules after locally weighted regression and hierarchical classification of the chuck movement trajectory. Test results show that the improved method can stably track the target under the interference of bright light and count drill pipes accurately.
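A simplified version of the counting stage: a (simulated) chuck trajectory is smoothed and full up-down strokes are counted with a hysteresis rule. The moving-average smoother stands in for the paper's locally weighted regression, and the thresholds are arbitrary illustrative values.

```python
import numpy as np

def smooth(y, window=7):
    """Moving-average smoothing; stands in for the locally weighted
    regression applied to the chuck trajectory in the paper."""
    kernel = np.ones(window) / window
    return np.convolve(y, kernel, mode="same")

def count_strokes(y, low, high):
    """Counting rule with hysteresis: one drill pipe per full travel of the
    chuck from above `high` down past `low`."""
    count, armed = 0, False
    for v in y:
        if v > high:
            armed = True
        elif v < low and armed:
            count += 1
            armed = False
    return count

t = np.linspace(0, 6 * np.pi, 600)
trajectory = 50 + 40 * np.sin(t) + np.random.default_rng(0).normal(0, 3, t.size)
print(count_strokes(smooth(trajectory), low=20, high=80))   # -> 3 simulated pipes
```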
{"title":"Drill Pipe Counting Method Based on Scale Space and Siamese Network","authors":"Lihong Dong, Xinyi Wu, Jiehui Zhang","doi":"10.1145/3446132.3446179","DOIUrl":"https://doi.org/10.1145/3446132.3446179","url":null,"abstract":"Aiming at the problem that the traditional video-based drilling pipe counting method has low accuracy and is vulnerable to interference in the process of positioning and tracking targets, a drilling pipe counting method based on scale space and Siamese network was proposed: the shape features of the drilling machine video image were calculated by the improved scale space algorithm, the initial position of the drilling machine chuck was determined by feature matching, the chuck was tracked in real time according to the improved Siamese network algorithm and its movement trajectory was recorded, moreover, the number of drilling pipes was calculated after locally weighted regression and hierarchical classification of the chuck movement trajectory using counting rules. The test results showed that the improved method could stably track the target under the interference of bright light and realize the accurate counting of drilling pipe.","PeriodicalId":125388,"journal":{"name":"Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124819180","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
A Machine Learning Based Plagiarism Detection in Source Code
N. Viuginov, P. Grachev, A. Filchenkov
Converting source code to feature vectors can be useful in programming-related tasks, such as plagiarism detection in ACM contests. We present a new method for feature extraction from C++ files, which includes both features describing the syntactic and lexical properties of the AST and features characterizing the disassembly of the source code. We propose solving the plagiarism detection task as a classification problem. We demonstrate the effectiveness of our feature set by testing on a dataset that contains 50 ACM problems and ∼90k solutions to them. The trained xgboost model achieves a binary F1 score of 0.745 on the test set.
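A minimal sketch of the classification stage, assuming xgboost and scikit-learn are available: synthetic pair-feature vectors stand in for the paper's AST- and disassembly-based features, an XGBClassifier is trained, and the binary F1 score is reported.

```python
import numpy as np
from xgboost import XGBClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# stand-in feature vectors for solution pairs: in the paper these come from
# AST-based and disassembly-based features of C++ sources; here they are synthetic
n, d = 400, 32
X = rng.normal(size=(n, d))
y = (X[:, :4].sum(axis=1) + rng.normal(0, 0.5, n) > 0).astype(int)  # 1 = plagiarised pair

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
clf.fit(X_tr, y_tr)
print("binary F1:", f1_score(y_te, clf.predict(X_te)))
```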
{"title":"A Machine Learning Based Plagiarism Detection in Source Code","authors":"N. Viuginov, P. Grachev, A. Filchenkov","doi":"10.1145/3446132.3446420","DOIUrl":"https://doi.org/10.1145/3446132.3446420","url":null,"abstract":"Converting source codes to feature vectors can be useful in programming-related tasks, such as plagiarism detection on ACM contests. We present a brand-new method for feature extraction from C++ files, which includes both features describing syntactic and lexical properties of an AST tree and features characterizing disassembly of source code. We propose a method for solving the plagiarism detection task as a classification problem. We prove the effectiveness of our feature set by testing on a dataset that contains 50 ACM problems and ∼90k solutions for them. Trained xgboost model gets a relative binary f1-score=0.745 on the test set.","PeriodicalId":125388,"journal":{"name":"Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130198416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3