
Latest Publications: Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence

Multi-Hop Memory Network with Graph Neural Networks Encoding for Proactive Dialogue
Haonan Yuan, Jinqi An
Dialogue systems have made great progress recently, but they are still at the initial stage of passive reply. Building a dialogue model with the ability to reply proactively remains a great challenge. This paper proposes an end-to-end dialogue model based on a memory network and a graph neural network: the memory network stores the conversation history and knowledge, while the graph neural network encodes the background knowledge. We propose a soft weighting mechanism that integrates dialogue-goal information into the query pointer, so as to enhance the dynamic topic-transfer ability during decoding. Experimental results indicate that our model outperforms various kinds of generation models under automatic evaluation and can accomplish the conversational target more proactively.
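The multi-hop memory read with goal-aware soft weighting described above can be sketched minimally as follows. This is an illustrative reconstruction, not the paper's implementation: the function names, the fixed mixing weight `alpha`, and the additive query update are all assumptions.

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of scores
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def multi_hop_read(query, goal, memory, hops=3, alpha=0.5):
    """One reader pass: at each hop, softly mix the dialogue-goal vector
    into the query (the soft weighting step), attend over the memory
    slots, and add the attended summary back into the query."""
    q = list(query)
    for _ in range(hops):
        # soft weighting: blend goal information into the query pointer
        q = [alpha * g + (1 - alpha) * x for g, x in zip(goal, q)]
        attn = softmax([dot(q, m) for m in memory])
        summary = [sum(a * m[i] for a, m in zip(attn, memory))
                   for i in range(len(q))]
        q = [x + s for x, s in zip(q, summary)]
    return q
```

In a trained model `alpha` would itself be predicted per step, which is what lets the decoder shift topics dynamically toward the goal.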
DOI: 10.1145/3404555.3404605 · Published: 2020-04-23
Citations: 3
A Novel Large-scale Model for Real Time Sentiment Analysis Using Online Shop Reviews and Comments
Fereshteh Ghorbanian, Mehrdad Jalali
Sentiment analysis applies natural language processing in many domains to determine what kind of subjective information a text expresses. Online shops have recently used it to characterize the types of buyers' reviews and comments and to gauge impressions of services and products. In this research, we propose an adjustable sentiment analysis algorithm for real-time analysis of user-generated data about products in online shops, with a UI that runs in local shops as a friendly tool. The proposed model first builds a dynamic dictionary from buyers' comments and reviews gathered from online shops, using a selected set of admin-defined features extracted from a specific product (or the top products in a category), and then classifies the preprocessed data into predefined classes. To the best of the authors' knowledge, the proposed method introduces new feature vectors that strongly increase accuracy and trustworthiness in analyzing online shop reviews and comments, with low time overhead. Extensive simulation results, especially in combination with real data from an online shop, show improved accuracy and fine-tuning of the polarity ranking for online shop managers.
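The "dynamic dictionary then classify" pipeline can be sketched in a few lines. This is a minimal toy version under assumed details (whitespace tokenization, a simple count-ratio polarity score, label `1` = positive and `0` = negative); the paper's admin-defined feature selection is not modeled here.

```python
def build_dictionary(labeled_reviews):
    """Count how often each token appears in positive vs. negative
    reviews and keep a polarity score in [-1, 1] per token."""
    counts = {}
    for text, label in labeled_reviews:
        for tok in text.lower().split():
            pos, neg = counts.get(tok, (0, 0))
            counts[tok] = (pos + (label == 1), neg + (label == 0))
    return {t: (p - n) / (p + n) for t, (p, n) in counts.items()}

def classify(text, dictionary, threshold=0.0):
    """Average the per-token polarity scores and threshold the mean."""
    scores = [dictionary.get(tok, 0.0) for tok in text.lower().split()]
    mean = sum(scores) / len(scores) if scores else 0.0
    return 1 if mean > threshold else 0
```

Because the dictionary is rebuilt from incoming reviews, the lexicon adapts to each product's vocabulary, which is the "dynamic" aspect the abstract emphasizes.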
DOI: 10.1145/3404555.3404646 · Published: 2020-04-23
Citations: 2
BERT-Based Mental Model, a Better Fake News Detector
Jia Ding, Yongjun Hu, Huiyou Chang
Automatic fake news detection is a challenging problem that requires the support of a number of verifiable facts. Wang et al. [16] introduced LIAR, a validated dataset, and presented a six-class classification task with several popular machine learning methods to detect fake news at the linguistic level. However, empirical results have shown that CNN- and RNN-based models do not perform very well, especially when integrating all features with the claim. In this paper, we are the first to present a method for building a BERT-based [4] mental model that captures the mental feature in fake news detection. In detail, we present a method for constructing a patterned text at the linguistic level that appropriately integrates the claim and the features. We then fine-tune the BERT model on the feature-integrated text. Empirical results show that our method improves accuracy by 16.71% over the state-of-the-art model on the LIAR dataset.
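The key idea, serializing metadata features and the claim into one fluent "patterned text" that a BERT encoder can consume, can be sketched as below. The template wording is hypothetical; the paper does not publish its exact pattern, and the feature names here are placeholders.

```python
def patterned_text(claim, features):
    """Wrap each (name, value) metadata feature into a short natural-
    language clause and prepend the clauses to the claim, so the
    encoder sees one fluent sequence instead of disjoint fields."""
    clauses = ["the {} is {}".format(name, value)
               for name, value in features if value]
    prefix = ", ".join(clauses)
    return (prefix + ". " + claim) if prefix else claim
```

The resulting string would then be tokenized and fed to a standard BERT fine-tuning pipeline as a single segment.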
DOI: 10.1145/3404555.3404607 · Published: 2020-04-23
Citations: 14
A Text Classification Model Base On Region Embedding AND LSTM
Ying Li, Ming Ye
In the field of natural language processing, recurrent neural networks are good at capturing long-range dependencies and can effectively complete text classification tasks. However, a recurrent neural network models the entire sentence during text feature extraction, which easily ignores the deep semantic information of the text's local phrases. To further enhance the expressiveness of text features, we propose a text classification model based on region embedding and LSTM (RELSTM). RELSTM first divides the text into regions and then generates region embeddings. We introduce a learnable local context unit (LCU) to calculate the relative position information of the middle word and its influence on the context words in the region, obtaining a region matrix representation. To reduce the complexity of the model, a max pooling operation is applied to the region matrix to obtain a dense region embedding. Then we use the LSTM's long-term memory of text information to extract global characteristics. The model is verified on public datasets, and the results are compared against five benchmark models. Experimental results on three datasets show that RELSTM has better overall performance than traditional deep learning models and is effective in improving the accuracy of text classification.
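One plausible reading of the region-embedding step is sketched below: each context word in a window around the middle word is reweighted by a position-specific LCU column (element-wise product), and the resulting region matrix is max-pooled into one dense vector. This interpretation, and the list-based shapes, are assumptions for illustration, not the paper's exact formulation.

```python
def region_embedding(embeddings, center, lcu, radius):
    """Project each context word through its position-specific LCU
    column (element-wise product), then max-pool over the region.

    embeddings: list of word vectors for the sentence
    lcu: one column (vector) per relative position, 2*radius + 1 total
    """
    dim = len(embeddings[0])
    region = []
    for offset in range(-radius, radius + 1):
        pos = center + offset
        if 0 <= pos < len(embeddings):  # skip positions past the ends
            col = lcu[offset + radius]
            region.append([c * e for c, e in zip(col, embeddings[pos])])
    # element-wise max pooling over the region matrix -> dense vector
    return [max(vec[d] for vec in region) for d in range(dim)]
```

In RELSTM these dense region embeddings would then form the input sequence to the LSTM, which supplies the global view.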
DOI: 10.1145/3404555.3404643 · Published: 2020-04-23
Citations: 2
Auxiliary Edge Detection for Semantic Image Segmentation
Wenrui Liu, Zongqing Lu, He Xu
Semantic segmentation is a challenging task that can be formulated as a pixel-wise classification problem. Most FCN-based methods of semantic segmentation apply simple bilinear up-sampling to recover the final pixel-wise prediction, which may lead to misclassification near object edges. To solve this problem, we focus on supplementing the spatial details of semantic segmentation with edge information. We present an approach for incorporating the relevant auxiliary edge information into semantic segmentation features. By applying explicit supervision of the semantic boundary using intermediate features, the multi-task network learns features with strong inter-class distinctiveness. An attention-based feature fusion module fuses the high-resolution edge features with the wide-receptive-field semantic features to fully leverage their complementary information. Experiments on the Cityscapes dataset show the effectiveness of fusing intermediate edge information.
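The attention-based fusion step can be sketched in its simplest gated form: an attention weight is computed per position and used to modulate how much edge feature is added to the semantic feature. This is a 1-D toy with a sigmoid gate derived from the semantic feature itself; the actual module's gate computation and tensor shapes are not specified in the abstract, so they are assumptions here.

```python
import math

def fuse(semantic, edge):
    """Gate the high-resolution edge features with a per-position
    sigmoid attention weight, then add them to the semantic features."""
    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
    return [s + sigmoid(s) * e for s, e in zip(semantic, edge)]
```

In a real network the gate would come from a small learned convolution over both feature maps rather than from the semantic value alone.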
DOI: 10.1145/3404555.3404624 · Published: 2020-04-23
Citations: 3
Channel-Wise Spatial Attention with Spatiotemporal Heterogeneous Framework for Action Recognition
Yiying Li, Yulin Li, Yanfei Gu
Recent years have witnessed the effectiveness of two-stream attention networks for video action recognition. However, most methods adopt the same structure for the spatial stream and the temporal stream, which produces a large amount of redundant information and often ignores the relevance among channels. In this paper, we propose channel-wise spatial attention with a spatiotemporal heterogeneous framework, a new approach to action recognition. First, we employ two different network structures for the spatial and temporal streams to improve recognition performance. Then, we design a channel-wise network and a spatial network, inspired by the self-attention mechanism, to obtain fine-grained and salient information from the video. Finally, the video feature for action recognition is generated by end-to-end training. Experimental results on the HMDB51 and UCF101 datasets show that our method can effectively recognize the actions in a video.
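The channel-wise part of the attention can be sketched as a squeeze-and-reweight step: each channel is summarized by its global average, the summaries are softmax-normalized into channel weights, and every channel is rescaled by its weight. This is a generic channel-attention sketch under assumed shapes (channels as flat lists), not the paper's exact module.

```python
import math

def channel_attention(feature_maps):
    """Squeeze each channel to its global average, softmax the
    descriptors across channels, and rescale each channel."""
    desc = [sum(ch) / len(ch) for ch in feature_maps]
    m = max(desc)
    es = [math.exp(d - m) for d in desc]
    s = sum(es)
    weights = [e / s for e in es]
    return [[w * v for v in ch] for w, ch in zip(weights, feature_maps)]
```

The spatial counterpart would do the symmetric operation, weighting positions instead of channels, and the two maps would be combined.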
DOI: 10.1145/3404555.3404592 · Published: 2020-04-23
Citations: 0
Generative Model for Node Generation
Boyu Zhang, Xin Wang, Kai Liu
We present a generative model for node generation on graph-structured data that combines a graph convolutional architecture and semi-supervised learning with a variational auto-encoder. The idea is motivated by successful applications of deep generative models to images and speech. However, when applied to graph-structured data, especially social network data, existing deep generative models usually do not work: they cannot effectively learn the underlying distributions of social network data. To address this problem, we construct a deep generative model using architectures and techniques that have proven effective for modelling network data in practice. Experimental results show that our model can successfully learn the underlying distribution from a social network dataset and generate reasonable nodes, which can be altered by varying latent variables. This gives us a way to study social network data in the same way we study image data.
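The variational auto-encoder at the core of this model relies on the standard reparameterization trick, which is worth stating concretely: instead of sampling the latent code directly, the encoder outputs a mean and log-variance and the noise is factored out, keeping the sampling step differentiable. A minimal sketch (the function name and list-based shapes are illustrative):

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), so gradients
    can flow through mu and log_var while eps stays stochastic."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]
```

In the graph setting, `mu` and `log_var` would be produced per node by graph convolution layers, and the decoder would reconstruct node features and edges from `z`.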
DOI: 10.1145/3404555.3404599 · Published: 2020-04-23
Citations: 0
Detecting Deepfake Video by Learning Two-Level Features with Two-Stream Convolutional Neural Network
Zheng Zhao, Penghui Wang, W. Lu
Deepfake techniques have made face swapping in video easy to use, and the spread of Deepfake videos over networks is now a worldwide concern. This work proposes an approach for more accurate and robust detection. Since the artifacts left by Deepfake tools can largely be categorized into two classes at different levels, i.e., the semantic and noise levels, we adopt a two-stream convolutional neural network (CNN) to capture the two-level features concurrently. An Xception network is trained as the first stream to detect semantic anomalies such as editing artifacts around the face contour, missing detail, and geometric inconsistencies in the eyes. Meanwhile, the second stream, which contains a constrained convolution filter and a median filter, is designed to capture tampering traces in local noise. By concatenating the two-level features learned from both streams, our method obtains very comprehensive knowledge about the existence of face swapping. The experimental results show its advantage over existing methods in both accuracy and robustness.
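The noise-stream front end can be illustrated in one dimension. A constrained convolution fixes the center tap and forces the neighbor taps to cancel it, so flat image content maps to zero and only local noise-level deviations survive; a median filter provides a complementary denoised reference. The 1-D taps below are a toy instance of the constraint, not the filters actually learned in the paper.

```python
def median_filter(signal, k=3):
    """Median of each k-neighbourhood (the two edge samples are kept)."""
    half = k // 2
    out = list(signal)
    for i in range(half, len(signal) - half):
        out[i] = sorted(signal[i - half:i + half + 1])[half]
    return out

def constrained_highpass(signal):
    """1-D analogue of a constrained convolution: centre tap -1,
    neighbour taps sum to +1, so constant regions produce exactly 0."""
    out = [0.0] * len(signal)
    for i in range(1, len(signal) - 1):
        out[i] = 0.5 * signal[i - 1] - signal[i] + 0.5 * signal[i + 1]
    return out
```

In the real network the neighbor taps are trainable and renormalized after every update so the zero-response-to-content constraint is maintained.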
DOI: 10.1145/3404555.3404564 · Published: 2020-04-23
Citations: 8
Digital-Display Temperature and Humidity Instrument Recognition Based on YOLOv3 and Character Structure Clustering
Lei Geng, Fengfeng Yan, Zhitao Xiao, Fang Zhang, Yanbei Liu
In this paper, to more efficiently verify digital-display temperature and humidity instruments and better evaluate their quality, we propose a new recognition method based on YOLOv3 and character structure clustering. First, since the screen region of a digital-display instrument contains all the valid characters, we define the smallest bounding rectangle of the screen region as the region of interest and extract it with a YOLOv3-tiny neural network. We then use a YOLOv3 network to detect the characters within the region of interest. Finally, exploiting the intra-class correlation of the characters, we use character structure clustering to obtain the temperature and humidity values. We verify the effectiveness of this method through experiments.
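After character detection, grouping the detected boxes into display rows (e.g., a temperature row and a humidity row) is a natural clustering criterion. The sketch below groups boxes by vertical center and orders each row left to right; the `(x1, y1, x2, y2)` box format and the pixel tolerance are assumptions, and this is only one plausible form of the paper's structure clustering.

```python
def cluster_rows(boxes, tol=10):
    """Group detected character boxes into display rows by the vertical
    centre of each box, then sort each row left-to-right so the digits
    can be read off in order."""
    rows = []  # each entry: [anchor_cy, [boxes in that row]]
    for box in sorted(boxes, key=lambda b: (b[1] + b[3]) / 2):
        cy = (box[1] + box[3]) / 2
        if rows and abs(cy - rows[-1][0]) <= tol:
            rows[-1][1].append(box)
        else:
            rows.append([cy, [box]])
    return [sorted(group, key=lambda b: b[0]) for _, group in rows]
```

Concatenating the recognized digits of each row then yields one reading per row, e.g. temperature on top and humidity below.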
DOI: 10.1145/3404555.3404623 · Published: 2020-04-23
Citations: 0
Learning the Front-End Speech Feature with Raw Waveform for End-to-End Speaker Recognition
Ningxin Liang, W. Xu, Chengfang Luo, Wenxiong Kang
State-of-the-art deep neural network-based speaker recognition systems tend to follow the paradigm of speech feature extraction followed by speaker classifier training, a "divide and conquer" approach. These methods usually rely on fixed, handcrafted features such as Mel-frequency cepstral coefficients (MFCCs) to preprocess the waveform before the classification pipeline. In this paper, inspired by successful and promising work on modelling a system directly from the raw speech signal for applications such as speech recognition, anti-spoofing, and emotion recognition, we present an end-to-end speaker recognition system combining a front-end raw waveform feature extractor, a back-end speaker embedding classifier, and an angle-based loss optimizer. Specifically, the proposed front-end raw waveform feature extractor is a trainable alternative to MFCCs that requires no modification of the acoustic model. We detail the superiority of the raw waveform feature extractor, namely its use of a time convolution layer to reduce temporal variations, aiming to adaptively learn a front-end speech feature representation through supervised training together with the rest of the classification model. Our experiments, conducted on the CSTR VCTK Corpus, demonstrate that the proposed end-to-end speaker recognition system achieves state-of-the-art performance compared to baseline models.
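The time-convolution front end amounts to sliding a bank of learnable kernels over the raw samples, each kernel producing one feature channel. A minimal sketch (kernel values would be learned in practice; the list-based signature is illustrative):

```python
def time_conv(waveform, kernels, stride=1):
    """Slide each kernel over the raw waveform; each output row is one
    learned front-end feature channel (a crude, trainable stand-in for
    one filterbank band of an MFCC pipeline)."""
    k = len(kernels[0])
    feats = []
    for kernel in kernels:
        row = []
        for start in range(0, len(waveform) - k + 1, stride):
            row.append(sum(w * x for w, x in
                           zip(kernel, waveform[start:start + k])))
        feats.append(row)
    return feats
```

With a stride greater than one, the layer also downsamples, which is how it reduces temporal variation before the embedding network.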
DOI: 10.1145/3404555.3404571 · Published 2020-04-23 · Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence
Citations: 0
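The speaker-recognition abstract above mentions an "angle-based loss optimizer" without naming the variant; additive-margin softmax (AM-Softmax) is one common instance in speaker recognition, and the sketch below assumes that choice. The scale `s` and margin `m` values are illustrative defaults, not taken from the paper.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def am_softmax_logits(emb, class_weights, target, s=30.0, m=0.35):
    """Additive-margin logits: subtract margin m from the target class's
    cosine before scaling, penalizing the angle to the true speaker."""
    logits = []
    for j, w in enumerate(class_weights):
        c = cosine(emb, w)
        logits.append(s * (c - m if j == target else c))
    return logits

# Embedding aligned with speaker 0's weight vector:
print(am_softmax_logits([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], target=0))
```

Because the logits depend only on the angle between embedding and class weight, not on vector magnitudes, training pushes same-speaker embeddings into a tight angular cone, which is the property angle-based losses are chosen for in speaker verification.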
Journal
Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence