
Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence: Latest Articles

Adaptive Importance Pooling Network for Scene Text Recognition
Peng Ren, Qingsong Yu, Xuanqi Wu, Ziyang Wang
Scene text recognition (STR) has attracted extensive attention in the pattern recognition community. With the development of deep learning, object detection and sequence recognition schemes based on deep neural networks have been widely used for this task. Discriminative features play a vital role when text appears against complex scene backgrounds. However, for specific tasks, inappropriate pooling strategies may lose feature details. To tackle this problem, this paper proposes an end-to-end adaptive importance pooling network (AIPN). Concretely, we embed the novel adaptive importance pooling (AIP) strategy into the feature extraction stage. Additionally, we adopt an attention-based LSTM as the decoder so that the model automatically focuses on useful image feature regions while predicting the final recognition results. Furthermore, to reduce the burden of feature representation for the subsequent recognition stage, a text rectification network (TRN) supervised by the text recognition part is used to normalize the input text images. Experimental results show that our model achieves encouraging performance on the STR benchmark datasets IIIT5K, SVT, ICDAR-2003 and ICDAR-2013.
DOI: 10.1145/3404555.3404614 (https://doi.org/10.1145/3404555.3404614); published 2020-04-23
Citations: 1
Research on Navigation and Path Planning of Mobile Robot Based on Vision Sensor
Yingze Mu, Chao-Yi Dong, Qi-Ming Chen, Bochen Li, Zhi-Qiang Fan
Autonomous positioning and map construction in unknown environments are crucial for a mobile robot's obstacle avoidance and path planning. In this paper, an improved ORB-SLAM2 algorithm (ORB: Oriented FAST and Rotated BRIEF; SLAM: Simultaneous Localization and Mapping) is used to localize the robot and construct a 3D point cloud map of its environment. The improved ORB-SLAM2 scheme works as follows. First, after the environment map is constructed, a map-saving function is added to support map type conversion and obstacle-avoiding navigation. Then the Point Cloud Library (PCL) is used to convert the saved 3D point cloud map into an octomap. A path planning algorithm for mobile robots is implemented on the basis of the octomap: the robot's dynamic global path planning uses a Rapidly-exploring Random Tree (RRT) algorithm. The experimental results of map construction and path planning show that the proposed scheme can effectively realize obstacle avoidance and path planning for the mobile robot. The algorithm thus provides a basis for further realizing the mobile robot's autonomous movement.
DOI: 10.1145/3404555.3404589 (https://doi.org/10.1145/3404555.3404589); published 2020-04-23
Citations: 2
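The RRT planner named in the abstract above can be illustrated with a minimal 2D sketch. Everything here is an assumption made for illustration: the paper plans over a 3D octomap, whereas this sketch uses a hypothetical `is_free(point)` callback as a stand-in for an occupancy query, and the function name `rrt`, the goal-bias probability and the step size are not taken from the paper.

```python
import math
import random


def rrt(start, goal, is_free, bounds, step=0.5, goal_tol=0.5,
        max_iters=5000, seed=0):
    """Grow a tree from `start` toward random samples.

    Returns a list of points from start to a node within `goal_tol`
    of `goal`, or None if no such node is found in `max_iters` tries.
    `is_free` is a hypothetical occupancy test (True = traversable).
    """
    rng = random.Random(seed)
    (xmin, xmax), (ymin, ymax) = bounds
    nodes = [start]
    parent = {0: None}  # node index -> parent index
    for _ in range(max_iters):
        # Sample the goal 10% of the time to bias growth toward it.
        if rng.random() < 0.1:
            sample = goal
        else:
            sample = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        # Nearest existing tree node to the sample.
        i_near = min(range(len(nodes)),
                     key=lambda i: math.dist(nodes[i], sample))
        near = nodes[i_near]
        d = math.dist(near, sample)
        if d == 0.0:
            continue
        # Move from the nearest node toward the sample by at most `step`.
        t = min(step, d) / d
        new = (near[0] + t * (sample[0] - near[0]),
               near[1] + t * (sample[1] - near[1]))
        # Simplification: only the new point is checked, not the whole edge.
        if not is_free(new):
            continue
        nodes.append(new)
        parent[len(nodes) - 1] = i_near
        if math.dist(new, goal) <= goal_tol:
            # Walk parent links back to the root to recover the path.
            path, i = [], len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None
```

A real planner over an octomap would replace `is_free` with an occupancy lookup and would also check the segment between the nearest node and the new node for collisions.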
Cutting Piece and CAD Matching Method Based on Feature Retrieval and Shape Segmentation
Lei Geng, Changshun Yin, Zhitao Xiao, Fang Zhang, Jun Wu
To accurately measure the deviation between car seat cutting pieces and CAD templates, and thereby evaluate the production quality of the cutting pieces, this paper proposes an algorithm for matching car seat cutting pieces to CAD templates based on feature retrieval and shape segmentation. The algorithm processes cutting piece images collected by an acquisition system that combines a backlight board and a CCD camera. First, according to the geometric characteristics of the CAD templates, a CAD retrieval method based on image edge shape features is proposed. Then, in view of the flexible nature of car seat cutting pieces, a matching algorithm based on shape segmentation is proposed. Finally, the coordinate systems of the cutting piece and the CAD template are unified by an affine transformation, and the deviation between the two is calculated. A large number of experiments performed in a field of view of 700 mm × 500 mm show that the proposed method can effectively improve the matching accuracy between cutting pieces and CAD templates. Experimental results verify the effectiveness of the proposed method.
DOI: 10.1145/3404555.3404611 (https://doi.org/10.1145/3404555.3404611); published 2020-04-23
Citations: 0
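The final step of the abstract above, unifying the two coordinate systems with an affine transformation and then measuring deviation, can be sketched as a least-squares fit over matched points. This is an illustrative sketch only: the abstract does not say how the transformation is estimated, and the names `fit_affine`, `apply_affine` and `deviations`, as well as the assumption that point correspondences are already known, are hypothetical.

```python
import math


def solve3(m, r):
    """Solve the 3x3 system m @ x = r by Gaussian elimination with pivoting."""
    a = [row[:] + [rv] for row, rv in zip(m, r)]  # augmented matrix
    for col in range(3):
        pivot = max(range(col, 3), key=lambda i: abs(a[i][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for i in range(col + 1, 3):
            f = a[i][col] / a[col][col]
            for j in range(col, 4):
                a[i][j] -= f * a[col][j]
    x = [0.0] * 3
    for i in (2, 1, 0):  # back substitution
        tail = sum(a[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = (a[i][3] - tail) / a[i][i]
    return x


def fit_affine(src, dst):
    """Least-squares 2D affine map taking `src` points onto `dst` points.

    Returns ((a, b, tx), (c, d, ty)) with u = a*x + b*y + tx and
    v = c*x + d*y + ty. Needs at least three non-collinear points.
    """
    rows = [(x, y, 1.0) for x, y in src]
    # Normal equations: (R^T R) p = R^T u, solved once per output axis.
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    ru = [sum(r[i] * u for r, (u, _) in zip(rows, dst)) for i in range(3)]
    rv = [sum(r[i] * v for r, (_, v) in zip(rows, dst)) for i in range(3)]
    return solve3(m, ru), solve3(m, rv)


def apply_affine(params, pts):
    """Map each point through the fitted affine transformation."""
    (a, b, tx), (c, d, ty) = params
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in pts]


def deviations(params, src, dst):
    """Per-point Euclidean distance between mapped src points and dst."""
    mapped = apply_affine(params, src)
    return [math.dist(p, q) for p, q in zip(mapped, dst)]
```

With CAD template points as `src` and measured cutting piece points as `dst`, the values returned by `deviations` are in the units of `dst`, so a calibrated camera would be needed to express them in millimetres.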
Hierarchical Label Embedding Networks for Financial Document Sentiment Analysis
Ping Yao, Qinke Peng, Tian Han
With the rapid development of the Internet, document data have become an important source of information in the financial field, and document sentiment analysis in this field has attracted increasing attention. Manually extracting sentiment from a large number of financial documents is impractical, but natural language processing (NLP) technology can address this problem. This paper focuses on research reports on listed companies, a kind of long financial document published by experts in the field. We propose a hierarchical label embedding neural network model for sentiment analysis of financial documents. The model adopts a hierarchical network structure to capture the structural information of financial documents. It also includes an expression embedding mechanism for focusing on important content. We assume that most of the words and sentences in a document are consistent with the sentiment of the label assigned by the author. The label embedding mechanism can therefore pay more attention to content that is consistent with the label's sentiment while building the document's hierarchical representation. Experiments show that our method is more effective than other advanced methods on the established dataset.
DOI: 10.1145/3404555.3404583 (https://doi.org/10.1145/3404555.3404583); published 2020-04-23
Citations: 2
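One common way to make a model "pay more attention" to content that agrees with a label is dot-product attention between word vectors and a label vector. The sketch below shows that generic idea only. It is an assumption, not the paper's architecture: the abstract does not specify the scoring function, and `label_attention` and its inputs are hypothetical names.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_attention(word_vecs, label_vec):
    """Weight words by their similarity to a label embedding.

    Returns (weights, pooled): one attention weight per word, and the
    attention-weighted average of the word vectors.
    """
    # Dot product as the (assumed) compatibility score.
    scores = [sum(w * l for w, l in zip(vec, label_vec)) for vec in word_vecs]
    weights = softmax(scores)
    dim = len(label_vec)
    pooled = [sum(wt * vec[k] for wt, vec in zip(weights, word_vecs))
              for k in range(dim)]
    return weights, pooled
```

In a hierarchical model the same pooling could be applied twice, once over words to build sentence vectors and once over sentences to build the document vector.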
A New Object Detection Algorithm Based on YOLOv3 for Lung Nodules
Kejia Xu, Hong Jiang, Wen-Gen Tang
Lung cancer has always threatened people's health and lives. Lung nodules, as early signs of lung cancer, have important clinical significance and research value for its diagnosis. The features captured by a traditional convolutional neural network are limited; in addition, the traditional YOLO method suffers from low accuracy and inaccurate localization. To address these problems, this paper proposes a new algorithm based on YOLOv3 for detecting lung nodules. Inception ResBlocks are added to the feature network of YOLOv3 so that the network can extract richer feature information. Furthermore, a new bounding box regression loss function, the GDIoU loss, is proposed; it makes bounding box regression more accurate and further improves lung nodule detection performance. In experiments, the model reaches an AP of 83.5% and a sensitivity of 92.6%. The proposed method performs well in terms of localization accuracy and detection rate, and can avoid false detections and missed detections to a certain extent. It provides a new approach to the detection of lung nodules.
DOI: 10.1145/3404555.3404609 (https://doi.org/10.1145/3404555.3404609); published 2020-04-23
Citations: 5
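The abstract above does not define the GDIoU loss, so it is not reproduced here. As background, the sketch below computes two widely used IoU-based regression losses, GIoU and DIoU, for axis-aligned boxes given as `(x1, y1, x2, y2)`. The function names are chosen for illustration, and any relationship between these losses and GDIoU is an assumption.

```python
def _overlap_terms(a, b):
    """Intersection area, union area and enclosing box of two boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # Smallest axis-aligned box that contains both inputs.
    enclose = (min(a[0], b[0]), min(a[1], b[1]),
               max(a[2], b[2]), max(a[3], b[3]))
    return inter, union, enclose


def iou(a, b):
    """Intersection over union of two boxes with positive area."""
    inter, union, _ = _overlap_terms(a, b)
    return inter / union


def giou_loss(a, b):
    """1 - GIoU, where GIoU = IoU - (enclosing area - union) / enclosing area."""
    inter, union, (ex1, ey1, ex2, ey2) = _overlap_terms(a, b)
    enclose_area = (ex2 - ex1) * (ey2 - ey1)
    return 1.0 - (inter / union - (enclose_area - union) / enclose_area)


def diou_loss(a, b):
    """1 - DIoU, where DIoU = IoU - (center distance)^2 / (enclosing diagonal)^2."""
    inter, union, (ex1, ey1, ex2, ey2) = _overlap_terms(a, b)
    dx = (a[0] + a[2]) / 2 - (b[0] + b[2]) / 2
    dy = (a[1] + a[3]) / 2 - (b[1] + b[3]) / 2
    diag2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    return 1.0 - (inter / union - (dx * dx + dy * dy) / diag2)
```

Unlike a plain `1 - IoU` loss, both variants still change when the boxes do not overlap at all, which is one reason such penalties are used for bounding box regression.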
Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence
DOI: 10.1145/3404555 (https://doi.org/10.1145/3404555); published 2020-04-23
Citations: 0
Human Activities of Daily Living Recognition with Graph Convolutional Network
N. Chinpanthana, Yunyu Liu
A rapidly growing population presents many challenges to healthcare and security surveillance around the world. Human activity recognition is an active research area concerned with recognizing and understanding various activities. Many researchers extract and represent the details of human body gestures to determine human activity or action. The results, however, are still unsatisfactory due to the inclusion of irrelevant images. The model is rather rudimentary and is not specific enough to represent the meaning of images. In this paper, we propose a four-step methodology for recognizing human activities of daily living: (1) text-based embedding concept, (2) semi-supervised graph node, (3) graph convolution network, and (4) measurement and evaluation. The experimental results indicate that the proposed approach offers significant performance improvements on data set 2 in the 10-fold setting, with a maximum of 79.34%.
DOI: 10.1145/3404555.3404557 (https://doi.org/10.1145/3404555.3404557); published 2020-04-23
Citations: 1
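Step (3) of the methodology above relies on a graph convolution network. The abstract gives no layer definition, so the sketch below assumes the commonly used propagation rule H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W), written with plain Python lists. The function names and the tiny example graph are illustrative only.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def normalize_adjacency(adj):
    """Add self-loops, then scale symmetrically by node degree."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def gcn_layer(adj, feats, weights):
    """One graph convolution: aggregate neighbors, project, apply ReLU."""
    aggregated = matmul(normalize_adjacency(adj), feats)
    projected = matmul(aggregated, weights)
    return [[max(0.0, v) for v in row] for row in projected]


# Two connected nodes with one-hot features: each node ends up with the
# average of both feature rows before the weight matrix is applied.
adjacency = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0, 0.0], [0.0, 1.0]]
layer_weights = [[1.0, -1.0], [1.0, -1.0]]
output = gcn_layer(adjacency, features, layer_weights)
```

In a semi-supervised setting, the loss would typically be computed only on the labeled nodes while the propagation still uses the whole graph.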
Traffic Condition Prediction of Urban Roads Based on Neural Network
Ruyi Zhu
Real-time and reliable traffic flow estimation is the basis of urban traffic management and control. However, existing research focuses on using the historical data of monitored intersections to predict future traffic conditions. To our knowledge, few effective algorithms infer the real-time traffic state of unmonitored intersections from limited road surveillance using traffic information in the urban road system. In this paper, we introduce a new solution to the traffic flow prediction task that uses traffic data, in particular taxi historical data, traffic network data and intersection historical data. The proposed solution takes advantage of GCN and CGAN, and we improve U-Net to realize an important part of the generator. We then capture the relationship between intersections with and without surveillance through floating taxis that cover the whole city. The CGAN framework can adjust the weights and enhance the inference ability to generate the complete traffic status under current conditions. The experimental results show that our method is superior to other methods in the accuracy of traffic volume inference.
DOI: 10.1145/3404555.3404621 (https://doi.org/10.1145/3404555.3404621); published 2020-04-23
Citations: 0
Face Recognition Method for Enterprise Workstations Based on Convolutional Neural Network Optimization Algorithm
Naiyuan Tian, Xiangyun Zhang, Tian Liu, Chen-Xia Zhao
The world economy is developing rapidly and new Internet companies are emerging. To improve employees' working efficiency, enhance the competitiveness of enterprises, and help managers keep track of employees' working status at any time, this paper proposes a face recognition method for enterprise positions based on a convolutional neural network (CNN) optimization algorithm. First, an enterprise employee face classification model is established on the TensorFlow deep learning framework. A convolutional neural network then extracts features from employee face images, and the Keras deep learning library is used to train the face recognition model. Finally, the momentum gradient descent optimization method supported by TensorFlow is used to optimize the CNN model, and the loss function is used to evaluate the model's performance, thereby improving the recognition accuracy of the face recognition algorithm. The proposed algorithm is applied in practice to identify the working status of employees. Its validity is verified by questionnaire results and by comparison with typical face recognition algorithms. The experimental results show that, to some extent, the proposed method has higher recognition accuracy and better practicality, which will help companies find out the working status of their employees.
DOI: 10.1145/3404555.3404585 (https://doi.org/10.1145/3404555.3404585); published 2020-04-23
Citations: 0
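The momentum gradient descent mentioned in the abstract above can be illustrated on a one-dimensional problem. This is a minimal sketch of the classical momentum update, not the paper's training code: the function `momentum_descent`, the toy objective and the hyperparameter values are all assumptions made for illustration.

```python
def momentum_descent(grad, w0, lr=0.1, momentum=0.9, steps=300):
    """Minimize a function of one variable given its gradient `grad`.

    The velocity accumulates an exponentially decaying sum of past
    gradients, and the parameter moves against that accumulated direction.
    """
    w, velocity = w0, 0.0
    for _ in range(steps):
        velocity = momentum * velocity + grad(w)
        w -= lr * velocity
    return w


# Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3); minimum at w = 3.
w_final = momentum_descent(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

Setting `momentum=0.0` reduces the loop to plain gradient descent, which makes it easy to compare the two on the same objective.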
End-to-End Image Reconstruction of Image from Human Functional Magnetic Resonance Imaging Based on the "Language" of Visual Cortex
Ziya Yu, Kai Qiao, Chi Zhang, Linyuan Wang, Bin Yan
In recent years, with the development of deep learning, the integration of neuroscience and computer vision has deepened. In computer vision, deep learning has made it possible to generate images from text and to derive semantic understanding from images. Here, text refers to human language, which typically must be encoded before a computer can process it. The human brain's visual system likewise produces "descriptions" of visual stimuli, that is, a "language" generated by the brain itself. Reconstructing visual information means reconstructing visual stimuli from the brain's representation of them, and it is the most difficult task in visual decoding. Given existing research on visual mechanisms, it is still difficult to understand the "language" of the human brain. Inspired by text-to-image generation, we regarded voxel responses as the "language" of the brain in order to reconstruct visual stimuli, and built an end-to-end visual decoding model for the small-sample setting. We simply retrained a generative adversarial network (GAN) originally used for text-to-image generation on 1200 training samples (natural image stimuli and the corresponding voxel responses). We treated voxel responses as the brain's semantic information and fed them to the GAN as prior information. The results showed that the trained decoding model can successfully reconstruct natural images. They also suggest that reconstructing visual stimuli from "brain language" is feasible, and that the end-to-end model is more likely to learn the direct mapping between brain activity and visual perception. Moreover, they further indicate the great potential of combining neuroscience and computer vision.
DOI: https://doi.org/10.1145/3404555.3404593 (published 2020-04-23)
Citations: 0
Journal
Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence