Pub Date: 2019-11-01 DOI: 10.1109/ICTAI.2019.00216
Nianwen Ning, Chenguang Song, Pengpeng Zhou, Yunlei Zhang, Bin Wu
Network embedding aims to learn a latent representation of each node that preserves structural information. Many real-world networks have multiple dimensions of nodes and multiple types of relations, so it is more appropriate to represent them as multiplex networks. A multiplex network is formed by a set of nodes connected in different layers by links indicating interactions of different types. However, existing random-walk-based embedding algorithms for multiplex networks suffer from sampling bias and imbalanced relation types, which leads to poor performance in downstream tasks. In this paper, we propose FFME, a node embedding method for multiplex networks based on adaptive cross-layer forest fire sampling (FFS). We first focus on the sampling strategies of FFS to address the bias of random walks. We use a fixed-length queue to record previously visited layers, which balances the edge distribution over different layers in the sampled node sequences. In addition, to adaptively sample a node's context, we propose a node metric called the Neighbors Partition Coefficient (NPC). The generation of node sequences is supervised by NPC for adaptive cross-layer sampling. Experiments on real-world networks in diverse fields show that our method outperforms state-of-the-art methods in application tasks such as cross-domain link prediction and shared community structure detection.
{"title":"An Adaptive Cross-Layer Sampling-Based Node Embedding for Multiplex Networks","authors":"Nianwen Ning, Chenguang Song, Pengpeng Zhou, Yunlei Zhang, Bin Wu","doi":"10.1109/ICTAI.2019.00216","url":"https://doi.org/10.1109/ICTAI.2019.00216","journal":{"name":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)"},"publicationDate":"2019-11-01"}
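The FFME abstract above mentions a fixed-length queue of previously visited layers that balances how often each layer contributes edges to a sampled sequence. The following is only a minimal sketch of that one idea, under loud assumptions: it uses a plain walk rather than forest fire sampling, it does not implement NPC (the abstract does not define it), and the function name `layer_balanced_walk` and its parameters are hypothetical.

```python
import random
from collections import Counter, deque

def layer_balanced_walk(layers, start, length, queue_len=4, seed=0):
    """Walk over a multiplex network, favouring layers used least recently.

    layers: dict mapping layer name -> adjacency dict {node: [neighbors]}.
    Returns a list of (node, layer_used_to_reach_it) pairs.
    Hypothetical helper: not the paper's FFS procedure and no NPC.
    """
    rng = random.Random(seed)
    recent = deque(maxlen=queue_len)  # fixed-length record of visited layers
    node, walk = start, [(start, None)]
    for _ in range(length):
        # Layers in which the current node has at least one neighbor.
        usable = [name for name, adj in layers.items() if adj.get(node)]
        if not usable:
            break
        counts = Counter(recent)
        fewest = min(counts[name] for name in usable)
        # Among usable layers, keep those that appear least often in the queue.
        candidates = [name for name in usable if counts[name] == fewest]
        layer = rng.choice(candidates)
        node = rng.choice(layers[layer][node])
        recent.append(layer)
        walk.append((node, layer))
    return walk
```

A longer queue remembers more history and pushes the walk harder toward under-used layers; a queue of length 1 merely avoids repeating the last layer when an alternative exists.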
Pub Date: 2019-11-01 DOI: 10.1109/ICTAI.2019.00045
Zhixin Li, Yaru Sun, Suqin Tang, Canlong Zhang, Huifang Ma
Intelligent information processing for standard Zhuang, a language spoken mainly in Southern China, is presently in its infancy and lacks a well-defined language corpus and automatic part-of-speech tagging methods. This study therefore proposes an adversarial part-of-speech tagging method based on reinforcement learning, which addresses the lack of a language corpus, time-consuming and laborious manual annotation, and the low performance of machine annotation. First, we construct a markup dictionary based on the grammatical characteristics of standard Zhuang and the Penn Chinese Treebank. Second, dependency syntax analysis is applied to construct the semantic feature vectors of sentences; long short-term memory is adopted as the policy network architecture to enhance the available information using recurrent memory; and a conditional random field is employed as the discriminant network to perform label inference with global normalization. Finally, we use reinforcement learning as the model framework, with target parts of speech as the feedback of the environment, and obtain the optimal policy through adversarial learning. The results show that combining reinforcement learning with an adversarial network alleviates the model's dependence on the training corpus to some extent, and can quickly and effectively expand the annotation dictionary for the Zhuang language, thereby obtaining better labeling results.
{"title":"Sentence-Level Semantic Features Guided Adversarial Network for Zhuang Language Part-of-Speech Tagging","authors":"Zhixin Li, Yaru Sun, Suqin Tang, Canlong Zhang, Huifang Ma","doi":"10.1109/ICTAI.2019.00045","url":"https://doi.org/10.1109/ICTAI.2019.00045","journal":{"name":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)"},"publicationDate":"2019-11-01"}
Pub Date: 2019-11-01 DOI: 10.1109/ICTAI.2019.00222
Christopher Doyle, Maxime Guériau, Ivana Dusparic
With increasing applications of reinforcement learning to real-life problems, it is becoming essential that agents are able to update their knowledge continually. Lifelong learning approaches aim to enable agents to retain the knowledge they learn and to selectively transfer knowledge to new tasks. Recent techniques for lifelong reinforcement learning have shown great success in getting an agent to generalise over several tasks. However, scalability becomes an issue when agents learn numerous tasks, as each task's information must be remembered. To address this issue, this paper proposes Variational Policy Chaining (VPC), which enables a reinforcement learning agent to generalise effectively in a scalable manner when presented with continuous task updates, without storing multiple historic experiences. VPC uses a Kullback-Leibler divergence-based method to isolate the most common pieces of knowledge, and condenses the important knowledge into a single policy chain. We evaluate VPC in a GridWorld environment and compare it to vanilla policy gradient methods, showing that VPC's ability to reuse knowledge from previously encountered tasks reduces learning time in new tasks by up to 50%.
{"title":"Variational Policy Chaining for Lifelong Reinforcement Learning","authors":"Christopher Doyle, Maxime Guériau, Ivana Dusparic","doi":"10.1109/ICTAI.2019.00222","url":"https://doi.org/10.1109/ICTAI.2019.00222","journal":{"name":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)"},"publicationDate":"2019-11-01"}
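The VPC abstract above relies on Kullback-Leibler divergence to find knowledge shared across tasks but does not say how the divergence is applied. As a hedged illustration only, the sketch below computes KL divergence between two discrete action distributions and ranks states by how closely two tabular policies agree. The helper `most_shared_states` and the tabular-policy representation are assumptions for the example, not the paper's method.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    # Terms with p_i == 0 contribute nothing; q_i is floored to avoid log(x / 0).
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def most_shared_states(policy_a, policy_b, top_k=1):
    """Rank states by agreement between two tabular policies (small KL first).

    Hypothetical helper: each policy maps state -> list of action probabilities.
    """
    scores = {s: kl_divergence(policy_a[s], policy_b[s]) for s in policy_a}
    return sorted(scores, key=scores.get)[:top_k]
```

States with near-zero divergence are the ones where the two task policies behave alike, which is one plausible reading of "the most common pieces of knowledge".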
Pub Date: 2019-11-01 DOI: 10.1109/ICTAI.2019.00-91
Ralph Samer, Martin Stettinger, Müslüm Atas, A. Felfernig, G. Ruhe, Gouri Deshpande
There is a high demand for intelligent decision support systems that assist stakeholders in requirements engineering tasks. Examples of such tasks are the elicitation of requirements, release planning, and the identification of requirement dependencies. In particular, the detection of dependencies between requirements is a major challenge for stakeholders. In this paper, we present two content-based recommendation approaches that automatically detect and recommend such dependencies. The first approach identifies potential dependencies between requirements defined on a textual level by exploiting document classification techniques (based on Linear SVM, Naive Bayes, Random Forest, and k-Nearest Neighbors). This approach is evaluated with two different feature types (TF-IDF features vs. probabilistic features). The second recommendation approach is based on Latent Semantic Analysis and serves as the baseline for the evaluation with a real-world data set. The evaluation shows that the recommendation approach based on Random Forest using probabilistic features achieves the best prediction quality of all approaches (F1: 0.89).
{"title":"New Approaches to the Identification of Dependencies between Requirements","authors":"Ralph Samer, Martin Stettinger, Müslüm Atas, A. Felfernig, G. Ruhe, Gouri Deshpande","doi":"10.1109/ICTAI.2019.00-91","url":"https://doi.org/10.1109/ICTAI.2019.00-91","journal":{"name":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)"},"publicationDate":"2019-11-01"}
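The requirements-dependency abstract above names TF-IDF features and reports an F1 score. For readers unfamiliar with either, here is a minimal standard-library sketch of both. It assumes whitespace tokenization and the plain `tf * log(N / df)` weighting, which may differ from the authors' setup, and it does not cover their "probabilistic features", which the abstract does not define. Both function names are illustrative.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * math.log(n / df[t]) for t, c in tf.items()})
    return vectors

def f1_score(y_true, y_pred):
    """F1 for binary labels, where 1 marks a dependency between two requirements."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Terms that occur in every requirement get weight zero, so only distinguishing vocabulary is passed to a downstream classifier.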
Pub Date: 2019-11-01 DOI: 10.1109/ICTAI.2019.00106
Yuanyuan Pan, Jun Gan, Xiangying Ran, Chong-Jun Wang
Aspect-based sentiment analysis (ABSA) is a fine-grained sentiment analysis problem that has received increasing attention in recent years. Convolutional Neural Networks and their variants have recently shown potential for tackling the problem. Building upon this line of research, we propose a novel architecture for ABSA named Multi-Granularity Position-Aware Convolutional Memory Network (MP-CMN). MP-CMN uses multiple convolutional layers to extract features of different granularities and build the convolutional memories, and then incorporates aspect information and position information into the convolutional memory network via an attention mechanism. To make the mechanism of our model clear, we also present visualizations and case studies. Experimental results on the standard SemEval 2014 datasets demonstrate the effectiveness of the proposed model.
{"title":"Multi-granularity Position-Aware Convolutional Memory Network for Aspect-Based Sentiment Analysis","authors":"Yuanyuan Pan, Jun Gan, Xiangying Ran, Chong-Jun Wang","doi":"10.1109/ICTAI.2019.00106","url":"https://doi.org/10.1109/ICTAI.2019.00106","journal":{"name":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)"},"publicationDate":"2019-11-01"}
Pub Date: 2019-11-01 DOI: 10.1109/ICTAI.2019.00022
Keisuke Otaki, Satoshi Koide, K. Hayakawa, Ayano Okoso, Tomoki Nishi
Cooperation among different vehicles is a promising concept for Mobility as a Service (MaaS). A principal problem in MaaS is optimizing vehicle routes to reduce the total travel cost through cooperation. For example, platooning among large trucks can reduce fuel cost because it decreases air resistance. Traditional platoon models, however, cannot capture cooperation among different types of vehicles because they assume homogeneous vehicle types. We therefore propose a model that permits heterogeneous cooperation. Targets of our model include a logistics scenario in which a truck for long-distance delivery also carries small self-driving vehicles for last-mile delivery. To this end, we formalize a new route optimization problem with heterogeneous cooperation and provide an integer programming (IP) formulation that serves as an exact solver. We evaluate our formulation through numerical experiments using synthetic and real graphs. We also validate our concept of heterogeneous cooperation for MaaS with examples.
{"title":"Multi-agent Path Planning with Heterogeneous Cooperation","authors":"Keisuke Otaki, Satoshi Koide, K. Hayakawa, Ayano Okoso, Tomoki Nishi","doi":"10.1109/ICTAI.2019.00022","url":"https://doi.org/10.1109/ICTAI.2019.00022","journal":{"name":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)"},"publicationDate":"2019-11-01"}
Pub Date: 2019-11-01 DOI: 10.1109/ICTAI.2019.00079
Xiu Li, Guichun Duan, Zhouxia Wang, Jimmy S. J. Ren, Yongbing Zhang, Jiawei Zhang, Kaixiang Song
In the past few years, we have witnessed rapid advances in face super-resolution from very low resolution (VLR) images. However, most previous studies focus on solving this problem without explicitly considering the impact of severe real-life image degradation (e.g., blur and noise). We show that robustly recovering details from VLR images is beyond the ability of current state-of-the-art methods. In this paper, we borrow ideas from "facial composite" and propose an alternative approach to tackle this problem. We endow the degraded VLR images with additional cues by integrating existing face components from multiple reference images into a novel learning pipeline with both low-level and high-level semantic loss functions as well as a specialized adversarial training scheme. We show that our method is able to effectively and robustly restore relevant facial details from 16x16 images with extreme degradation. We also tested our approach on real-life images, where it performs favorably against previous methods.
{"title":"Recovering Extremely Degraded Faces by Joint Super-Resolution and Facial Composite","authors":"Xiu Li, Guichun Duan, Zhouxia Wang, Jimmy S. J. Ren, Yongbing Zhang, Jiawei Zhang, Kaixiang Song","doi":"10.1109/ICTAI.2019.00079","url":"https://doi.org/10.1109/ICTAI.2019.00079","journal":{"name":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)"},"publicationDate":"2019-11-01"}
Pub Date: 2019-11-01 DOI: 10.1109/ICTAI.2019.00255
Lin Liu, D. Shi, Dansong Cheng, Maysam Orouskhani
In this paper, we propose a new and effective multi-objective optimization algorithm based on a modified harmony search. The proposed method employs reverse learning in the harmony vector updating equation to enhance its global search ability. Moreover, it adopts a harmony anchoring scheme so that unnecessary exploration is avoided. Experimental studies carried out on eight benchmark problems show satisfactory results and indicate that the proposed algorithm outperforms traditional multi-objective optimization algorithms. Finally, the algorithm is applied to an image segmentation problem.
{"title":"An Advanced Harmony Search Algorithm Based on Harmony Anchoring and Reverse Learning","authors":"Lin Liu, D. Shi, Dansong Cheng, Maysam Orouskhani","doi":"10.1109/ICTAI.2019.00255","url":"https://doi.org/10.1109/ICTAI.2019.00255","journal":{"name":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)"},"publicationDate":"2019-11-01"}
Pub Date: 2019-11-01 DOI: 10.1109/ICTAI.2019.00251
Yi Liang, Changjian Wang, Fangzhao Li, Yuxing Peng, Q. Lv, Yuan Yuan, Zhen Huang
FPN (Feature Pyramid Networks) is one of the most popular object detection networks and improves small object detection by enhancing shallow features. However, limited attention has been paid to improving large object detection via deeper feature enhancement. One existing approach merges the feature maps of different layers into a new feature map for object detection, but this can lead to increased noise and loss of information. Another approach adds a bottom-up structure after the feature pyramid of FPN, which superimposes information from shallow layers onto the deep feature map but weakens the strength of FPN in detecting small objects. To address these challenges, this paper proposes TFPN (Twin Feature Pyramid Networks), which consists of (1) FPN+, a bottom-up structure that improves large object detection; (2) TPS, a Twin Pyramid Structure that improves medium object detection; and (3) an integration of these two with FPN, which significantly improves the detection accuracy of large and medium objects while maintaining the advantage of FPN in small object detection. Extensive experiments on the MSCOCO object detection datasets and the BDD100K autonomous driving dataset demonstrate that TFPN significantly improves over existing models, with a gain of up to 2.2 in detection accuracy (e.g., 36.3 for FPN vs. 38.5 for TFPN on COCO Val-17). With ResNet-50, our method obtains the same accuracy as FPN with ResNet-101 while needing fewer parameters.
{"title":"TFPN: Twin Feature Pyramid Networks for Object Detection","authors":"Yi Liang, Changjian Wang, Fangzhao Li, Yuxing Peng, Q. Lv, Yuan Yuan, Zhen Huang","doi":"10.1109/ICTAI.2019.00251","url":"https://doi.org/10.1109/ICTAI.2019.00251","journal":{"name":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)"},"publicationDate":"2019-11-01"}
Pub Date: 2019-11-01 DOI: 10.1109/ICTAI.2019.00035
Len Du
In this paper we show how to implement a deep neural network that is strictly equivalent (sans floating-point errors) to the verbatim (batch) k-means algorithm or Lloyd's algorithm, when trained with gradient descent. Most interestingly, doing so shows that the k-means algorithm, a staple of "conventional" or "shallow" machine learning, can actually be seen as a special case of deep learning, contrary to the general perception that deep learning is a subset of machine learning. Doing so also automatically introduces yet another unsupervised learning technique into the arsenal of deep learning, which happens to be an example of interpretable deep neural networks as well. Finally, we also show how to utilize the powerful deep learning infrastructures with very little extra effort for adaptation.
{"title":"Shallow Deep Learning: Embedding Verbatim K-Means in Deep Neural Networks","authors":"Len Du","doi":"10.1109/ICTAI.2019.00035","url":"https://doi.org/10.1109/ICTAI.2019.00035","journal":{"name":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)"},"publicationDate":"2019-11-01"}
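The k-means abstract above claims that Lloyd's algorithm can be reproduced exactly by gradient descent. The abstract does not describe the network construction, so the sketch below only demonstrates the well-known identity that makes such a claim plausible: with assignments held fixed, one gradient step on the loss sum ||x - c_k||^2 with per-cluster step size 1 / (2 * n_k) moves each centroid to the mean of its members. The function names and the pure-Python list representation are assumptions for illustration.

```python
def assign(points, centroids):
    """Index of the nearest centroid (squared Euclidean distance) for each point."""
    return [min(range(len(centroids)),
                key=lambda k: sum((x - c) ** 2 for x, c in zip(p, centroids[k])))
            for p in points]

def lloyd_step(points, centroids):
    """Classic update: move each centroid to the mean of its assigned points."""
    labels = assign(points, centroids)
    new = []
    for k, c in enumerate(centroids):
        members = [p for p, a in zip(points, labels) if a == k]
        if not members:
            new.append(list(c))  # leave an empty cluster's centroid where it is
            continue
        new.append([sum(col) / len(members) for col in zip(*members)])
    return new

def gradient_step(points, centroids):
    """One gradient-descent step on sum ||x - c_k||^2 with step size 1 / (2 * n_k)."""
    labels = assign(points, centroids)
    new = []
    for k, c in enumerate(centroids):
        members = [p for p, a in zip(points, labels) if a == k]
        if not members:
            new.append(list(c))
            continue
        # d/dc_k of the loss is 2 * sum_i (c_k - x_i) over the cluster's members.
        grad = [2 * sum(cj - p[j] for p in members) for j, cj in enumerate(c)]
        lr = 1 / (2 * len(members))
        new.append([cj - lr * g for cj, g in zip(c, grad)])
    return new
```

The two functions agree up to floating-point rounding, which matches the abstract's "sans floating-point errors" caveat. A fixed global learning rate would not give this equivalence, since the step size here depends on each cluster's size.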