
2022 International Joint Conference on Neural Networks (IJCNN): Latest Publications

GNN-Detective: Efficient Weakly Correlated Neighbors Distinguishing and Processing in GNN
Pub Date : 2022-07-18 DOI: 10.1109/IJCNN55064.2022.9892051
Jiayang Qiao, Yutong Liu, L. Kong
In various downstream tasks of graph learning, graph neural networks (GNNs) have achieved state-of-the-art (SOTA) performance, a benefit of their special propagation mechanism. The propagation mechanism aggregates attributes from neighbor nodes to obtain expressive node representations, which is pivotal for achieving SOTA performance in various downstream tasks. However, in most graph datasets, the neighborhood of each node may contain weakly correlated neighbors (WCNs), whose attributes may impair the expressiveness of central node representations. Though efforts have been devoted to solving this problem, they merely focus on aggregating fewer of the attributes of WCNs, or even subtracting them. However, WCNs still share some correlated information with the central node, so the correlated information provided by WCNs is underutilized. In this work, we leverage the correlated information provided by WCNs with our proposed method, GNN-Detective. This detective can efficiently and automatically distinguish WCNs, as well as dig out their correlated information in the graph. It is realized by a semi-supervised learning framework, in which the Differential Propagation (DP) module is designed specifically for information triage and utilization. This module can fully leverage the correlated information provided by WCNs and eliminate interference from uncorrelated information. We have conducted semi-supervised node classification tasks on 9 benchmark datasets. Our proposed method is shown to achieve the best performance in processing WCNs, and the evaluation shows that problems such as over-smoothing and overfitting are also mitigated.
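The following is a minimal sketch of the neighbor-aggregation (propagation) step this abstract builds on, plus a hypothetical mask-based down-weighting of WCNs in the spirit of information triage; the function, the mask, and the `alpha` weight are illustrative assumptions, not the paper's Differential Propagation module.

```python
import numpy as np

def propagate(X, A, wcn_mask=None, alpha=0.5):
    """X: (N, d) node attributes; A: (N, N) binary adjacency matrix;
    wcn_mask: optional (N, N) mask, 1 where an edge reaches a WCN (hypothetical)."""
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1)  # avoid divide-by-zero
    if wcn_mask is None:
        return (A @ X) / deg                 # plain mean aggregation over neighbors
    strong = (A * (1 - wcn_mask)) @ X        # strongly correlated neighbors
    weak = (A * wcn_mask) @ X                # weakly correlated neighbors
    return (strong + alpha * weak) / deg     # down-weight, rather than discard, WCNs
```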
Citations: 0
UApredictor: Urban Anomaly Prediction from Spatial-Temporal Data using Graph Transformer Neural Network
Pub Date : 2022-07-18 DOI: 10.1109/IJCNN55064.2022.9892885
Bhumika, D. Das
Urban anomalies are abnormal events, such as a blocked driveway, illegal parking, noise, crime, or crowd gathering, that drastically affect people and policy managers if not handled in time. Predicting these anomalies in their early stages is critical for public safety and the mitigation of economic losses. However, predicting urban anomalies faces various challenges such as complex spatio-temporal relationships, dynamic nature, and data sparsity. This paper proposes a novel end-to-end deep learning based framework, UApredictor, which utilizes stacked spatial-temporal-interaction blocks to predict urban anomalies from multivariate time-series data. We model the problem using an attribute graph, where we represent city regions as nodes and capture inter-region spatial information using a spatial transformer. Further, to capture temporal correlation, we utilize a temporal transformer, and the interaction module retains the complex interaction between spatio-temporal dimensions. Besides, an attention layer is added on top of the spatial-temporal-interaction block to capture information important for predicting urban anomalies. We use the real-world NYC-Urban Anomaly, NYC-Taxi, NYC-POI, NYC-Road Network, NYC-Demographic, and NYC-Weather datasets of New York City to evaluate the urban anomaly prediction framework. The results show that our proposed framework predicts better in terms of F-measure, macro-F1, and micro-F1 than baseline and state-of-the-art models.
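As a rough sketch of the stacked spatial-temporal blocks described above, the module below applies attention across regions at each time step and then across time for each region; the class name, dimensions, and use of plain `nn.MultiheadAttention` are assumptions for illustration, with the paper's interaction module and final attention layer omitted.

```python
import torch
import torch.nn as nn

class SpatialTemporalBlock(nn.Module):
    """Inputs shaped (batch, time, regions, features); names are hypothetical."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.spatial_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.temporal_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):                         # x: (B, T, R, D)
        B, T, R, D = x.shape
        s = x.reshape(B * T, R, D)                # attend across regions per step
        s, _ = self.spatial_attn(s, s, s)
        x = self.norm1(x + s.reshape(B, T, R, D))
        t = x.permute(0, 2, 1, 3).reshape(B * R, T, D)  # attend across time per region
        t, _ = self.temporal_attn(t, t, t)
        t = t.reshape(B, R, T, D).permute(0, 2, 1, 3)
        return self.norm2(x + t)
```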
Citations: 1
Enhanced EfficientNet Network for Classifying Laparoscopy Videos using Transfer Learning Technique
Pub Date : 2022-07-18 DOI: 10.1109/IJCNN55064.2022.9891989
Divya Acharya, Guda Ramachandra Kaladhara Sarma, Kameshwar Raovenkatajammalamadaka
Recent years have seen a lot of interest in surgical data science (SDS) methods and imaging technologies. As a result of these developments, surgeons may execute less invasive procedures. To classify laparoscopic video images of surgical activities into pathology and no-pathology cases, in this research work the authors conducted their investigation using a transfer learning technique named the enhanced ENet (eENet) network, based on the EfficientNet network. Two base versions of the EfficientNet model, ENetB0 and ENetB7, along with the two proposed versions of the EfficientNet network, enhanced EfficientNetB0 (eENetB0) and enhanced EfficientNetB7 (eENetB7), are implemented in the proposed framework using the publicly available GLENDA [1] dataset. The proposed eENetB0 and eENetB7 models classify the features extracted using the transfer learning technique into two classes. For 70–30 data splitting and 10-fold Cross-Validation (10-fold CV), the eENetB0 model achieved maximum classification accuracies of 88.43% and 97.59%, and the eENetB7 model achieved 97.72% and 98.78% accuracy. We also compared the performance of our proposed enhanced versions of EfficientNet (eENetB0 and eENetB7) with the base versions of the models (ENetB0 and ENetB7); among these four models, eENetB7 performed best. For GUI-based visualization purposes, we also created a platform named IAS.ai that detects surgical video clips containing blood and dry scenarios and uses explainable AI for unboxing the deep learning model's performance. IAS.ai is a real-time application of our approach. For further validation, we compared our framework's performance with other leading approaches cited in the literature [2]–[4], showing how well the proposed eENet model compares to existing models and current best practices.
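The standard transfer-learning baseline this framework builds on can be sketched as follows with torchvision; the "enhanced" eENet modifications are not reproduced here, and the freezing strategy and learning rate are assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

# Pretrained EfficientNet-B0 backbone with its classifier replaced by a
# binary head (pathology vs. no pathology). Sketch only: the paper's
# enhancements on top of this baseline are not public here.
model = models.efficientnet_b0(weights=models.EfficientNet_B0_Weights.DEFAULT)
for p in model.features.parameters():
    p.requires_grad = False                 # freeze the pretrained feature extractor

in_features = model.classifier[1].in_features
model.classifier[1] = nn.Linear(in_features, 2)   # two-class head

optimizer = torch.optim.Adam(model.classifier.parameters(), lr=1e-4)  # assumed lr
criterion = nn.CrossEntropyLoss()
```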
Citations: 1
A Link Prediction Model of Dynamic Heterogeneous Network Based on Transformer
Pub Date : 2022-07-18 DOI: 10.1109/IJCNN55064.2022.9892546
Beibei Ruan, Cui Zhu, Wenjun Zhu
Inductive learning, which can embed newly seen, previously unobserved nodes, has always been a research challenge. It is a frequently encountered problem in practical applications of graph networks, but there is little research on dynamic heterogeneous network link prediction. Therefore, we propose a Heterogeneous and Temporal Model Based on Transformer (HT-Trans) for dynamic heterogeneous networks, whose core idea is to introduce a transformer to better integrate neighbor information and capture the network structure. The goal of HT-Trans is to infer proper embeddings for both existing nodes and unseen nodes. Experimental results show that the algorithm proposed in this paper is significantly competitive with baselines on link prediction tasks on three real datasets.
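A minimal sketch of transformer-based neighbor aggregation scored for link prediction, in the spirit of HT-Trans; the encoder depth, mean pooling, and dot-product scorer are assumptions, and the paper's heterogeneous and temporal encodings are omitted.

```python
import torch
import torch.nn as nn

class NeighborEncoder(nn.Module):
    """Encode a node from its neighbors' features; names are hypothetical."""
    def __init__(self, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, neigh_feats):          # (B, num_neighbors, D)
        h = self.encoder(neigh_feats)        # attend over the neighbor set
        return h.mean(dim=1)                 # pooled node embedding

def link_score(enc, feats_u, feats_v):
    """Edge probability between u and v from their encoded neighbor sets."""
    zu, zv = enc(feats_u), enc(feats_v)
    return torch.sigmoid((zu * zv).sum(dim=-1))
```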
Citations: 1
Development of Multi-task Models for Emotion-Aware Gender Prediction
Pub Date : 2022-07-18 DOI: 10.1109/IJCNN55064.2022.9892404
Chanchal Suman, Abhishek Singh, S. Saha, P. Bhattacharyya
With the rise of personalized online services, a huge opportunity for user profiling has developed. Gender plays a very important role for services that rely on information about a user's background. However, due to anonymity and privacy, the gender information of a user is usually unavailable to other users. Social networking sites provide users with many features to express their thoughts and emotions, whether with pictures, emojis, or written text. Based on the idea that female and male users differ somewhat in their post and message contents, social media accounts can be analyzed using their textual posts to determine the user's gender. In this work, we explore different emotion-aided multi-modal gender prediction models. The basic intuition behind our proposed approach is to predict the gender of a user based on the emotional clues present in their multimodal posts, which include texts as well as images. The PAN 2018 dataset is enriched with emotion labels for the experiments. Different multi-tasking based architectures have been developed for gender prediction. The results obtained on the benchmark PAN-2018 dataset illustrate that the proposed multimodal emotion-aided system performs better than the single-modal (text-only and image-only) models as well as the state-of-the-art system.
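A minimal sketch of the multi-task pattern described above: a shared text encoder feeding a gender head and an auxiliary emotion head, trained with a weighted joint loss. The GRU encoder, dimensions, and loss weight are illustrative assumptions, and the paper's image branch is omitted.

```python
import torch
import torch.nn as nn

class EmotionAidedGenderModel(nn.Module):
    """Shared encoder with two task heads; architecture is hypothetical."""
    def __init__(self, vocab_size=30000, d_model=128, n_emotions=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.encoder = nn.GRU(d_model, d_model, batch_first=True)
        self.gender_head = nn.Linear(d_model, 2)
        self.emotion_head = nn.Linear(d_model, n_emotions)

    def forward(self, tokens):               # tokens: (B, seq_len)
        _, h = self.encoder(self.embed(tokens))
        h = h.squeeze(0)                     # shared representation (B, d_model)
        return self.gender_head(h), self.emotion_head(h)

def joint_loss(gender_logits, emotion_logits, y_gender, y_emotion, w=0.5):
    ce = nn.functional.cross_entropy
    return ce(gender_logits, y_gender) + w * ce(emotion_logits, y_emotion)
```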
Citations: 0
Density and Context Aware Network with Hierarchical Head for Traffic Scene Detection
Pub Date : 2022-07-18 DOI: 10.1109/IJCNN55064.2022.9892125
Zuhao Ge, Wenhao Yu, Xian Liu, Lizhe Qi, Yunquan Sun
We investigate traffic scene detection from surveillance cameras and UAVs. This task is rather challenging, mainly due to the spatial nonuniform gathering, large-scale variance, and instance-level imbalanced distribution of vehicles. Most existing methods that employed FPN to enrich features are prone to failure in this scenario. To mitigate the influences above, we propose a novel detector called Density and Context Aware Network(DCANet) that can focus on dense regions and adaptively aggregate context features. Specifically, DCANet consists of three components: Density Map Supervision(DMP), Context Feature Aggregation(CFA), and Hierarchical Head Module(HHM). DMP is designed to capture the gathering information of objects supervised by density maps. CFA exploits adjacent feature layers' relationships to fulfill ROI-level contextual information enhancement. Finally, HHM is introduced to classify and locate imbalanced objects employed in hierarchical heads. Without bells and whistles, DCANet can be used in any two-stage detectors. Extensive experiments are carried out on the two widely used traffic detection datasets, CityCam and VisDrone, and DCANet reports new state-of-the-art scores on the CityCam.
Citations: 1
Generating Adaptive Targeted Adversarial Examples for Content-Based Image Retrieval
Pub Date : 2022-07-18 DOI: 10.1109/IJCNN55064.2022.9892178
Jiameng Pan, Xiaoguang Zhu, Peilin Liu
The massive amount of accessible personal data on the Internet raises the risk of malicious retrieval. In this paper, we propose to conceal images through targeted adversarial attacks on content-based image retrieval. An imperceptible perturbation is added to the original image to generate adversarial examples, making the retrieval results similar to those for the target image even though the two images look completely different. Previous work on targeted attacks for image retrieval only introduces a target-specific model and needs to retrain the model for each new target. We extend the attack's adaptability by exploiting the target images as conditional input for the generative model. The proposed Adaptive Targeted Attack Generative Adversarial Network (ATA-GAN) is a GAN-based model with a generator and a discriminator. The generator extracts the features of the origin and the target, then uses the Feature Integration Module to explore the relation between the target and the original image, ignoring the origin features while paying more attention to the target. Simultaneously, the discriminator distinguishes realness and ensures the adversarial example is similar to the original. We evaluate and analyze the performance of the adaptive targeted attack on popular retrieval benchmarks.
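For intuition, the attack objective here, perturbing an image so a retrieval model's features match a target's while the perturbation stays imperceptible, can be illustrated with a simple iterative PGD-style stand-in rather than the paper's GAN; the function name, step sizes, and L-infinity budget below are all assumptions.

```python
import torch

def targeted_retrieval_attack(model, x, x_target, eps=8/255, steps=40, lr=1/255):
    """model maps images to retrieval features; x, x_target: (1, 3, H, W) in [0, 1]."""
    with torch.no_grad():
        f_target = model(x_target)                   # target's feature vector
    x_adv = x.clone()
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = torch.nn.functional.mse_loss(model(x_adv), f_target)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() - lr * grad.sign()    # step toward target features
        x_adv = x + (x_adv - x).clamp(-eps, eps)     # keep perturbation imperceptible
        x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv
```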
Citations: 1
CoVal-SGAN: A Complex-Valued Spectral GAN architecture for the effective audio data augmentation in construction sites
Pub Date : 2022-07-18 DOI: 10.1109/IJCNN55064.2022.9891915
M. Scarpiniti, Cristiano Mauri, D. Comminiello, A. Uncini, Yong-Cheol Lee
Generative audio data augmentation on construction sites is a challenging research area due to the high dissimilarity between the work sounds of the machines and equipment involved. However, it is necessary, since audio data for critical work classes is often scarce. Motivated by these considerations and demands, in this paper we propose a complex-valued GAN architecture working on the audio spectrogram, named CoVal-SGAN, for effective augmentation of audio data. Specifically, the proposed CoVal-SGAN exploits both the magnitude and phase information to improve the quality of the artificially generated audio signals and increase the overall performance of the underlying classifier. Numerical results on data recorded at real-world construction sites, along with comparisons with available state-of-the-art approaches, show the effectiveness of the proposed idea through improved accuracy.
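The complex-valued spectrogram input that preserves both magnitude and phase can be sketched as below; stacking the real and imaginary STFT parts as channels is one common encoding and an assumption here, not necessarily CoVal-SGAN's exact layout.

```python
import numpy as np
import librosa

# Build a complex STFT and keep magnitude and phase, as opposed to the
# usual magnitude-only spectrogram. The test tone and FFT parameters are
# illustrative choices.
sr = 22050
t = np.linspace(0, 1, sr, endpoint=False)
y = np.sin(2 * np.pi * 440 * t).astype(np.float32)   # stand-in 440 Hz tone

stft = librosa.stft(y, n_fft=1024, hop_length=256)   # complex array (freq, time)
magnitude, phase = np.abs(stft), np.angle(stft)      # both parts retained
complex_input = np.stack([stft.real, stft.imag])     # (2, freq, time) tensor
```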
Citations: 0
An Information Geometric Perspective to Adversarial Attacks and Defenses
Pub Date : 2022-07-18 DOI: 10.1109/IJCNN55064.2022.9892170
Kyle Naddeo, N. Bouaynaya, R. Shterenberg
Deep learning models have achieved state-of-the-art accuracy in complex tasks, sometimes outperforming human-level accuracy. Yet, they suffer from vulnerabilities known as adversarial attacks: imperceptible input perturbations that fool the models on inputs that were originally classified correctly. The adversarial problem remains poorly understood and is commonly thought to be an inherent weakness of deep learning models. We argue that understanding and alleviating the adversarial phenomenon may require us to go beyond the Euclidean view and consider the relationship between the input and output spaces as a statistical manifold with the Fisher Information as its Riemannian metric. Under this information geometric view, the optimal attack is constructed as the direction corresponding to the highest eigenvalue of the Fisher Information Matrix, called the Fisher spectral attack. We show that an orthogonal transformation of the data cleverly alters its manifold by keeping the highest eigenvalue but changing the optimal direction of attack, thus deceiving the attacker into adopting the wrong direction. We demonstrate the defensive capabilities of the proposed orthogonal scheme against the Fisher spectral attack and the popular fast gradient sign method on standard networks (e.g., LeNet and MobileNetV2) and benchmark data sets (MNIST and CIFAR-10).
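A worked sketch of the attack direction defined above: for a classifier with softmax output, the input-space Fisher Information Matrix is F = sum_y p_y * grad_x log p(y|x) * grad_x log p(y|x)^T, and the attack follows its top eigenvector. The dense construction below is an illustration that only scales to small inputs; a power-iteration variant would scale better.

```python
import torch

def fisher_spectral_direction(model, x):
    """x: (1, d) flattened input; model returns class logits of shape (1, C)."""
    x = x.clone().requires_grad_(True)
    log_probs = torch.log_softmax(model(x), dim=-1)[0]
    probs = log_probs.exp().detach()
    d = x.numel()
    F = torch.zeros(d, d)
    for y in range(log_probs.numel()):
        g, = torch.autograd.grad(log_probs[y], x, retain_graph=True)
        g = g.flatten()
        F += probs[y] * torch.outer(g, g)     # accumulate the expected outer product
    eigvals, eigvecs = torch.linalg.eigh(F)   # ascending eigenvalues
    return eigvecs[:, -1]                     # direction of the largest eigenvalue
```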
Citations: 0
A Bio-inspired Dark Adaptation Framework for Low-light Image Enhancement
Pub Date : 2022-07-18 DOI: 10.1109/IJCNN55064.2022.9892877
Fang Lei
In low-light conditions, image enhancement is critical for vision-based artificial systems, since details of objects in dark regions are buried. Moreover, enhancing a low-light image without introducing too many irrelevant artifacts is important for visual tasks like motion detection. However, conventional methods always carry the risk of “bad” enhancement. Nocturnal insects show remarkable visual abilities at night, and their adaptations in light response provide inspiration for low-light image enhancement. In this paper, we adopt the neural mechanism of dark adaptation to adaptively raise intensities whilst preserving naturalness. We propose a framework for enhancing low-light images by applying the dark adaptation operation, with proper adaptation parameters, to the R, G, and B channels separately. Specifically, the dark adaptation in this paper consists of a series of canonical neural computations, including power-law adaptation, divisive normalization, and adaptive rescaling. Experiments show that the proposed bio-inspired dark adaptation framework is more efficient and better preserves the naturalness of the image compared to existing methods.
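As a very rough sketch of the canonical computations listed above, the function below applies power-law adaptation followed by divisive normalization and rescaling to each RGB channel; the exponent and normalization pool are crude assumptions, not the paper's fitted adaptation parameters.

```python
import numpy as np

def dark_adapt(img, gamma=0.5, sigma=0.1):
    """img: float32 array in [0, 1], shape (H, W, 3)."""
    out = np.empty_like(img)
    for c in range(3):                            # adapt each channel separately
        ch = img[..., c] ** gamma                 # power-law intensity raising
        pool = ch.mean() + sigma                  # simple global normalization pool
        out[..., c] = ch / pool                   # divisive normalization
    peak = max(out.max(), 1e-8)
    return np.clip(out / peak, 0.0, 1.0)          # adaptive rescaling to [0, 1]
```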
Citations: 0