Pub Date: 2022-07-18; DOI: 10.1109/IJCNN55064.2022.9891993
Sanjukta Krishnagopal, J. Bedrossian
While variational autoencoders have been successful in several tasks, conventional priors are limited in their ability to encode the underlying structure of input data. We introduce an Encoded Prior Sliced Wasserstein AutoEncoder in which an additional prior-encoder network learns a geometry- and topology-preserving embedding of any data manifold, thus improving the structure of the latent space. The autoencoder and prior-encoder networks are trained iteratively using the Sliced Wasserstein distance, which facilitates the learning of nonstandard, complex priors. We then introduce a graph-based algorithm that explores the learned manifold by traversing latent space along network-geodesics, which lie along the manifold and are hence more realistic than conventional Euclidean interpolation. Specifically, we identify network-geodesics by maximizing the density of samples along the path while minimizing total energy. We use 3D-spiral data to show that the prior encodes the geometry underlying the data, unlike conventional autoencoders, and to demonstrate the exploration of the embedded data manifold through the network algorithm. We apply our framework to artificial as well as image datasets to demonstrate the advantages of learning improved latent structure, outlier generation, and geodesic interpolation.
Title: Preserving Data Manifold Structure in Latent Space for Exploration through Network Geodesics. Venue: 2022 International Joint Conference on Neural Networks (IJCNN).
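The Sliced Wasserstein distance mentioned in the abstract can be illustrated with a small Monte Carlo estimate. This is a minimal sketch, not the authors' training code: the function name `sliced_wasserstein`, the number of projections, and the requirement that both sample sets have the same size are assumptions made for illustration.

```python
import numpy as np

def sliced_wasserstein(x, y, n_projections=64, seed=0):
    """Monte Carlo estimate of the sliced 2-Wasserstein distance.

    x, y: arrays of shape (n_samples, dim) with the SAME n_samples
    (an assumption of this sketch; it keeps the 1D matching a plain sort).
    """
    rng = np.random.default_rng(seed)
    dim = x.shape[1]
    # Random directions on the unit sphere.
    theta = rng.normal(size=(n_projections, dim))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both sample sets onto every direction and sort each 1D slice.
    # In 1D, optimal transport between equal-size sets pairs sorted samples.
    px = np.sort(x @ theta.T, axis=0)
    py = np.sort(y @ theta.T, axis=0)
    return float(np.sqrt(np.mean((px - py) ** 2)))

rng = np.random.default_rng(1)
latent_codes = rng.normal(size=(50, 3))
prior_samples = latent_codes + 3.0  # a shifted copy, so the distance is nonzero
print(sliced_wasserstein(latent_codes, prior_samples))
```

Because each slice reduces to sorting, the estimate costs only a sort per projection, which is one reason sliced distances are convenient for matching a latent distribution to a prior that has no closed form.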
Pub Date: 2022-07-18; DOI: 10.1109/IJCNN55064.2022.9892930
Fousiya Saleem, Mohammad Hamdan, A. Zalzala
This paper reports on using qualitative data analysis to understand aspects of the well-being of dwellers in underserved communities, by applying machine learning algorithms to identify specific themes from unstructured interview data. The work involved data translation, transcription, pre-processing as well as developing Word2Vec and FastText algorithms and ultimately a combined analysis engine. The reported experiments are conducted on field data captured from communities in India, hence offering a unique opportunity to examine automated context-based qualitative data analysis. The approach is proven feasible despite the dominant limitations on technology infrastructure and community awareness. The machine learning results identify themes from the interview data within minutes as opposed to hours of manual investigations through conventional qualitative analysis techniques. The outcomes from the analysis engine can be used for creating a grounded theory for further studies, hence facilitating an evidence-based approach to the evaluation of underserved communities.
Title: Towards Well-Being Management with Automated Qualitative Data Analysis. Venue: 2022 International Joint Conference on Neural Networks (IJCNN).
Pub Date: 2022-07-18; DOI: 10.1109/IJCNN55064.2022.9892193
Yunru Bai, C. Yuan
Image dehazing remains a challenging problem because it is hard to restore a clean scene from a severely degraded hazy image. However, existing learning-based dehazing methods mostly ignore the fact that the interference of haze to an image is mainly concentrated in the low-frequency components. If all image components are processed indiscriminately, it is difficult to achieve a good restoration and accurate details cannot be guaranteed. In order to process the hazy images hierarchically, we propose a low-frequency sub-band contrastive regularization (LSCR) in the wavelet domain to ensure that the components of the restored image mainly affected by haze are pulled closer to the clear image and pushed far away from the hazy image. In addition, a high-frequency sub-band loss is also introduced to make high-frequency components of the restored image consistent with the clear image. Our method can better restore the haze-free image and achieve more accurate and rich details. The extensive experiments on synthetic and real-world datasets verify that the proposed method outperforms previous approaches.
Title: Contrastive Learning in Wavelet Domain for Image Dehazing. Venue: 2022 International Joint Conference on Neural Networks (IJCNN).
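The idea of treating low- and high-frequency sub-bands differently can be sketched with a one-level Haar transform. The abstract does not give the loss formulas, so everything below is an assumption for illustration: the Haar wavelet, the L1 distances, the ratio form of the contrastive term, and the names `haar_dwt2`, `low_freq_contrastive`, and `high_freq_loss`.

```python
import numpy as np

def haar_dwt2(img):
    """One-level orthonormal 2D Haar transform (both image sides must be even)."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    low = (a + b + c + d) / 2.0              # low-frequency sub-band
    details = ((a - b + c - d) / 2.0,        # three high-frequency sub-bands
               (a + b - c - d) / 2.0,
               (a - b - c + d) / 2.0)
    return low, details

def low_freq_contrastive(restored, clear, hazy, eps=1e-8):
    """Assumed ratio form: small when the restored low band is near the
    clear image (positive) and far from the hazy image (negative)."""
    r, _ = haar_dwt2(restored)
    c, _ = haar_dwt2(clear)
    h, _ = haar_dwt2(hazy)
    return float(np.abs(r - c).mean() / (np.abs(r - h).mean() + eps))

def high_freq_loss(restored, clear):
    """Assumed L1 consistency between high-frequency sub-bands."""
    _, dr = haar_dwt2(restored)
    _, dc = haar_dwt2(clear)
    return float(sum(np.abs(x - y).mean() for x, y in zip(dr, dc)))

clear = np.arange(16, dtype=float).reshape(4, 4)
hazy = 0.5 * clear + 3.0  # toy haze: reduced contrast plus an offset
print(low_freq_contrastive(clear, clear, hazy), high_freq_loss(clear, clear))
```

In a real training loop these terms would be computed on network outputs with a differentiable wavelet transform; the NumPy version only shows how the sub-bands are separated and compared.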
Pub Date: 2022-07-18; DOI: 10.1109/IJCNN55064.2022.9892778
Takashi Nagata, Jinwei Xing, Tsutomu Kumazawa, E. Neftci
Model-based reinforcement learning is an effective approach to reducing sample complexity by adding more data from the model. Dyna is a well-known architecture that incorporates model-based reinforcement learning and integrates learning from interactions with an environment and with a model of the environment. Although the model can greatly speed up the agent's learning, acquiring an accurate model is a hard problem despite the recent success of function approximation using neural networks. An inaccurate model degrades the agent's performance and raises another question: to what extent should an agent rely on the model to update its policy? In this paper, we propose to incorporate the confidence of the model simulations into the integrated learning process so that the agent avoids updating its policy based on uncertain simulations by the model. To obtain confidence, we apply the Monte Carlo dropout technique to the state transition model. We show that this approach improves early-stage training, thus helping the agent reach reasonable performance sooner. We conduct experiments on simulated robotic locomotion tasks to demonstrate the effectiveness of our approach.
Title: Uncertainty Aware Model Integration on Reinforcement Learning. Venue: 2022 International Joint Conference on Neural Networks (IJCNN).
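Monte Carlo dropout, as named in the abstract, estimates uncertainty by running several stochastic forward passes with dropout left on and measuring how much the predictions disagree. The sketch below assumes a tiny two-layer transition model with hand-set weights and a hypothetical variance threshold for accepting simulated transitions; neither is taken from the paper.

```python
import numpy as np

def mc_dropout_predict(state_action, w1, w2, drop_rate=0.2, n_samples=30, seed=0):
    """Return (mean next-state prediction, mean predictive variance)."""
    rng = np.random.default_rng(seed)
    hidden = np.maximum(state_action @ w1, 0.0)  # ReLU hidden layer
    keep = 1.0 - drop_rate
    preds = []
    for _ in range(n_samples):
        # A fresh dropout mask per pass; rescale to keep the expected activation.
        mask = rng.random(hidden.shape) < keep
        preds.append((hidden * mask / keep) @ w2)
    preds = np.stack(preds)
    return preds.mean(axis=0), float(preds.var(axis=0).mean())

def use_simulated_transition(variance, threshold=0.5):
    """Hypothetical gate: train on a simulated transition only when the model is confident."""
    return variance <= threshold

w1 = np.ones((3, 4))   # toy weights, for illustration only
w2 = np.ones((4, 2))
state_action = np.ones(3)
mean, var = mc_dropout_predict(state_action, w1, w2, drop_rate=0.5)
print(mean.shape, var, use_simulated_transition(var))
```

A hard threshold is only one way to use the variance; weighting the update by confidence would be another option under the same idea.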
Pub Date: 2022-07-18; DOI: 10.1109/IJCNN55064.2022.9892869
Jin Cao, Zhong Qian, Peifeng Li
Event factuality describes the real-world status of events in text. Event Factuality Identification (EFI) is a basic task underlying many applications in natural language processing. At present, most EFI studies rely on annotated event mentions, which is impractical in real applications, and they ignore the opinions of different event sources on event factuality. Moreover, previous work did not use cross-lingual information for EFI. We propose an end-to-end joint model, JESF, which works in four steps. First, it uses BERT to encode sentences, enriches their semantic representation with linguistic features, and uses a BiLSTM to capture sequential semantic features. Second, multi-head attention learns event characteristics and identifies event mentions. Third, multi-head attention identifies the event source. Finally, GCNs capture syntactic and semantic features, multi-head attention captures the semantic features of sentences, and the event and event source features are integrated to identify event factuality. In particular, we use different cross-lingual methods to learn supplementary semantic features from aligned Chinese sentences. The experimental results on FactBank show that JESF is effective and that Chinese information is helpful for English EFI, with Chinese cues as features being the most effective method.
Title: End-to-End Event Factuality Identification with Cross-Lingual Information. Venue: 2022 International Joint Conference on Neural Networks (IJCNN).
Pub Date: 2022-07-18; DOI: 10.1109/IJCNN55064.2022.9892106
Yulan Su, Yu Hong, Hongyu Zhu, Minhan Xu, Yifan Fan, Min Zhang
We propose a novel evaluation method for the Question Generation (QG) task. It is designed to verify the quality of the generated questions against different references, including not only the manually written questions (i.e., ground truth) but also their variants. Back translation is used to obtain the variants, so they generally appear as paraphrases of the ground-truth examples. In particular, an Asymmetrical Twin Gain (ATG) is proposed for binary-perspective evaluation using existing metrics such as BLEU and ROUGE-L. It enables both metrics to be observed from two perspectives: the consistency between QG results and ground-truth examples, and the consistency between QG results and the variants. The experiments on the publicly available benchmark SQuAD demonstrate the reliability of ATG. More importantly, ATG is shown to be effective for indicating stable QG performance. It is noteworthy that the proposed binary-perspective evaluation is intended to assist conventional evaluation methods rather than replace them. The contribution is the additional insight into the robustness of QG when slightly different references (e.g., paraphrases) are offered for evaluation. All the models and source code used in the experiments will be made publicly available to support reproducible research.
Title: Binary-perspective Asymmetrical Twin Gain: a Novel Evaluation Method for Question Generation. Venue: 2022 International Joint Conference on Neural Networks (IJCNN).
Pub Date: 2022-07-18; DOI: 10.1109/IJCNN55064.2022.9892123
Ayça Deniz, Hakan Ezgi Kiziloz
The quality of features is one of the main factors that affect classification performance. Feature selection aims to remove irrelevant and redundant features from data in order to increase classification accuracy. However, identifying these features is not a trivial task due to a large search space. Evolutionary algorithms have been proven to be effective in many optimization problems, including feature selection. These algorithms require an initial population to start their search mechanism, and a poor initial population may cause getting stuck in local optima. Diversifying the initial population is known as an effective approach to overcome this issue; yet, it may not suffice as the search space grows exponentially with increasing feature sizes. In this study, we propose an enhanced initial population strategy to boost the performance of the feature selection task. In our proposed method, we ensure the diversity of the initial population by partitioning the candidate solutions according to their selected number of features. In addition, we adjust the chances of features being selected into a candidate solution regarding their information gain values, which enables wise selection of features among a vast search space. We conduct extensive experiments on many benchmark datasets retrieved from UCI Machine Learning Repository. Moreover, we apply our algorithm on a real-world, large-scale dataset, i.e., Stanford Sentiment Treebank. We observe significant improvements after the comparisons with three off-the-shelf initialization strategies.
Title: Boosting Initial Population in Multiobjective Feature Selection with Knowledge-Based Partitioning. Venue: 2022 International Joint Conference on Neural Networks (IJCNN).
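The two ingredients described in the abstract, partitioning candidates by their number of selected features and biasing selection by information gain, can be sketched as follows. This is an illustrative reading rather than the authors' algorithm: the function name `init_population`, the equal-width partition boundaries, and sampling with probability proportional to information gain are all assumptions.

```python
import numpy as np

def init_population(info_gain, pop_size, n_partitions, seed=0):
    """Build a boolean population of shape (pop_size, n_features).

    Candidate i belongs to partition i % n_partitions, which fixes the range
    its feature count is drawn from; features are then sampled without
    replacement with probability proportional to their information gain.
    """
    rng = np.random.default_rng(seed)
    n = len(info_gain)
    p = np.asarray(info_gain, dtype=float) + 1e-9  # keep every feature selectable
    p /= p.sum()
    # Equal-width size ranges between 1 and n (an assumed partitioning rule).
    edges = np.linspace(1, n, n_partitions + 1).astype(int)
    population = np.zeros((pop_size, n), dtype=bool)
    for i in range(pop_size):
        part = i % n_partitions
        k = rng.integers(edges[part], edges[part + 1] + 1)  # upper bound inclusive
        chosen = rng.choice(n, size=k, replace=False, p=p)
        population[i, chosen] = True
    return population

gains = np.array([0.9, 0.1, 0.4, 0.0, 0.7, 0.2, 0.3, 0.05, 0.6, 0.15])  # made-up scores
pop = init_population(gains, pop_size=6, n_partitions=3)
print(pop.sum(axis=1))  # number of selected features per candidate
```

Spreading candidates across size ranges gives the evolutionary search both small and large subsets to start from, while the information-gain weighting makes informative features more likely to appear in each of them.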
Pub Date: 2022-07-18; DOI: 10.1109/IJCNN55064.2022.9892568
Kodai Kaneko, N. Kubota
In recent years, advances in information and communication technology have led to the research and development of cyber-physical systems and digital twins that simulate, in cyberspace, various events occurring in real space. In the field of human flow simulation, it is possible to predict and analyze the flow of people in various spaces, from indoor to outdoor, and to use this information to create spaces that promote human activities, for example by searching for optimal layouts or alleviating congestion by distributing traffic lines. In this paper, we propose to use multi-scopic simulation to simulate human flow. Next, using the human flow data measured by the simulation, we extract and analyze features by topological mapping. Finally, we discuss the effectiveness of the proposed method through some simulation results.
Title: Multi-Scopic Simulation for People Flow Feature Extraction Based on Topological Mapping. Venue: 2022 International Joint Conference on Neural Networks (IJCNN).
Human interaction recognition has wide applications, including intelligent surveillance, intelligent transportation, and the analysis of sports videos. In recent years, benefiting from the development of action recognition based on deep learning, the performance of human interaction recognition has improved. This paper tackles two vital issues in recognizing human interactions, namely missing targets and inadequate feature expression. To this end, we first design a data preprocessing method using skeleton estimation and multi-object tracking, which effectively reduces the chance of missed detections. Second, we propose a two-stream network composed of an appearance branch and a pose branch. The appearance branch extracts features enhanced via part affinity maps and part confidence maps, while the pose branch trains a customized Shift-GCN to extract skeletal features from pairs of people. Appearance and pose features are then fused to generate a more powerful representation of human interactions. Extensive experiments on two existing benchmarks, UT and BIT-Interaction, as well as a new dataset crafted by us, namely Campus-Interaction (CI), demonstrate the superior performance of the proposed approach over the state of the art.
Title: Human Interaction Recognition with Skeletal Attention and Shift Graph Convolution. Authors: Jin Zhou, Zhenhua Wang, Jiajun Meng, Sheng Liu, Jianhua Zhang, Shengyong Chen. Pub Date: 2022-07-18; DOI: 10.1109/IJCNN55064.2022.9892292. Venue: 2022 International Joint Conference on Neural Networks (IJCNN).
Pub Date: 2022-07-18; DOI: 10.1109/IJCNN55064.2022.9892720
Junwei Zhang, Zhao Li, Hao Peng, Ming Li, Xiaofen Wang
Neural Networks (NNs) are widely used because of their superior feature extraction capabilities, among which Feedforward Neural Network (FNN) is used as the basic model for theoretical research. Recently, Quantum Neural Networks (QNNs) based on quantum mechanics have received extensive attention due to their ability to mine quantum correlations and parallel computing. Since two classical bits are required to simulate one qubit (i.e., quantum bit) on a classical computer, it brings challenges for simulating complex quantum operations or building large-scale QNNs on a classical computer. Hardy et al. extended the classical and quantum probability theories to the Generalized Probability Theory (GPT), so it is possible to construct high-order quantum systems. This paper regards the entire feature extraction and integration process of FNN as the evolution process of the high-order quantum system, and then leverages quantum coherence to describe the complex relationship between the features extracted by each layer of the network model. Intuitively, we reconstruct FNN to change the general vector processed by each layer into the state vector of the high-order quantum system. The experimental results on four mainstream datasets show that FNN reconstructed from the high-order quantum system is significantly better than the classical counterpart.
Title: Feedforward Neural Network Reconstructed from High-order Quantum Systems. Venue: 2022 International Joint Conference on Neural Networks (IJCNN).