Pub Date: 2021-12-01 | DOI: 10.1109/ICKG52313.2021.00068
Shigang Yang, Yongguo Liu
Text classification is an important task in natural language processing. Unlike English, Chinese text has two representations, character-level and word-level: the former carries abundant connotations, while the latter conveys specific meanings. Existing studies often simply concatenate the two levels of features with little processing and fail to explore the affiliation relationship between Chinese characters and words. In this paper, we propose a character-word graph attention network (CW-GAT) to exploit the interactive information between characters and words for Chinese text classification. A graph attention network is adopted to capture the context of sentences and the interaction between characters and words. Extensive experiments on six real Chinese text datasets show that the proposed model outperforms the latest baseline methods.
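As a rough illustration of the graph-attention aggregation the abstract refers to (a minimal single-head sketch, not the authors' CW-GAT implementation; all names and shapes are illustrative assumptions):

```python
import numpy as np

def gat_layer(H, A, W, a_src, a_dst):
    """Single-head graph attention layer over node features.

    H: (n, f) node features (e.g. character and word nodes);
    A: (n, n) 0/1 adjacency with self-loops; W: (f, f') projection;
    a_src, a_dst: (f',) attention parameter vectors.
    """
    Z = H @ W                                    # project node features
    # unnormalised logits e_ij = LeakyReLU(a_src . z_i + a_dst . z_j)
    e = np.add.outer(Z @ a_src, Z @ a_dst)
    e = np.where(e > 0, e, 0.2 * e)              # LeakyReLU, slope 0.2
    e = np.where(A > 0, e, -1e9)                 # mask non-edges
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha = alpha / alpha.sum(axis=1, keepdims=True)  # softmax per node
    return alpha @ Z                             # attention-weighted aggregation
```

With zero attention vectors the weights become uniform over neighbours, so the layer reduces to neighbourhood averaging, which is a handy sanity check.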
Title: "A Character-Word Graph Attention Networks for Chinese Text Classification" (2021 IEEE International Conference on Big Knowledge (ICBK))
Pub Date: 2021-12-01 | DOI: 10.1109/ICKG52313.2021.00046
Wenwu Zhi, Yuhong Zhang
Bilingual lexicon induction (BLI) transfers knowledge from well-resourced to under-resourced languages and has been widely applied to various NLP tasks. Recent work on BLI is projection-based: it learns a mapping that connects the source and target embedding spaces under an isomorphism assumption. Unfortunately, this assumption does not hold in general, especially for typologically distant language pairs. Moreover, without supervised signals to guide it, training further complicates BLI, leaving the performance of unsupervised methods unsatisfactory. To relax the isomorphism restriction, we propose a semi-supervised method for distant BLI tasks: a Semi-supervised Bilingual Lexicon Induction method in Latent Space based on a Bidirectional Adversarial Model. First, two latent spaces are learned independently by two autoencoders for the source and target domains, weakening the isomorphism constraint on the embedding spaces. Then a small seed dictionary is used to learn an initial mapping connecting the latent spaces. Finally, starting from the initial mapping, cycle-consistency is combined with a distance constraint to keep the geometric structure of both embedding spaces stable while the bidirectional mapping is learned adversarially. Extensive experiments show that our method achieves state-of-the-art results on most language pairs, with especially significant improvements on distant language pairs.
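The cycle-consistency and distance-constraint terms mentioned above can be sketched as plain losses (a generic illustration under assumed shapes, not the paper's exact objective):

```python
import numpy as np

def cycle_consistency_loss(X, W_xy, W_yx):
    """Penalise failure to map source embeddings X to the target
    latent space (W_xy) and back (W_yx) onto themselves."""
    return np.mean((X @ W_xy @ W_yx - X) ** 2)

def distance_constraint_loss(X, W_xy):
    """Penalise distortion of pairwise distances under the mapping,
    keeping the geometry of the embedding space stable."""
    def pdist(M):
        d = M[:, None, :] - M[None, :, :]
        return np.sqrt((d ** 2).sum(-1))
    return np.mean((pdist(X) - pdist(X @ W_xy)) ** 2)
```

Both losses vanish for an identity (perfectly invertible, isometric) mapping, which is the degenerate sanity case.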
Title: "A Semi-supervised Bilingual Lexicon Induction Method for Distant Language Pairs Based on Bidirectional Adversarial Model"
Pub Date: 2021-12-01 | DOI: 10.1109/ICKG52313.2021.00058
Bingke Xu, Yue Cui, Zipeng Sun, Liwei Deng, Kai Zheng
Knowledge-aware recommendation systems have attracted considerable interest in academia and industry, as they help solve the cold-start problem and offer a reliable foundation for business growth. It is particularly important to consider fairness when designing and deploying such systems. However, we find that even when sensitive information is not explicitly introduced into a knowledge graph (KG), it can be implicitly learned by a recommender and thus lead to unfairness. Most existing debiasing methods require sophisticated model design or apply only to specific base models. In this paper, to address these problems, we propose a method that ensures the fairness of any knowledge-aware recommendation model by introducing a sensitivity graph. Unlike most previous studies, which handle only a single protected attribute, we also make our method flexible to different combinations of fairness constraints during inference. Specifically, given a knowledge-based recommendation model, we first construct a sensitivity graph by taking protected attributes as nodes and dynamically learned relations between pairs of attributes as edges. We then merge the sensitivity graph into the original knowledge graph and introduce an adversarial framework that enhances the fairness criterion by extracting users' sensitive information from the original KG during graph representation learning, without changing the KG-based recommendation model. Extensive experimental results on two public real-world datasets show that the proposed framework achieves state-of-the-art performance in improving the fairness of any KG-based recommendation model while causing only a trivial decline in overall accuracy.
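Structurally, the sensitivity-graph merge described above amounts to extending the KG's triple set with attribute-attribute edges. A hedged sketch (the data layout and function name are assumptions for illustration, not the paper's code):

```python
def merge_sensitivity_graph(kg_edges, protected_attrs, attr_relations):
    """Merge a sensitivity graph into a KG edge list.

    kg_edges: list of (head, relation, tail) triples;
    protected_attrs: set of protected-attribute node ids;
    attr_relations: {(attr_a, attr_b): relation_name} learned between
    pairs of protected attributes.
    """
    merged = list(kg_edges)  # keep the original KG untouched
    for (a, b), rel in attr_relations.items():
        if a in protected_attrs and b in protected_attrs:
            merged.append((a, rel, b))  # add sensitivity edge as a triple
    return merged
```

The adversarial training that then removes sensitive signal from the learned representations is a separate optimisation step, not shown here.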
Title: "Fair Representation Learning in Knowledge-aware Recommendation"
Pub Date: 2021-12-01 | DOI: 10.1109/ICKG52313.2021.00065
Jiachen Du, Yang Gao
Text summarization creates a short version of a document while preserving its main content. In the age of information explosion, obtaining the content users care about from a large amount of information becomes particularly significant. Under these circumstances, query-focused abstractive summarization (QFS) grows more prominent, since it can focus on user needs while generating fluent, concise paraphrased summaries. However, unlike generic summarization, which has achieved remarkable results driven by large-scale parallel data, QFS suffers from a lack of parallel corpora. To address this issue, we migrate large-scale generic summarization datasets into query-focused datasets while preserving the informative summaries. Based on the synthetic queries and data, we propose a new model, SQAS, which extracts fine-grained factual information with respect to a specific question and takes reasoning information into account by understanding the source document through a question-answering model. Given the extracted content, the summary generator not only produces semantically relevant content but also yields fluent, readable sentences thanks to the language generation capability of a pre-trained language model. Experimental results on both generic and query-focused summarization datasets demonstrate the effectiveness of our model in terms of automatic ROUGE metrics and case studies.
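The query-focused extraction step can be caricatured with a word-overlap ranker (a crude, hypothetical stand-in for the paper's question-answering extractor, useful only to show the data flow):

```python
def extract_relevant(doc_sentences, query):
    """Rank document sentences by bag-of-words overlap with the query.

    A real QFS system would use a trained QA model here; overlap is
    just the simplest possible relevance proxy.
    """
    q = set(query.lower().split())
    return sorted(doc_sentences,
                  key=lambda s: len(q & set(s.lower().split())),
                  reverse=True)
```

The top-ranked sentences would then be handed to an abstractive generator, which is the part a pre-trained language model supplies.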
Title: "Query-focused Abstractive Summarization via Question-answering Model"
Pub Date: 2021-12-01 | DOI: 10.1109/ICKG52313.2021.00052
Zhichao Duan, Xiuxing Li, Zhengyan Zhang, Zhenyu Li, Ning Liu, Jianyong Wang
Question Answering (QA) is the task of automatically answering questions posed by humans in natural language. A question can be answered in different settings, such as abstractive, extractive, boolean, and multiple-choice QA. As a popular topic in natural language processing, extractive QA has gained extensive attention in the past few years. Generalized cross-lingual transfer (G-XLT), where the question and the answer context are in different languages, poses unique challenges beyond cross-lingual transfer (XLT), where they are in the same language. Driven by the development of related benchmarks, much work has been done to improve performance on QA tasks across languages; however, only a few works are dedicated to G-XLT. In this work, we propose a generalized cross-lingual transfer framework that enhances the model's ability to understand different languages. Specifically, we first assemble triples from different languages to form multilingual knowledge. Since the lack of cross-lingual knowledge greatly limits a model's reasoning ability, we further design a knowledge injection strategy that leverages link prediction techniques to enrich the model's store of multilingual knowledge. In this way, we can fully exploit rich semantic knowledge. Experimental results on the real-world MLQA dataset demonstrate that the proposed method improves performance by a large margin, outperforming the baseline by 13.18%/12.00% F1/EM on average.
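Link prediction over knowledge triples, as used in the injection strategy above, is commonly scored translation-style. A minimal sketch of the standard TransE score (a generic technique; the paper does not necessarily use TransE itself):

```python
import numpy as np

def transe_score(h, r, t):
    """TransE link-prediction score for a triple (h, r, t):
    a smaller ||h + r - t|| means the triple is more plausible,
    since relations are modelled as translations in embedding space."""
    return np.linalg.norm(h + r - t)
```

Plausible missing triples (low scores) can then be injected as extra multilingual knowledge.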
Title: "Bridging the Language Gap: Knowledge Injected Multilingual Question Answering"
Pub Date: 2021-12-01 | DOI: 10.1109/ickg52313.2021.00009
Allen Bundy
Title: "ICBK 2021 Keynote Abstracts"
Pub Date: 2021-12-01 | DOI: 10.1109/ICKG52313.2021.00061
Hongwei Zeng, Zhenjie Hong, J. Liu, Bifan Wei
Multi-turn dialogue generation aims to generate natural and fluent responses that are consistent with a history of multiple consecutive utterances. It is more challenging than its single-turn counterpart, since the model must capture the topic drift across the multi-turn dialogue history. In this paper, we propose a multi-turn dialogue generation model that incorporates topic-drift-aware information into a hierarchical encoder-decoder framework to generate coherent responses. The model first uses a Convolutional Neural Network (CNN) based topic model to obtain the topic representation of each utterance. A topic drift model then encodes the sequence of topics in the dialogue history to infer the topic of the response. During response generation, a specially designed topic-drift-aware generator dynamically balances the influence of the inferred response topic and local word structure. Furthermore, we employ multi-task learning to optimize the topic drift model and dialogue generation simultaneously. Extensive experimental results on two benchmark datasets (the Cornell Movie Dialog Corpus and the Ubuntu Dialogue Dataset) indicate that our model generates more coherent responses and significantly outperforms other dialogue generation models.
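The idea of inferring a response topic from a drifting history can be sketched with a simple exponentially weighted recurrence (an illustrative assumption; the paper's topic drift model is a learned sequence encoder, not this fixed rule):

```python
import numpy as np

def infer_response_topic(utterance_topics, decay=0.5):
    """Fold the per-utterance topic vectors of the history into one
    inferred response topic, letting recent utterances dominate."""
    topic = np.asarray(utterance_topics[0], dtype=float)
    for t in utterance_topics[1:]:
        # blend the running topic with the newest utterance's topic
        topic = decay * topic + (1 - decay) * np.asarray(t, dtype=float)
    return topic
```

With `decay=0.5`, a history whose topic jumps from one subject to another yields a topic vector halfway between them, which is the drift behaviour a generator must then balance against local word structure.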
Title: "Multi-task Learning for Multi-turn Dialogue Generation with Topic Drift Modeling"
Pub Date: 2021-12-01 | DOI: 10.1109/ICKG52313.2021.00021
Gui-Lin Li, Heng-Ru Zhang, Yuan-Yuan Xu, Yaoyao Lv, Fan Min
Label distribution learning (LDL) is a learning paradigm that predicts the degree to which each of multiple labels describes an instance. Existing algorithms use all features to predict the label distribution. However, each label is often related to only part of the features, so considering irrelevant features may introduce deviations in both instance search and model prediction. In this paper, we propose a new LDL algorithm that exploits the local correlation between features and labels (LDL-LCFL). The main idea is to exploit local feature-label correlations, which are then used in an improved kNN algorithm for prediction. Experiments were conducted on eight well-known label distribution datasets with four distance measures and two similarity measures. Results show that, compared with nine popular LDL methods, our algorithm achieves a superior prediction ranking.
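A baseline kNN predictor for label distributions, with an optional feature mask standing in for the label-relevant feature selection the abstract describes (a hedged sketch, not LDL-LCFL itself):

```python
import numpy as np

def knn_label_distribution(X_train, D_train, x, k=3, feature_mask=None):
    """Predict a label distribution as the mean distribution of the k
    nearest neighbours, optionally restricted to a subset of features
    (mimicking the local feature-label correlation idea)."""
    Xs = X_train if feature_mask is None else X_train[:, feature_mask]
    xs = x if feature_mask is None else x[feature_mask]
    dists = np.linalg.norm(Xs - xs, axis=1)   # Euclidean distance to x
    idx = np.argsort(dists)[:k]               # k nearest training instances
    return D_train[idx].mean(axis=0)          # average their distributions
```

Because each training distribution sums to one, the averaged prediction does too, which keeps the output a valid label distribution.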
Title: "Label Distribution Learning by Exploiting Feature-Label Correlations Locally"
Pub Date: 2021-12-01 | DOI: 10.1109/ICKG52313.2021.00020
Runnan Tan, Qingfeng Tan, Peng Zhang, Zhao Li
Blockchain technology has been widely applied across industries and has attracted broad attention. However, because of its anonymity, digital currency has become a haven for all kinds of cyber crime. Ethereum frauds reportedly yield huge profits and pose a serious threat to the financial security of the Ethereum network. To foster a sound financial environment, an effective method is urgently needed to automatically detect and identify Ethereum frauds as part of governing the Ethereum ecosystem. In view of this, this paper proposes a method for detecting Ethereum frauds by mining Ethereum transaction records. Specifically, web crawlers capture labeled fraudulent addresses, and a transaction network is reconstructed from the public transaction ledger. An amount-based network embedding algorithm is then proposed to extract node features for identifying fraudulent transactions. Finally, a graph convolutional network model classifies addresses as legitimate or fraudulent. Experimental results show that the system achieves an accuracy of 95%, reflecting its strong performance in detecting Ethereum fraudulent transactions.
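The graph convolutional network classifier mentioned above propagates features with the standard GCN rule (Kipf & Welling style). A minimal one-layer sketch over an adjacency matrix, not the paper's full pipeline:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN propagation step:
    H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W),
    where A is the (n, n) adjacency of the transaction network,
    H the (n, f) node features, and W an (f, f') weight matrix."""
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return np.maximum(d_inv_sqrt @ A_hat @ d_inv_sqrt @ H @ W, 0.0)
```

Stacking two such layers and ending with a softmax over two classes gives the legitimate-versus-fraudulent address classifier in outline.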
Title: "Graph Neural Network for Ethereum Fraud Detection"
Pub Date: 2021-12-01 | DOI: 10.1109/ICKG52313.2021.00018
Guixian Zhang, Boyan Chen, Lijuan Wu, Kui Zhang, Shichao Zhang
Limited by the irregularity of graph topology and the order independence of nodes, existing graph neural networks usually generate graph-level representations by simple aggregation or sorting of node features for graph classification. These models are usually not deep enough to extract more abstract semantic information, yet once the network deepens, over-smoothing easily occurs. To solve this problem, we propose a simple and efficient graph convolutional neural network based on DenseNet, called AEGCN (Aggregation Enhanced Graph Convolutional Network), for graph classification. We build a local extrema function, ELEConv (Enhanced Local Extrema Convolution), to reduce noise in graphs, and then generate a large number of reusable feature maps through dense links. Extensive experiments on four real-world datasets validate that AEGCN not only alleviates the over-smoothing problem but also achieves strong graph classification performance.
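A local-extrema style aggregation can be sketched as each node keeping the difference between its own features and its neighbourhood mean, which highlights locally extreme signal and suppresses smooth (noisy) components. This is an illustrative guess at the flavour of ELEConv, not its published definition:

```python
import numpy as np

def local_extrema_conv(A, H):
    """Local-extrema style aggregation (sketch):
    out_i = h_i - mean_{j in N(i)} h_j,
    for adjacency A (n, n) and node features H (n, f)."""
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1.0)  # avoid div by zero
    return H - (A @ H) / deg
```

A node whose feature equals its neighbourhood mean maps to zero, so repeated application does not drive all nodes toward the same value the way plain mean aggregation does.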
Title: "Aggregation Enhanced Graph Convolutional Network for Graph Classification"