2021 IEEE International Conference on Big Knowledge (ICBK): Latest Publications

Meta-path Enhanced Knowledge Graph Convolutional Network for Recommender Systems
Pub Date : 2021-12-01 DOI: 10.1109/ICKG52313.2021.00024
Ru Wang, Meng Wu, Shengwei Ji
A Knowledge Graph (KG) is a directed heterogeneous information network that contains a large number of entities and relations, and it is widely used as effective side information in recommender systems. Moreover, in recommender systems, the Graph Convolutional Network (GCN) model is introduced to mine the relatedness between entities in a KG because of its efficiency in extracting spatial features on topological graphs. The Knowledge Graph Convolutional Network (KGCN) model updates the embedding of a currently positioned entity by aggregating the information of adjacent entities selected randomly. Nevertheless, it has two limitations: 1) the information of randomly selected neighbors cannot accurately represent the current entity in the KG; 2) the model is hard to converge as graph features (i.e., the spatial relation features and semantic information features of entities in the KG) grow. To solve these limitations, in this paper, a meta-path (i.e., a sequence of artificially constructed relationships) is introduced into the selection of neighbors in the KGCN model to enhance the representation of each entity. Furthermore, two construction methods of the meta-path are proposed: constructing a meta-path based on the same relation (KGCN-SP) and based on the characteristics of the KG (KGCN-MP). Experiments on three real-world datasets demonstrate that neighbor selection based on the meta-path is able to collect more accurate information from a KG and improve the recommendation performance effectively.
Citations: 2
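The neighbor-selection idea in the abstract above can be illustrated with a minimal sketch. It assumes the KG is given as (head, relation, tail) triples and that a meta-path is simply a list of relation names to follow in order; the function names, the toy triples, and the relation names are hypothetical, and this is not the paper's KGCN-SP or KGCN-MP construction.

```python
from collections import defaultdict


def build_index(triples):
    """Map (head, relation) to the list of tails reachable in one hop."""
    index = defaultdict(list)
    for head, relation, tail in triples:
        index[(head, relation)].append(tail)
    return index


def metapath_neighbors(index, entity, metapath):
    """Return entities reached from `entity` by following `metapath` hop by hop."""
    frontier = {entity}
    for relation in metapath:
        frontier = {
            tail
            for node in frontier
            for tail in index.get((node, relation), [])
        }
    frontier.discard(entity)  # an entity is not its own neighbor
    return sorted(frontier)


# Toy KG (made-up data): two movies share a director.
triples = [
    ("m1", "directed_by", "d1"),
    ("d1", "directed", "m1"),
    ("d1", "directed", "m2"),
    ("m1", "has_genre", "g1"),
]
index = build_index(triples)
neighbors = metapath_neighbors(index, "m1", ["directed_by", "directed"])
```

In contrast to random sampling, the candidate set here is restricted to entities that are connected to the current entity through a chosen relation sequence.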
An Ensemble Latent Factor Model for Highly Accurate Web Service QoS Prediction
Pub Date : 2021-12-01 DOI: 10.1109/ICKG52313.2021.00055
Peng Zhang, Yi He, Di Wu
How to accurately predict quality of service (QoS) data is a great challenge in Web service selection or recommendation. To date, a latent factor (LF)-based QoS predictor is one of the most successful and popular approaches to address this challenge owing to its high efficiency and scalability. However, current LF-based QoS predictors are mostly developed on inner product space with an L2-norm-oriented loss function only, so they cannot comprehensively represent the characteristics of the target QoS data to make accurate predictions, as inner product space and the L2 norm have their respective limitations. To address this issue, this study proposes an ensemble LF (ELF) model. It rests on three ideas: 1) two kinds of LF models are developed as QoS predictors on inner product space and distance space, respectively; 2) both of these QoS predictors adopt an L1-and-L2-norm-oriented loss function; and 3) an ensemble of these two QoS predictors is built by a weighting strategy. By doing so, ELF integrates multiple merits originating from inner product space, distance space, the L1 norm, and the L2 norm, allowing it to achieve highly accurate and robust QoS prediction. Experiments on a real-world QoS dataset demonstrate that the proposed ELF model outperforms state-of-the-art QoS predictors in predicting the missing QoS data.
Citations: 0
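Two ingredients named in the abstract above, a loss that mixes L1 and L2 terms and a weighted ensemble of two predictors, can be sketched as follows. The mixing parameter `alpha`, the function names, and the example numbers are assumptions for illustration; the abstract does not specify how ELF combines the norms or chooses the ensemble weights.

```python
def l1_l2_loss(error, alpha=0.5):
    """Mix an L1 term (robust to outliers) with an L2 term (smooth near zero).

    `alpha` is a hypothetical mixing weight in [0, 1].
    """
    return alpha * abs(error) + (1.0 - alpha) * error ** 2


def weighted_ensemble(predictions, weights):
    """Combine base predictions with non-negative weights, normalized to sum to 1."""
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, predictions)) / total


# Made-up predictions from two base models for one missing QoS value.
inner_product_prediction = 11.0
distance_prediction = 5.0
combined = weighted_ensemble(
    [inner_product_prediction, distance_prediction], weights=[1.0, 3.0]
)
```

A larger weight on one base predictor pulls the combined estimate toward it, which is the basic mechanism any weighting strategy relies on.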
Implicit Business Competitor Inference Using Heterogeneous Knowledge Graph
Pub Date : 2021-12-01 DOI: 10.1109/ICKG52313.2021.00035
Wei Qin, Xiangfeng Luo, Hao Wang
Competitor inference is the task of identifying current or potential competitors given their primary markets and Business Scope. Previous methods have achieved remarkable success on explicit competitor inference using state-of-the-art natural language processing (NLP) techniques, mainly relying on comparative expressions. However, those methods lack interpretability and cannot identify implicit competitors when the text does not explicitly mention competitive relationships. To remedy these problems, in this paper, we propose a probabilistic graphical model which leverages a heterogeneous enterprise knowledge graph containing both structured information, e.g., Product Analysis and Sales Territory, and unstructured information, e.g., Business Scope. The model is defined with first-order logic rules using the declarative language of Probabilistic Soft Logic (PSL). As a result, our model is able to predict implicit competitors while providing pieces of interpretable evidence. Experimental results show that our approach is significantly superior to previous methods.
Citations: 2
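For readers unfamiliar with PSL, the sketch below shows how a soft logic rule is scored. PSL relaxes Boolean logic with the Lukasiewicz operators, so truth values lie in [0, 1] and a rule `body -> head` is penalized by its distance to satisfaction. The rule and the truth values used here are hypothetical and are not taken from the paper.

```python
def soft_and(a, b):
    """Lukasiewicz conjunction of two truth values in [0, 1]."""
    return max(0.0, a + b - 1.0)


def distance_to_satisfaction(body, head):
    """How far the rule `body -> head` is from being satisfied (0 means satisfied)."""
    return max(0.0, body - head)


# Hypothetical rule: SameTerritory(A, B) & SimilarScope(A, B) -> Competitor(A, B)
same_territory = 0.9   # made-up truth value
similar_scope = 0.8    # made-up truth value
competitor = 0.4       # current soft truth value of the head atom

body = soft_and(same_territory, similar_scope)
penalty = distance_to_satisfaction(body, competitor)
```

Inference then looks for head values that keep the weighted sum of such penalties small, and the rules that fire for a prediction serve as its evidence.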
Knowledge Distillation via Weighted Ensemble of Teaching Assistants
Pub Date : 2021-12-01 DOI: 10.1109/ICKG52313.2021.00014
Durga Prasad Ganta, Himel Das Gupta, Victor S. Sheng
Knowledge distillation in machine learning is the process of transferring knowledge from a large model, called the teacher, to a smaller model, called the student. It is one of the techniques for compressing a large network (teacher) into a smaller network (student) that can be deployed on small devices such as mobile phones. When the network size gap between the teacher and student increases, the performance of the student network decreases. To solve this problem, an intermediate model, known as the teaching assistant model, is employed between the teacher model and the student model, which in turn bridges the gap between the teacher and the student. In this research, we show that using multiple teaching assistant models, the student model (the smaller model) can be further improved. We combine these multiple teaching assistant models using weighted ensemble learning, where a differential evaluation optimization algorithm generates the weight values.
Citations: 1
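The weighted-ensemble step in the abstract above can be sketched as follows: each teaching assistant produces class probabilities (softened with a temperature, as is common in distillation), and the ensemble target is their weighted average. The weights below are fixed placeholders; in the abstract they come from an optimization algorithm, which is not reproduced here. The logits and function names are made up for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives softer probabilities."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_ensemble(prob_lists, weights):
    """Average several probability vectors with weights normalized to sum to 1."""
    total = sum(weights)
    n_classes = len(prob_lists[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists)) / total
        for i in range(n_classes)
    ]


# Made-up logits from two teaching assistants for one input.
ta_probs = [softmax([2.0, 0.5, 0.1], temperature=2.0),
            softmax([1.0, 1.5, 0.2], temperature=2.0)]
soft_target = weighted_ensemble(ta_probs, weights=[0.7, 0.3])
```

The student would then be trained against `soft_target` instead of a single teaching assistant's output.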
Accelerating Learning Bayesian Network Structures by Reducing Redundant CI Tests
Pub Date : 2021-12-01 DOI: 10.1109/ICKG52313.2021.00016
Wentao Hu, Shuai Yang, Xianjie Guo, Kui Yu
Constraint-based methods are among the most important approaches to learning Bayesian network (BN) structures from observational data with conditional independence (CI) tests. In this paper, we find that existing constraint-based methods often perform many redundant CI tests, which significantly reduces the learning efficiency of those algorithms. To tackle this issue, we propose a novel framework to accelerate BN structure learning by reducing redundant CI tests without sacrificing accuracy. Specifically, we first design a CI test cache table to store CI tests. If a CI test has been computed before, its result is obtained from the table instead of being computed again. If not, the CI test is computed and stored in the table. Then, based on the table, we propose two CI test cache table based PC (CTPC) learning frameworks for reducing redundant CI tests in BN structure learning. Finally, we instantiate the proposed frameworks with existing well-established local and global BN structure learning algorithms. Extensive experiments on twelve benchmark BNs demonstrate that the proposed frameworks can significantly accelerate existing BN structure learning algorithms without sacrificing accuracy.
Citations: 1
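The cache-table mechanism described in the abstract above amounts to memoizing CI test results. A minimal sketch follows, assuming a CI test is a callable taking two variables and a conditioning set; the class name, the key layout, and the stand-in test are hypothetical and not the paper's CTPC design. The key treats the variable pair as unordered because conditional independence is symmetric in its two arguments.

```python
class CITestCache:
    """Wrap a CI test so that each distinct query is computed only once."""

    def __init__(self, ci_test):
        self._ci_test = ci_test
        self._table = {}
        self.computed = 0  # number of real CI tests run
        self.hits = 0      # number of results served from the table

    def test(self, x, y, cond_set=()):
        # Unordered pair plus unordered conditioning set identify the query.
        key = (frozenset((x, y)), frozenset(cond_set))
        if key in self._table:
            self.hits += 1
            return self._table[key]
        self.computed += 1
        result = self._ci_test(x, y, tuple(sorted(cond_set)))
        self._table[key] = result
        return result


calls = []


def fake_ci_test(x, y, cond):
    """Stand-in for a statistical test; records each real invocation."""
    calls.append((x, y, cond))
    return True


cache = CITestCache(fake_ci_test)
cache.test("A", "B", ["C"])
cache.test("B", "A", ("C",))  # same query with the pair reversed
```

A structure learner would call `cache.test` wherever it previously called the statistical test directly.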
ICBK 2021 Programme Committee
Pub Date : 2021-12-01 DOI: 10.1109/ickg52313.2021.00007
Citations: 0
Research on Crowdsourcing Truth Inference Method Based on Graph Embedding
Pub Date : 2021-12-01 DOI: 10.1109/ICKG52313.2021.00036
Liangzhu Zhou, Xingrui Zhuo, Gongqing Wu, Zan Zhang, Xianyu Bao
Crowdsourcing is a cheap and popular method to solve problems that are difficult for computers to handle. Due to the differences in ability among workers on crowdsourcing platforms, existing research uses aggregation strategies to deal with the labels of different workers to improve the utility of crowdsourcing data. However, most of these studies are based on probabilistic graphical models, which have problems such as difficulty in setting initial parameters. This paper proposes a novel crowdsourcing method, Truth Inference based on Graph Embedding (TIGE), for single-choice questions. The method draws on the idea of the graph autoencoder, constructs feature vectors for each crowdsourcing task, embeds the relationship between crowdsourcing tasks and workers in graphs, and then uses graph neural networks to convert crowdsourcing problems into graph node prediction problems. The feature vectors are continuously optimized in the convolutional layer to obtain the final result. Compared with six state-of-the-art algorithms on real-world datasets, our method has significant advantages in accuracy and F1-score.
Citations: 0
A Novel Homophily-aware Correction Approach for Crowdsourced Labels Using Information Entropy
Pub Date : 2021-12-01 DOI: 10.1109/ICKG52313.2021.00013
Kang Yan, Jian Lu, Qingren Wang, Wei Li
Crowdsourcing provides a cost-effective and convenient way to collect labels. However, it fails to guarantee the quality of crowdsourced labels. Inspired by homophily in social networks, which denotes the tendency of individuals with similar characteristics to be friends with each other, in this paper we propose a novel Homophily-aware Correction Approach for crowdsourced labels using Information Entropy (namely HaCAIE), to further improve the quality of crowdsourced labels. Specifically, HaCAIE can be decomposed into three phases: i) seeking full semantic relations among entities, where HaCAIE models multiple explicit and implicit semantic relations among labelers, tasks and categories, based on homogeneous information network and related techniques; ii) calculating homophily, where HaCAIE utilizes adjacent relation matrices of labelers and tasks to calculate homophily among labelers; and iii) correcting labels, where for each task, HaCAIE employs information entropy and constructs a corresponding star homophily network to perform label correction. Our experimental results on six real-world datasets not only show that HaCAIE performs well, but also demonstrate that HaCAIE can collaborate well with different inference algorithms in the field of crowdsourcing.
Citations: 0
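The abstract above does not say how HaCAIE applies information entropy, so the sketch below only shows the quantity itself: the Shannon entropy of the labels that a task received, which is low when labelers agree and high when they disagree. The function name and the example labels are hypothetical.

```python
import math
from collections import Counter


def label_entropy(labels):
    """Shannon entropy (in bits) of the empirical label distribution of one task."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Made-up crowdsourced labels for two tasks.
agreed = label_entropy(["cat", "cat", "cat"])
split = label_entropy(["cat", "dog"])
```

A task with high entropy is a natural candidate for correction, since its collected labels carry little consensus.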
Influence Maximization Using User Connectivity Guarantee in Social Networks
Pub Date : 2021-12-01 DOI: 10.1109/ICKG52313.2021.00056
Xiyu Qiao, Yuliang Ma, Yelie Yuan, Xiangmin Zhou
With the rapid development of social networks, the influence maximization problem has attracted more and more attention from academia and industry. Its aim is to find a set of nodes as seeds to spread the influence as widely as possible. However, most existing research neglects the connectivity of seeds, which affects the process of information diffusion. In this paper, we propose a novel problem, connectivity guaranteed influence maximization, which suggests a fixed number of new links to the seed set with the aim of maximizing the influence of seed nodes while guaranteeing the connectivity of the induced subgraphs consisting of active nodes. To tackle this problem, we propose a Connectivity Guaranteed Influence Maximization (CGIM) algorithm based on user connectivity and link recommendation. Specifically, the Jaccard coefficient is first used to calculate the influence between users. Then a Connectivity Guarantee based Link Addition (CGLA) algorithm is proposed to keep the connectivity of the induced subgraphs formed by all active nodes after influence propagation. Following that, an improved approximate influence maximization algorithm is proposed to maximize the influence by recommending a number of new links to the seed set. Experimental results on real social network datasets show that the proposed CGIM algorithm can maximize the influence of seed nodes while guaranteeing user connectivity, and that it has good performance and scalability.
Citations: 0
A Genetic Algorithm for Residual Static Correction
Pub Date : 2021-12-01 DOI: 10.1109/ICKG52313.2021.00069
Miao Wu, Shulin Pan, Fan Min
Residual static correction is a necessary step for improving resolution in seismic exploration. It is a challenging task because a large number of parameters need to be adjusted. Some machine learning methods have been proposed for this problem, but their results leave room for improvement. In this paper, we propose the genetic-based residual static correction (GBRS) algorithm, which uses three techniques. First, the original encodings are generated by performing floating-point encoding on the offset of each point. Second, new encodings are constructed through paired crossover on the original ones. Third, the fitness function is used to select which encodings become the new originals, driving the evolution of the population. Experimental data with 50 shots and 50 receivers are generated using a simulation model. Results show that our algorithm usually converges to the optimal solution in fewer than 100 iterations.
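The three steps named in the abstract (floating-point encoding, paired crossover, fitness-based selection) can be illustrated with a generic genetic-algorithm loop. The following is only a minimal sketch under stated assumptions, not the authors' GBRS implementation: the names `toy_fitness`, `paired_crossover`, and `evolve` are hypothetical, the fitness is a toy stand-in (negative squared distance to a known target vector) rather than a seismic objective, and the blend-style crossover and keep-the-best selection are assumed choices that the abstract does not specify.

```python
import random


def toy_fitness(encoding, target):
    # Hypothetical stand-in for a real objective: higher is better,
    # and the maximum (0.0) is reached when encoding equals target.
    return -sum((e - t) ** 2 for e, t in zip(encoding, target))


def paired_crossover(a, b, rng):
    # Blend crossover on one pair of floating-point encodings (assumed variant).
    w = rng.random()
    child1 = [w * x + (1 - w) * y for x, y in zip(a, b)]
    child2 = [(1 - w) * x + w * y for x, y in zip(a, b)]
    return child1, child2


def evolve(target, pop_size=40, generations=100, bound=10.0, seed=0):
    rng = random.Random(seed)
    n = len(target)
    # Step 1: floating-point encoding, one real-valued offset per point.
    population = [[rng.uniform(-bound, bound) for _ in range(n)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Step 2: pair up the current encodings and build new ones by crossover.
        rng.shuffle(population)
        children = []
        for a, b in zip(population[0::2], population[1::2]):
            children.extend(paired_crossover(a, b, rng))
        # Step 3: rank old and new encodings by fitness and keep the best.
        merged = population + children
        merged.sort(key=lambda e: toy_fitness(e, target), reverse=True)
        population = merged[:pop_size]
        history.append(toy_fitness(population[0], target))
    return population[0], history


if __name__ == "__main__":
    best, history = evolve([1.5, -2.0, 0.5, 3.0], seed=1)
    print("best encoding:", best)
    print("best fitness per generation (last 3):", history[-3:])
```

Because selection here keeps the best of the old and new encodings together, the best fitness never decreases from one generation to the next. A practical genetic algorithm would typically also include a mutation step to keep exploring; it is left out so the sketch mirrors only the three steps the abstract lists.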
Citations: 0