SGKD: A Scalable and Effective Knowledge Distillation Framework for Graph Representation Learning

Yufei He, Yao Ma
{"title":"SGKD:用于图表示学习的可扩展和有效的知识蒸馏框架","authors":"Yufei He, Yao Ma","doi":"10.1109/ICDMW58026.2022.00091","DOIUrl":null,"url":null,"abstract":"As Graph Neural Networks (GNNs) are widely used in various fields, there is a growing demand for improving their efficiency and scalablity. Knowledge Distillation (KD), a classical methods for model compression and acceleration, has been gradually introduced into the field of graph learning. More recently, it has been shown that, through knowledge distillation, the predictive capability of a well-trained GNN model can be transferred to lightweight and easy-to-deploy MLP models. Such distilled MLPs are able to achieve comparable performance as their corresponding G NN teachers while being significantly more efficient in terms of both space and time. However, the research of KD for graph learning is still in its early stage and there exist several limitations in the existing KD framework. The major issues lie in distilled MLPs lack useful information about the graph structure and logits of teacher are not always reliable. In this paper, we propose a Scalable and effective graph neural network Knowledge Distillation framework (SGKD) to address these issues. Specifically, to include the graph, we use feature propagation as preprocessing to provide MLPs with graph structure-aware features in the original feature space; to address unreliable logits of teacher, we introduce simple yet effective training strategies such as masking and temperature. With these innovations, our framework is able to be more effective while remaining scalable and efficient in training and inference. We conducted comprehensive experiments on eight datasets of different sizes - up to 100 million nodes - under various settings. The results demonstrated that SG KD is able to significantly outperform existing KD methods and even achieve comparable performance with their state-of-the-art GNN teachers.","PeriodicalId":146687,"journal":{"name":"2022 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"SGKD: A Scalable and Effective Knowledge Distillation Framework for Graph Representation Learning\",\"authors\":\"Yufei He, Yao Ma\",\"doi\":\"10.1109/ICDMW58026.2022.00091\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"As Graph Neural Networks (GNNs) are widely used in various fields, there is a growing demand for improving their efficiency and scalablity. Knowledge Distillation (KD), a classical methods for model compression and acceleration, has been gradually introduced into the field of graph learning. More recently, it has been shown that, through knowledge distillation, the predictive capability of a well-trained GNN model can be transferred to lightweight and easy-to-deploy MLP models. Such distilled MLPs are able to achieve comparable performance as their corresponding G NN teachers while being significantly more efficient in terms of both space and time. However, the research of KD for graph learning is still in its early stage and there exist several limitations in the existing KD framework. The major issues lie in distilled MLPs lack useful information about the graph structure and logits of teacher are not always reliable. 
In this paper, we propose a Scalable and effective graph neural network Knowledge Distillation framework (SGKD) to address these issues. Specifically, to include the graph, we use feature propagation as preprocessing to provide MLPs with graph structure-aware features in the original feature space; to address unreliable logits of teacher, we introduce simple yet effective training strategies such as masking and temperature. With these innovations, our framework is able to be more effective while remaining scalable and efficient in training and inference. We conducted comprehensive experiments on eight datasets of different sizes - up to 100 million nodes - under various settings. The results demonstrated that SG KD is able to significantly outperform existing KD methods and even achieve comparable performance with their state-of-the-art GNN teachers.\",\"PeriodicalId\":146687,\"journal\":{\"name\":\"2022 IEEE International Conference on Data Mining Workshops (ICDMW)\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE International Conference on Data Mining Workshops (ICDMW)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDMW58026.2022.00091\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Conference on Data Mining Workshops (ICDMW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDMW58026.2022.00091","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1

Abstract

As Graph Neural Networks (GNNs) are widely used in various fields, there is a growing demand for improving their efficiency and scalability. Knowledge Distillation (KD), a classical method for model compression and acceleration, has gradually been introduced into the field of graph learning. More recently, it has been shown that, through knowledge distillation, the predictive capability of a well-trained GNN model can be transferred to lightweight and easy-to-deploy MLP models. Such distilled MLPs achieve performance comparable to their corresponding GNN teachers while being significantly more efficient in both space and time. However, research on KD for graph learning is still in its early stages, and existing KD frameworks have several limitations. The major issues are that distilled MLPs lack useful information about the graph structure and that the teacher's logits are not always reliable. In this paper, we propose a Scalable and effective graph neural network Knowledge Distillation framework (SGKD) to address these issues. Specifically, to incorporate the graph structure, we use feature propagation as a preprocessing step to provide MLPs with structure-aware features in the original feature space; to address the teacher's unreliable logits, we introduce simple yet effective training strategies such as masking and temperature. With these innovations, our framework is more effective while remaining scalable and efficient in both training and inference. We conducted comprehensive experiments on eight datasets of different sizes, up to 100 million nodes, under various settings. The results demonstrate that SGKD significantly outperforms existing KD methods and even achieves performance comparable to its state-of-the-art GNN teachers.
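To make the two ingredients mentioned in the abstract more concrete, the following is a minimal, hedged sketch in PyTorch of (a) SGC-style feature propagation used as a one-off preprocessing step and (b) a distillation loss for an MLP student that combines temperature-scaled soft targets with a confidence mask on the teacher's logits. The helper names (`propagate_features`, `distillation_loss`), the propagation depth `k`, and the confidence threshold are illustrative assumptions, not the authors' implementation; the paper may define "masking" and the loss weighting differently.

```python
# Illustrative sketch only (not the authors' released code): feature
# propagation as preprocessing, then distilling a trained GNN teacher's
# logits into an MLP student with temperature scaling and a reliability mask.
import torch
import torch.nn.functional as F


def propagate_features(x, adj_norm, k=2):
    """Precompute structure-aware features X_hat = A_norm^k X.

    `adj_norm` is assumed to be a (possibly sparse) symmetrically normalized
    adjacency matrix with self-loops; propagation runs once, before MLP training,
    so the MLP never touches the graph at training or inference time.
    """
    for _ in range(k):
        x = torch.sparse.mm(adj_norm, x) if adj_norm.is_sparse else adj_norm @ x
    return x


def distillation_loss(student_logits, teacher_logits, labels, tau=2.0, conf_thresh=0.8):
    """Cross-entropy on labels plus temperature-scaled KD on masked teacher logits."""
    # Soft targets from the teacher, softened by temperature tau.
    t_prob = F.softmax(teacher_logits / tau, dim=-1)
    s_logprob = F.log_softmax(student_logits / tau, dim=-1)
    kd = F.kl_div(s_logprob, t_prob, reduction="none").sum(dim=-1) * tau * tau

    # Ignore nodes where the teacher itself is not confident (one possible
    # reading of "masking" unreliable teacher logits).
    mask = t_prob.max(dim=-1).values >= conf_thresh
    kd = (kd * mask).sum() / mask.sum().clamp(min=1)

    # Standard supervised loss on the labeled nodes.
    ce = F.cross_entropy(student_logits, labels)
    return ce + kd
```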