Pub Date: 2023-09-01 | DOI: 10.1007/s41019-023-00223-w
Kangfei Zhao, Zongyan He, Jeffrey Xu Yu, Yu Rong
Abstract: Deep Learning (DL) has been widely used in many applications, and its success relies on large training data. A key issue is how to provide a DL solution when no large training data is available to learn from initially. In this paper, we explore a meta-learning approach for a specific problem, subgraph isomorphism counting, a fundamental problem in graph analysis that counts the subgraphs of a data graph, g, that match a given pattern graph, p. There are various data graphs and pattern graphs. A subgraph isomorphism counting query is specified by a pair, (g, p). This problem is NP-hard and, by nature, needs large training data to be learned by DL. We design a Gaussian Process (GP) model that combines a Graph Neural Network with Bayesian nonparametrics, and we train the GP by a meta-learning algorithm on a small set of training data. Through meta-learning, we obtain a generalized meta-model that better encodes the information of data and pattern graphs and captures the prior of small tasks. With the meta-model learned, we handle a collection of pairs (g, p) as a task, where some pairs may be associated with ground truth and the others are queries to answer. There are two cases: some pairs have ground truth (few-shot), or none do (zero-shot). We provide solutions for both. In particular, for zero-shot, we propose a new data-driven approach to predict the count values. Note that zero-shot learning for our regression tasks is difficult, and there is no readily available solution in the literature. We conducted extensive experimental studies to confirm that our approach is robust to model degeneration on small training data and that our meta-model can adapt quickly to new queries by few-shot and zero-shot learning.
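To make the query (g, p) in the abstract concrete, the following is a minimal brute-force sketch of what a subgraph isomorphism count is. It is not the paper's GP/meta-learning method; it only illustrates the ground-truth quantity such a model would be trained to predict. It assumes undirected, unlabeled graphs given as edge lists, and it counts injective vertex mappings that preserve every pattern edge (so each matched subgraph is counted once per automorphism of the pattern). The function name and the example graphs are hypothetical.

```python
from itertools import permutations


def count_subgraph_isomorphisms(g_edges, p_edges):
    """Count injective mappings from pattern vertices to data-graph vertices
    under which every pattern edge lands on a data-graph edge.

    Assumptions (illustrative only): graphs are undirected and unlabeled,
    and every vertex appears in at least one edge.
    """
    g_edge_set = {frozenset(e) for e in g_edges}
    g_nodes = sorted({v for e in g_edges for v in e})
    p_nodes = sorted({v for e in p_edges for v in e})

    count = 0
    # Enumerate every ordered choice of distinct data vertices; this is
    # exponential in the pattern size, which is why exact counting is costly.
    for image in permutations(g_nodes, len(p_nodes)):
        mapping = dict(zip(p_nodes, image))
        if all(frozenset((mapping[u], mapping[v])) in g_edge_set
               for u, v in p_edges):
            count += 1
    return count


# Data graph g: a 4-cycle 0-1-2-3 with one chord 0-2.
g = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
# Pattern graph p: a triangle.
p = [("a", "b"), ("b", "c"), ("c", "a")]

print(count_subgraph_isomorphisms(g, p))
```

Here g contains two triangles, {0, 1, 2} and {0, 2, 3}, and a triangle has six automorphisms, so the call prints 12. Dividing by the pattern's automorphism count would give the number of distinct matched subgraphs instead.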
Title: Learning with Small Data: Subgraph Counting Queries (Data Science and Engineering)
Pub Date: 2023-08-29 | DOI: 10.1007/s41019-023-00225-8
Junyang Chen, Ziyi Chen, Mengzhu Wang, Ge Fan, Guo Zhong, Ou Liu, Wenfeng Du, Zhenghua Xu, Zhiguo Gong
Title: A Neural Inference of User Social Interest for Item Recommendation (Data Science and Engineering)
Pub Date: 2023-05-05 | DOI: 10.1007/s41019-023-00214-x
Zi-Yuan Chen, Xin Wang, Chenxu Wang, Zhao Li
Title: PosKHG: A Position-Aware Knowledge Hypergraph Model for Link Prediction (Data Science and Engineering)
Pub Date: 2023-04-26 | DOI: 10.1007/s41019-023-00209-8
Di Zhu, Hai-Lian Yin, Yi Xu, Jiaqi Wu, Bowen Zhang, Yang Cheng, Zhanzuo Yin, Ziqiang Yu, Hao Wen, Bo-wen Li
Title: A Survey of Advanced Information Fusion System: from Model-Driven to Knowledge-Enabled (Data Science and Engineering)
Pub Date: 2023-04-26 | DOI: 10.1007/s41019-023-00213-y
Anis El Rabaa, Shady Elbassuoni, Jihad Hanna, A. E. Mouawad, Ayham Olleik, S. Amer-Yahia
Title: A Framework to Maximize Group Fairness for Workers on Online Labor Platforms (Data Science and Engineering)
Pub Date: 2023-04-24 | DOI: 10.1007/s41019-023-00210-1
Bo Ning, Deji Zhao, Xinjian Zhang, Chao Wang, Shuangyong Song
Title: UMP-MG: A Uni-directed Message-Passing Multi-label Generation Model for Hierarchical Text Classification (Data Science and Engineering)