Efficient Network Representation Learning via Cluster Similarity
Pub Date : 2023-09-01 DOI: 10.1007/s41019-023-00222-x
Yasuhiro Fujiwara, Yasutoshi Ida, Atsutoshi Kumagai, Masahiro Nakano, Akisato Kimura, Naonori Ueda
Abstract Network representation learning is a de facto tool for graph analytics. The mainstream of previous approaches is to factorize the proximity matrix between nodes. However, if n is the number of nodes, the proximity matrix has size $n \times n$, so network representation learning needs $O(n^3)$ time and $O(n^2)$ space; these costs are prohibitively high for large-scale graphs. This paper introduces the novel idea of using similarities between clusters instead of proximities between nodes; the proposed approach computes the representations of clusters from the similarities between clusters and then computes the representations of nodes by referring to them. If l is the number of clusters, since $l \ll n$, we can efficiently obtain the representations of clusters from a small $l \times l$ similarity matrix. Furthermore, since nodes in each cluster share similar structural properties, we can effectively compute the representation vectors of nodes. Experiments show that our approach can perform network representation learning more efficiently and effectively than existing approaches.
{"title":"Efficient Network Representation Learning via Cluster Similarity","authors":"Yasuhiro Fujiwara, Yasutoshi Ida, Atsutoshi Kumagai, Masahiro Nakano, Akisato Kimura, Naonori Ueda","doi":"10.1007/s41019-023-00222-x","DOIUrl":"https://doi.org/10.1007/s41019-023-00222-x","url":null,"abstract":"Abstract Network representation learning is a de facto tool for graph analytics. The mainstream of the previous approaches is to factorize the proximity matrix between nodes. However, if n is the number of nodes, since the size of the proximity matrix is $$n times n$$ <mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\"> <mml:mrow> <mml:mi>n</mml:mi> <mml:mo>×</mml:mo> <mml:mi>n</mml:mi> </mml:mrow> </mml:math> , it needs $$O(n^3)$$ <mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\"> <mml:mrow> <mml:mi>O</mml:mi> <mml:mo>(</mml:mo> <mml:msup> <mml:mi>n</mml:mi> <mml:mn>3</mml:mn> </mml:msup> <mml:mo>)</mml:mo> </mml:mrow> </mml:math> time and $$O(n^2)$$ <mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\"> <mml:mrow> <mml:mi>O</mml:mi> <mml:mo>(</mml:mo> <mml:msup> <mml:mi>n</mml:mi> <mml:mn>2</mml:mn> </mml:msup> <mml:mo>)</mml:mo> </mml:mrow> </mml:math> space to perform network representation learning; they are significantly high for large-scale graphs. This paper introduces the novel idea of using similarities between clusters instead of proximities between nodes; the proposed approach computes the representations of the clusters from similarities between clusters and computes the representations of nodes by referring to them. If l is the number of clusters, since $$l ll n$$ <mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\"> <mml:mrow> <mml:mi>l</mml:mi> <mml:mo>≪</mml:mo> <mml:mi>n</mml:mi> </mml:mrow> </mml:math> , we can efficiently obtain the representations of clusters from a small $$l times l$$ <mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\"> <mml:mrow> <mml:mi>l</mml:mi> <mml:mo>×</mml:mo> <mml:mi>l</mml:mi> </mml:mrow> </mml:math> similarity matrix. Furthermore, since nodes in each cluster share similar structural properties, we can effectively compute the representation vectors of nodes. Experiments show that our approach can perform network representation learning more efficiently and effectively than existing approaches.","PeriodicalId":52220,"journal":{"name":"Data Science and Engineering","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135298458","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning with Small Data: Subgraph Counting Queries
Pub Date : 2023-09-01 DOI: 10.1007/s41019-023-00223-w
Kangfei Zhao, Zongyan He, Jeffrey Xu Yu, Yu Rong
Abstract Deep Learning (DL) has been widely used in many applications, and its success is achieved with large training data. A key issue is how to provide a DL solution when there is no large training data to learn from initially. In this paper, we explore a meta-learning approach for a specific problem, subgraph isomorphism counting, a fundamental problem in graph analysis: given a pattern graph p and a data graph g, count the number of subgraphs of g that match p. There are various data graphs and pattern graphs, and a subgraph isomorphism counting query is specified by a pair (g, p). This problem is NP-hard and inherently needs large training data to learn by DL. We design a Gaussian Process (GP) model that combines a Graph Neural Network with Bayesian nonparametrics, and we train the GP by a meta-learning algorithm on a small set of training data. By meta-learning, we can obtain a generalized meta-model that better encodes the information of data and pattern graphs and captures the prior of small tasks. With the learned meta-model, we handle a collection of pairs (g, p) as a task, where some pairs may be associated with the ground-truth and some pairs are the queries to answer. There are two cases: either some pairs have ground-truth (few-shot), or none do (zero-shot). We provide solutions for both. In particular, for zero-shot, we propose a new data-driven approach to predict the count values. Note that zero-shot learning for our regression tasks is difficult, and there is no hands-on solution in the literature. We conducted extensive experimental studies to confirm that our approach is robust to model degeneration on small training data and that our meta-model can fast adapt to new queries by few-shot and zero-shot learning.
{"title":"Learning with Small Data: Subgraph Counting Queries","authors":"Kangfei Zhao, Zongyan He, Jeffrey Xu Yu, Yu Rong","doi":"10.1007/s41019-023-00223-w","DOIUrl":"https://doi.org/10.1007/s41019-023-00223-w","url":null,"abstract":"Abstract Deep Learning (DL) has been widely used in many applications, and its success is achieved with large training data. A key issue is how to provide a DL solution when there is no large training data to learn initially. In this paper, we explore a meta-learning approach for a specific problem, subgraph isomorphism counting, which is a fundamental problem in graph analysis to count the number of a given pattern graph, p , in a data graph, g , that matches p . There are various data graphs and pattern graphs. A subgraph isomorphism counting query is specified by a pair, ( g , p ). This problem is NP-hard and needs large training data to learn by DL in nature. We design a Gaussian Process (GP) model which combines Graph Neural Network with Bayesian nonparametric, and we train the GP by a meta-learning algorithm on a small set of training data. By meta-learning, we can obtain a generalized meta-model to better encode the information of data and pattern graphs and capture the prior of small tasks. With the meta-model learned, we handle a collection of pairs ( g , p ), as a task, where some pairs may be associated with the ground-truth, and some pairs are the queries to answer. There are two cases. One is there are some with ground-truth (few-shot), and one is there is none with ground-truth (zero-shot). We provide our solutions for both. In particular, for zero-shot, we propose a new data-driven approach to predict the count values. Note that zero-shot learning for our regression tasks is difficult, and there is no hands-on solution in the literature. We conducted extensive experimental studies to confirm that our approach is robust to model degeneration on small training data, and our meta-model can fast adapt to new queries by few-shot and zero-shot learning.","PeriodicalId":52220,"journal":{"name":"Data Science and Engineering","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136355012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Neural Inference of User Social Interest for Item Recommendation
Pub Date : 2023-08-29 DOI: 10.1007/s41019-023-00225-8
Junyang Chen, Ziyi Chen, Mengzhu Wang, Ge Fan, Guo Zhong, Ou Liu, Wenfeng Du, Zhenghua Xu, Zhiguo Gong
{"title":"A Neural Inference of User Social Interest for Item Recommendation","authors":"Junyang Chen, Ziyi Chen, Mengzhu Wang, Ge Fan, Guo Zhong, Ou Liu, Wenfeng Du, Zhenghua Xu, Zhiguo Gong","doi":"10.1007/s41019-023-00225-8","DOIUrl":"https://doi.org/10.1007/s41019-023-00225-8","url":null,"abstract":"","PeriodicalId":52220,"journal":{"name":"Data Science and Engineering","volume":"65 1","pages":"223 - 233"},"PeriodicalIF":4.2,"publicationDate":"2023-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80204113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}