Pub Date: 2024-08-08, DOI: 10.1109/TKDE.2024.3440654
Mang Ye;Yi Yu;Ziqin Shen;Wei Yu;Qingyan Zeng
The rising popularity of tabular data in data science applications has led to a surge of interest in using deep neural networks (DNNs) to address tabular problems. Existing DNN methods do not effectively handle two fundamental challenges inherent in tabular data: permutation invariance (where the labels remain unchanged regardless of element order) and local dependency (where predictive labels are solely determined by local features). Furthermore, given the inherent heterogeneity among elements in tabular data, effectively capturing heterogeneous feature interactions remains an open problem. In this paper, we propose a novel Multiplex Cross-Feature Interaction Network (MPCFIN) that explicitly and systematically models feature relations with interactive graph neural networks. Specifically, MPCFIN first learns the features most relevant to each individual feature and merges them to form a cross-feature embedding. Subsequently, we design a multiplex graph neural network to learn an enhanced representation for each sample. Comprehensive experiments on seven datasets demonstrate that MPCFIN outperforms deep neural network methods in modeling tabular data, and that its cross-feature embedding module offers consistent interpretability in medical diagnosis applications.
Cross-Feature Interactive Tabular Data Modeling With Multiplex Graph Neural Networks. Mang Ye; Yi Yu; Ziqin Shen; Wei Yu; Qingyan Zeng. IEEE Transactions on Knowledge and Data Engineering, vol. 36, no. 12, pp. 7851-7864. Published 2024-08-08.
Federated Trajectory Matching (FTM) is gaining increasing importance in big trajectory data analytics, supporting diverse applications such as public health, law enforcement, and emergency response. FTM retrieves trajectories that match a query trajectory from a large-scale trajectory database, while safeguarding the privacy of trajectories in both the query and the database. A naive solution to FTM is to process the query through Secure Multi-party Computation (SMC) across the entire database, which is inherently secure yet inevitably slow due to the massive number of secure operations. A promising acceleration strategy is to filter irrelevant trajectories from the database based on the query, thus reducing the number of SMC operations. However, a key challenge is how to publish the query in a way that both preserves privacy and enables efficient trajectory filtering. In this paper, we design $\mathsf{GIST}$