Contextual Inference From Sparse Shopping Transactions Based on Motif Patterns
Jiayun Zhang; Xinyang Zhang; Dezhi Hong; Rajesh K. Gupta; Jingbo Shang
IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 2, pp. 572-583. Published 2025-01-10. DOI: 10.1109/TKDE.2024.3452638
https://ieeexplore.ieee.org/document/10837578/
Abstract
Inferring contextual information such as demographics from historical transactions is valuable to public agencies and businesses. Existing methods are data-hungry and do not work well when the available records of transactions are sparse. We consider here specifically inference of demographic information using limited historical grocery transactions from a few random trips that a typical business or public service organization may see. We propose a novel method called DemoMotif to build a network model from heterogeneous data and identify subgraph patterns (i.e., motifs) that enable us to infer demographic attributes. We then design a novel motif context selection algorithm to find specific node combinations significant to certain demographic groups. Finally, we learn representations of households using these selected motif instances as context, and employ a standard classifier (e.g., SVM) for inference. For evaluation purposes, we use three real-world consumer datasets, spanning different regions and time periods in the U.S. We evaluate the framework for predicting three attributes: ethnicity, seniority of household heads, and presence of children. Extensive experiments and case studies demonstrate that DemoMotif is capable of inferring household demographics using only a small number (e.g., fewer than 10) of random grocery trips, significantly outperforming the state-of-the-art.
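The pipeline outlined in the abstract (enumerate motif instances, select those significant to a demographic group, then use them as household context) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' DemoMotif implementation: the toy households and labels are invented, the only motif considered is "two products co-occurring in one trip", the selection step uses a simple lift score as a stand-in for the paper's motif context selection algorithm, and binary indicator vectors stand in for the learned household representations.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: household -> list of trips, each trip a set of products.
trips = {
    "h1": [{"diapers", "milk", "cereal"}, {"diapers", "juice"}],
    "h2": [{"coffee", "bread"}, {"coffee", "wine"}],
    "h3": [{"diapers", "milk"}, {"milk", "cereal"}],
    "h4": [{"wine", "bread"}],
}
# Hypothetical binary attribute, e.g. presence of children (1 = yes).
labels = {"h1": 1, "h2": 0, "h3": 1, "h4": 0}


def motif_instances(household_trips):
    """Enumerate instances of one simple motif: a product pair sharing a trip."""
    found = set()
    for basket in household_trips:
        for pair in combinations(sorted(basket), 2):
            found.add(pair)
    return found


def select_contexts(trips, labels, min_lift=1.5):
    """Keep motif instances over-represented among positive households.

    Lift = P(positive | household has instance) / P(positive).
    This scoring rule is an illustrative assumption, not the paper's algorithm.
    """
    n_pos = sum(labels[h] for h in trips)
    if n_pos == 0:
        return []
    base_rate = n_pos / len(trips)
    total, positive = Counter(), Counter()
    for household, household_trips in trips.items():
        for instance in motif_instances(household_trips):
            total[instance] += 1
            positive[instance] += labels[household]
    return sorted(
        instance
        for instance in total
        if (positive[instance] / total[instance]) / base_rate >= min_lift
    )


def featurize(household_trips, selected):
    """Binary indicator per selected motif instance (stand-in representation)."""
    present = motif_instances(household_trips)
    return [1 if instance in present else 0 for instance in selected]


selected = select_contexts(trips, labels)
features = {h: featurize(t, selected) for h, t in trips.items()}
```

The resulting `features` vectors could then be passed to any standard classifier, such as the SVM mentioned in the abstract. A real system would also need more than one motif type and a selection criterion that stays reliable when each household contributes only a handful of trips.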
About the journal:
The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.