Advancing General Sensor Data Synthesis by Integrating LLMs and Domain-Specific Generative Models
Authors: Xiaomao Zhou; Qingmin Jia; Yujiao Hu
Journal: IEEE Sensors Letters, vol. 8, no. 11, pp. 1-4 (impact factor 2.2; JCR Q3, Engineering, Electrical & Electronic)
Publication type: Journal Article
Published: 2024-09-30
DOI: 10.1109/LSENS.2024.3470748
URL: https://ieeexplore.ieee.org/document/10700677/
Citations: 0
Abstract
Synthetic data has become essential in machine learning and data science, addressing real-world data limitations such as scarcity, privacy, and cost. While existing generative models are effective in synthesizing various sensor data, they struggle with performance and generalization. This letter introduces a large language model (LLM)-driven framework that combines LLMs with domain-specific generative models (DGMs) for general sensor data synthesis. Specifically, the method uses an LLM as the core component to analyze data generation tasks, decompose complex tasks into manageable subtasks, and delegate each subtask to the most suitable DGM, thereby automatically constructing customized data generation pipelines. In addition, integrating reinforcement learning (RL) promises to improve the framework's ability to make optimal use of DGMs, yielding generated data of higher quality and more flexible control. Experimental results demonstrate the effectiveness of LLMs in understanding diverse tasks and in facilitating general sensor data synthesis through collaborative interactions with diverse DGMs.
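
The decompose-and-delegate flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Subtask`, `DGM_REGISTRY`, `plan_subtasks`, and `run_pipeline` are hypothetical, the planner is a keyword matcher standing in for an LLM call, and the two "DGMs" are toy random generators rather than trained models. The RL component is not modeled here.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Subtask:
    modality: str  # which kind of sensor data this subtask asks for
    prompt: str    # description passed on to the chosen generator


# Toy stand-ins for domain-specific generative models (assumption:
# a real DGM would be a trained model, not a random number source).
def toy_imu_dgm(prompt: str, n: int) -> List[float]:
    rng = random.Random(0)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def toy_temperature_dgm(prompt: str, n: int) -> List[float]:
    rng = random.Random(1)
    return [20.0 + rng.uniform(-0.5, 0.5) for _ in range(n)]


# Hypothetical registry mapping a modality name to its generator.
DGM_REGISTRY: Dict[str, Callable[[str, int], List[float]]] = {
    "imu": toy_imu_dgm,
    "temperature": toy_temperature_dgm,
}


def plan_subtasks(task: str) -> List[Subtask]:
    """Stand-in for the LLM planner: split a task into one subtask
    per known modality mentioned in the task text."""
    lowered = task.lower()
    return [
        Subtask(modality=name, prompt=f"{name} data for: {task}")
        for name in DGM_REGISTRY
        if name in lowered
    ]


def run_pipeline(task: str, n: int = 8) -> Dict[str, List[float]]:
    """Delegate each planned subtask to its registered generator."""
    return {
        sub.modality: DGM_REGISTRY[sub.modality](sub.prompt, n)
        for sub in plan_subtasks(task)
    }


if __name__ == "__main__":
    result = run_pipeline("Generate imu and temperature traces for a walking user", n=4)
    for modality, samples in result.items():
        print(modality, samples)
```

In a fuller system, `plan_subtasks` would be replaced by a call to an LLM that returns a structured plan, and the registry would hold wrappers around actual generative models; the dispatch step itself would stay roughly the same shape.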