{"title":"Latent Space Score-based Diffusion Model for Probabilistic Multivariate Time Series Imputation","authors":"Guojun Liang, Najmeh Abiri, Atiye Sadat Hashemi, Jens Lundström, Stefan Byttner, Prayag Tiwari","doi":"arxiv-2409.08917","DOIUrl":null,"url":null,"abstract":"Accurate imputation is essential for the reliability and success of\ndownstream tasks. Recently, diffusion models have attracted great attention in\nthis field. However, these models neglect the latent distribution in a\nlower-dimensional space derived from the observed data, which limits the\ngenerative capacity of the diffusion model. Additionally, dealing with the\noriginal missing data without labels becomes particularly problematic. To\naddress these issues, we propose the Latent Space Score-Based Diffusion Model\n(LSSDM) for probabilistic multivariate time series imputation. Observed values\nare projected onto low-dimensional latent space and coarse values of the\nmissing data are reconstructed without knowing their ground truth values by\nthis unsupervised learning approach. Finally, the reconstructed values are fed\ninto a conditional diffusion model to obtain the precise imputed values of the\ntime series. In this way, LSSDM not only possesses the power to identify the\nlatent distribution but also seamlessly integrates the diffusion model to\nobtain the high-fidelity imputed values and assess the uncertainty of the\ndataset. Experimental results demonstrate that LSSDM achieves superior\nimputation performance while also providing a better explanation and\nuncertainty analysis of the imputation mechanism. The website of the code is\n\\textit{https://github.com/gorgen2020/LSSDM\\_imputation}.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"27 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - STAT - Machine Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.08917","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Accurate imputation is essential for the reliability and success of
downstream tasks. Recently, diffusion models have attracted great attention in
this field. However, these models neglect the latent distribution in a
lower-dimensional space derived from the observed data, which limits the
generative capacity of the diffusion model. Additionally, dealing with the
original missing data without labels becomes particularly problematic. To
address these issues, we propose the Latent Space Score-Based Diffusion Model
(LSSDM) for probabilistic multivariate time series imputation. Observed values
are projected onto low-dimensional latent space and coarse values of the
missing data are reconstructed without knowing their ground truth values by
this unsupervised learning approach. Finally, the reconstructed values are fed
into a conditional diffusion model to obtain the precise imputed values of the
time series. In this way, LSSDM not only possesses the power to identify the
latent distribution but also seamlessly integrates the diffusion model to
obtain the high-fidelity imputed values and assess the uncertainty of the
dataset. Experimental results demonstrate that LSSDM achieves superior
imputation performance while also providing a better explanation and
uncertainty analysis of the imputation mechanism. The website of the code is
\textit{https://github.com/gorgen2020/LSSDM\_imputation}.