{"title":"On Entropic Learning from Noisy Time Series in the Small Data Regime","authors":"Davide Bassetti, Lukáš Pospíšil, Illia Horenko","doi":"10.3390/e26070553","DOIUrl":null,"url":null,"abstract":"In this work, we present a novel methodology for performing the supervised classification of time-ordered noisy data; we call this methodology Entropic Sparse Probabilistic Approximation with Markov regularization (eSPA-Markov). It is an extension of entropic learning methodologies, allowing the simultaneous learning of segmentation patterns, entropy-optimal feature space discretizations, and Bayesian classification rules. We prove the conditions for the existence and uniqueness of the learning problem solution and propose a one-shot numerical learning algorithm that—in the leading order—scales linearly in dimension. We show how this technique can be used for the computationally scalable identification of persistent (metastable) regime affiliations and regime switches from high-dimensional non-stationary and noisy time series, i.e., when the size of the data statistics is small compared to their dimensionality and when the noise variance is larger than the variance in the signal. We demonstrate its performance on a set of toy learning problems, comparing eSPA-Markov to state-of-the-art techniques, including deep learning and random forests. We show how this technique can be used for the analysis of noisy time series from DNA and RNA Nanopore sequencing.","PeriodicalId":11694,"journal":{"name":"Entropy","volume":null,"pages":null},"PeriodicalIF":2.1000,"publicationDate":"2024-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Entropy","FirstCategoryId":"101","ListUrlMain":"https://doi.org/10.3390/e26070553","RegionNum":3,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"PHYSICS, MULTIDISCIPLINARY","Score":null,"Total":0}
引用次数: 0
Abstract
In this work, we present a novel methodology for performing the supervised classification of time-ordered noisy data; we call this methodology Entropic Sparse Probabilistic Approximation with Markov regularization (eSPA-Markov). It is an extension of entropic learning methodologies, allowing the simultaneous learning of segmentation patterns, entropy-optimal feature space discretizations, and Bayesian classification rules. We prove the conditions for the existence and uniqueness of the learning problem solution and propose a one-shot numerical learning algorithm that—in the leading order—scales linearly in dimension. We show how this technique can be used for the computationally scalable identification of persistent (metastable) regime affiliations and regime switches from high-dimensional non-stationary and noisy time series, i.e., when the size of the data statistics is small compared to their dimensionality and when the noise variance is larger than the variance in the signal. We demonstrate its performance on a set of toy learning problems, comparing eSPA-Markov to state-of-the-art techniques, including deep learning and random forests. We show how this technique can be used for the analysis of noisy time series from DNA and RNA Nanopore sequencing.
期刊介绍:
Entropy (ISSN 1099-4300), an international and interdisciplinary journal of entropy and information studies, publishes reviews, regular research papers and short notes. Our aim is to encourage scientists to publish as much as possible their theoretical and experimental details. There is no restriction on the length of the papers. If there are computation and the experiment, the details must be provided so that the results can be reproduced.