首页 > 最新文献

Foundations of data science (Springfield, Mo.)最新文献

英文 中文
The rankability of weighted data from pairwise comparisons 两两比较中加权数据的排名
Q2 MATHEMATICS, APPLIED Pub Date : 2021-01-01 DOI: 10.3934/FODS.2021002
Paul E. Anderson, T. Chartier, A. Langville, Kathryn E. Pedings-Behling
In prior work [ 4 ], Anderson et al. introduced a new problem, the rankability problem, which refers to a dataset's inherent ability to produce a meaningful ranking of its items. Ranking is a fundamental data science task with numerous applications that include web search, data mining, cybersecurity, machine learning, and statistical learning theory. Yet little attention has been paid to the question of whether a dataset is suitable for ranking. As a result, when a ranking method is applied to a dataset with low rankability, the resulting ranking may not be reliable. Rankability paper [ 4 ] and its methods studied unweighted data for which the dominance relations are binary, i.e., an item either dominates or is dominated by another item. In this paper, we extend rankability methods to weighted data for which an item may dominate another by any finite amount. We present combinatorial approaches to a weighted rankability measure and apply our new measure to several weighted datasets.
在之前的工作[4]中,Anderson等人引入了一个新问题,即排名问题,这是指数据集对其项目产生有意义排名的固有能力。排名是一项基础数据科学任务,有许多应用,包括网络搜索、数据挖掘、网络安全、机器学习和统计学习理论。然而,很少有人关注数据集是否适合进行排名的问题。因此,当排名方法应用于排名性较低的数据集时,所得到的排名可能不可靠。排名论文[4]及其方法研究了优势关系为二元的未加权数据,即一个项目占主导地位或被另一个项目占主导地位。在本文中,我们将排名方法扩展到一个项目可以以任意有限的量支配另一个项目的加权数据。我们提出了加权排名度量的组合方法,并将我们的新度量应用于几个加权数据集。
{"title":"The rankability of weighted data from pairwise comparisons","authors":"Paul E. Anderson, T. Chartier, A. Langville, Kathryn E. Pedings-Behling","doi":"10.3934/FODS.2021002","DOIUrl":"https://doi.org/10.3934/FODS.2021002","url":null,"abstract":"In prior work [ 4 ], Anderson et al. introduced a new problem, the rankability problem, which refers to a dataset's inherent ability to produce a meaningful ranking of its items. Ranking is a fundamental data science task with numerous applications that include web search, data mining, cybersecurity, machine learning, and statistical learning theory. Yet little attention has been paid to the question of whether a dataset is suitable for ranking. As a result, when a ranking method is applied to a dataset with low rankability, the resulting ranking may not be reliable. Rankability paper [ 4 ] and its methods studied unweighted data for which the dominance relations are binary, i.e., an item either dominates or is dominated by another item. In this paper, we extend rankability methods to weighted data for which an item may dominate another by any finite amount. We present combinatorial approaches to a weighted rankability measure and apply our new measure to several weighted datasets.","PeriodicalId":73054,"journal":{"name":"Foundations of data science (Springfield, Mo.)","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"70248064","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 10
Learning landmark geodesics using the ensemble Kalman filter 使用集合卡尔曼滤波器学习地标测地线
Q2 MATHEMATICS, APPLIED Pub Date : 2021-01-01 DOI: 10.3934/fods.2021020
Andreas Bock, C. Cotter
We study the problem of diffeomorphometric geodesic landmark matching where the objective is to find a diffeomorphism that, via its group action, maps between two sets of landmarks. It is well-known that the motion of the landmarks, and thereby the diffeomorphism, can be encoded by an initial momentum leading to a formulation where the landmark matching problem can be solved as an optimisation problem over such momenta. The novelty of our work lies in the application of a derivative-free Bayesian inverse method for learning the optimal momentum encoding the diffeomorphic mapping between the template and the target. The method we apply is the ensemble Kalman filter, an extension of the Kalman filter to nonlinear operators. We describe an efficient implementation of the algorithm and show several numerical results for various target shapes.
我们研究了微分同构的测地线地标匹配问题,其目标是找到一个通过群作用在两组地标之间映射的微分同构。众所周知,地标的运动,从而微分同构,可以通过一个初始动量编码,导致一个公式,其中地标匹配问题可以作为一个优化问题解决在这样的动量。我们工作的新颖之处在于应用无导数贝叶斯逆方法来学习模板和目标之间差分映射的最优动量编码。我们采用的方法是集合卡尔曼滤波,这是卡尔曼滤波在非线性算子上的扩展。我们描述了一种有效的算法实现,并给出了几种不同形状目标的数值结果。
{"title":"Learning landmark geodesics using the ensemble Kalman filter","authors":"Andreas Bock, C. Cotter","doi":"10.3934/fods.2021020","DOIUrl":"https://doi.org/10.3934/fods.2021020","url":null,"abstract":"We study the problem of diffeomorphometric geodesic landmark matching where the objective is to find a diffeomorphism that, via its group action, maps between two sets of landmarks. It is well-known that the motion of the landmarks, and thereby the diffeomorphism, can be encoded by an initial momentum leading to a formulation where the landmark matching problem can be solved as an optimisation problem over such momenta. The novelty of our work lies in the application of a derivative-free Bayesian inverse method for learning the optimal momentum encoding the diffeomorphic mapping between the template and the target. The method we apply is the ensemble Kalman filter, an extension of the Kalman filter to nonlinear operators. We describe an efficient implementation of the algorithm and show several numerical results for various target shapes.","PeriodicalId":73054,"journal":{"name":"Foundations of data science (Springfield, Mo.)","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"70248353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
Reconstructing linearly embedded graphs: A first step to stratified space learning 重构线性嵌入图:分层空间学习的第一步
Q2 MATHEMATICS, APPLIED Pub Date : 2021-01-01 DOI: 10.3934/fods.2021026
Yossi Bokor Bleile, Katharine Turner, Christopher Williams
In this paper, we consider the simplest class of stratified spaces – linearly embedded graphs. We present an algorithm that learns the abstract structure of an embedded graph and models the specific embedding from a point cloud sampled from it. We use tools and inspiration from computational geometry, algebraic topology, and topological data analysis and prove the correctness of the identified abstract structure under assumptions on the embedding. The algorithm is implemented in the Julia package Skyler, which we used for the numerical simulations in this paper.
本文考虑了最简单的一类分层空间——线性嵌入图。我们提出了一种算法,该算法学习嵌入图的抽象结构,并从从中采样的点云中对特定嵌入建模。我们利用计算几何、代数拓扑和拓扑数据分析的工具和灵感,证明了在嵌入假设下识别的抽象结构的正确性。该算法在Julia软件包Skyler中实现,本文使用该软件包进行数值模拟。
{"title":"Reconstructing linearly embedded graphs: A first step to stratified space learning","authors":"Yossi Bokor Bleile, Katharine Turner, Christopher Williams","doi":"10.3934/fods.2021026","DOIUrl":"https://doi.org/10.3934/fods.2021026","url":null,"abstract":"In this paper, we consider the simplest class of stratified spaces – linearly embedded graphs. We present an algorithm that learns the abstract structure of an embedded graph and models the specific embedding from a point cloud sampled from it. We use tools and inspiration from computational geometry, algebraic topology, and topological data analysis and prove the correctness of the identified abstract structure under assumptions on the embedding. The algorithm is implemented in the Julia package Skyler, which we used for the numerical simulations in this paper.","PeriodicalId":73054,"journal":{"name":"Foundations of data science (Springfield, Mo.)","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"70248419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
Score matching filters for Gaussian Markov random fields with a linear model of the precision matrix 分数匹配滤波器高斯马尔可夫随机场与精度矩阵的线性模型
Q2 MATHEMATICS, APPLIED Pub Date : 2021-01-01 DOI: 10.3934/fods.2021030
Marie Turčičová, J. Mandel, K. Eben
We present an ensemble filtering method based on a linear model for the precision matrix (the inverse of the covariance) with the parameters determined by Score Matching Estimation. The method provides a rigorous covariance regularization when the underlying random field is Gaussian Markov. The parameters are found by solving a system of linear equations. The analysis step uses the inverse formulation of the Kalman update. Several filter versions, differing in the construction of the analysis ensemble, are proposed, as well as a Score matching version of the Extended Kalman Filter.
我们提出了一种基于精度矩阵(协方差逆)的线性模型的集成滤波方法,其参数由分数匹配估计确定。当底层随机场为高斯马尔可夫时,该方法提供了严格的协方差正则化。这些参数是通过求解一个线性方程组得到的。分析步骤使用卡尔曼更新的逆公式。提出了几种不同于分析集合构造的滤波器版本,以及扩展卡尔曼滤波器的分数匹配版本。
{"title":"Score matching filters for Gaussian Markov random fields with a linear model of the precision matrix","authors":"Marie Turčičová, J. Mandel, K. Eben","doi":"10.3934/fods.2021030","DOIUrl":"https://doi.org/10.3934/fods.2021030","url":null,"abstract":"We present an ensemble filtering method based on a linear model for the precision matrix (the inverse of the covariance) with the parameters determined by Score Matching Estimation. The method provides a rigorous covariance regularization when the underlying random field is Gaussian Markov. The parameters are found by solving a system of linear equations. The analysis step uses the inverse formulation of the Kalman update. Several filter versions, differing in the construction of the analysis ensemble, are proposed, as well as a Score matching version of the Extended Kalman Filter.","PeriodicalId":73054,"journal":{"name":"Foundations of data science (Springfield, Mo.)","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"70248469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
An international initiative of predicting the SARS-CoV-2 pandemic using ensemble data assimilation 利用集合数据同化预测SARS-CoV-2大流行的国际倡议
Q2 MATHEMATICS, APPLIED Pub Date : 2020-12-11 DOI: 10.3934/fods.2021001
G. Evensen, Javier Amezcua, M. Bocquet, A. Carrassi, A. Farchi, A. Fowler, P. Houtekamer, C. Jones, R. Moraes, M. Pulido, C. Sampson, F. Vossepoel
This work demonstrates the efficiency of using iterative ensemble smoothers to estimate the parameters of an SEIR model. We have extended a standard SEIR model with age-classes and compartments of sick, hospitalized, and dead. The data conditioned on are the daily numbers of accumulated deaths and the number of hospitalized. Also, it is possible to condition the model on the number of cases obtained from testing. We start from a wide prior distribution for the model parameters; then, the ensemble conditioning leads to a posterior ensemble of estimated parameters yielding model predictions in close agreement with the observations. The updated ensemble of model simulations has predictive capabilities and include uncertainty estimates. In particular, we estimate the effective reproductive number as a function of time, and we can assess the impact of different intervention measures. By starting from the updated set of model parameters, we can make accurate short-term predictions of the epidemic development assuming knowledge of the future effective reproductive number. Also, the model system allows for the computation of long-term scenarios of the epidemic under different assumptions. We have applied the model system on data sets from several countries, i.e., the four European countries Norway, England, The Netherlands, and France; the province of Quebec in Canada; the South American countries Argentina and Brazil; and the four US states Alabama, North Carolina, California, and New York. These countries and states all have vastly different developments of the epidemic, and we could accurately model the SARS-CoV-2 outbreak in all of them. We realize that more complex models, e.g., with regional compartments, may be desirable, and we suggest that the approach used here should be applicable also for these models.
这项工作证明了使用迭代集成平滑器来估计SEIR模型参数的效率。我们扩展了一个标准的SEIR模型,该模型包含了患病、住院和死亡的年龄等级和隔间。以每日累计死亡人数和住院人数为条件的数据。此外,可以根据从测试中获得的病例数量来调整模型。我们从模型参数的广泛先验分布开始;然后,集合条件导致估计参数的后验集合,从而产生与观测结果非常一致的模型预测。更新后的模型模拟集合具有预测能力,并包括不确定性估计。特别是,我们将有效繁殖数量估计为时间的函数,我们可以评估不同干预措施的影响。通过从更新后的一组模型参数开始,假设知道未来的有效繁殖数,我们可以对疫情发展做出准确的短期预测。此外,该模型系统允许在不同假设下计算疫情的长期情景。我们已经将模型系统应用于几个国家的数据集,即四个欧洲国家挪威、英国、荷兰和法国;加拿大魁北克省;南美洲国家阿根廷和巴西;以及美国四个州阿拉巴马州、北卡罗来纳州、加利福尼亚州和纽约州。这些国家和州的疫情发展都大不相同,我们可以准确地模拟所有国家的严重急性呼吸系统综合征冠状病毒2型疫情。我们意识到,可能需要更复杂的模型,例如具有区域分区的模型,我们建议此处使用的方法也应适用于这些模型。
{"title":"An international initiative of predicting the SARS-CoV-2 pandemic using ensemble data assimilation","authors":"G. Evensen, Javier Amezcua, M. Bocquet, A. Carrassi, A. Farchi, A. Fowler, P. Houtekamer, C. Jones, R. Moraes, M. Pulido, C. Sampson, F. Vossepoel","doi":"10.3934/fods.2021001","DOIUrl":"https://doi.org/10.3934/fods.2021001","url":null,"abstract":"This work demonstrates the efficiency of using iterative ensemble smoothers to estimate the parameters of an SEIR model. We have extended a standard SEIR model with age-classes and compartments of sick, hospitalized, and dead. The data conditioned on are the daily numbers of accumulated deaths and the number of hospitalized. Also, it is possible to condition the model on the number of cases obtained from testing. We start from a wide prior distribution for the model parameters; then, the ensemble conditioning leads to a posterior ensemble of estimated parameters yielding model predictions in close agreement with the observations. The updated ensemble of model simulations has predictive capabilities and include uncertainty estimates. In \u0000particular, we estimate the effective reproductive number as a function of time, and we can assess the impact of different intervention measures. By starting from the updated set of model parameters, we can make accurate short-term predictions of the epidemic development assuming \u0000knowledge of the future effective reproductive number. Also, the model system allows for the computation of long-term scenarios of the epidemic under different assumptions. We have applied the model system on data sets from several countries, i.e., the four European countries Norway, England, The Netherlands, and France; the province of Quebec in Canada; the South American countries Argentina and Brazil; and the four US states Alabama, North Carolina, California, and New York. These countries and states all have vastly different developments of the epidemic, and we could accurately model the SARS-CoV-2 outbreak in all of them. We realize that more complex models, e.g., with regional compartments, may be desirable, and we suggest that the approach used here should be applicable also for these models.","PeriodicalId":73054,"journal":{"name":"Foundations of data science (Springfield, Mo.)","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43519659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 15
A surrogate-based approach to nonlinear, non-Gaussian joint state-parameter data assimilation 一种基于代理的非线性非高斯联合状态参数数据同化方法
Q2 MATHEMATICS, APPLIED Pub Date : 2020-12-08 DOI: 10.3934/fods.2021019
J. Maclean, E. Spiller
Many recent advances in sequential assimilation of data into nonlinear high-dimensional models are modifications to particle filters which employ efficient searches of a high-dimensional state space. In this work, we present a complementary strategy that combines statistical emulators and particle filters. The emulators are used to learn and offer a computationally cheap approximation to the forward dynamic mapping. This emulator-particle filter (Emu-PF) approach requires a modest number of forward-model runs, but yields well-resolved posterior distributions even in non-Gaussian cases. We explore several modifications to the Emu-PF that utilize mechanisms for dimension reduction to efficiently fit the statistical emulator, and present a series of simulation experiments on an atypical Lorenz-96 system to demonstrate their performance. We conclude with a discussion on how the Emu-PF can be paired with modern particle filtering algorithms.
在将数据序贯同化到非线性高维模型方面的许多最新进展是对粒子滤波的改进,粒子滤波利用高维状态空间的有效搜索。在这项工作中,我们提出了一种结合统计模拟器和粒子滤波器的互补策略。仿真器用于学习并提供一个计算成本低廉的前向动态映射近似值。这种仿真粒子滤波(Emu-PF)方法需要少量的前向模型运行,但即使在非高斯情况下也能产生很好的后验分布。我们探索了Emu-PF的几种修改,利用降维机制来有效地适应统计模拟器,并在非典型Lorenz-96系统上进行了一系列仿真实验来证明它们的性能。最后,我们讨论了如何将Emu-PF与现代粒子滤波算法配对。
{"title":"A surrogate-based approach to nonlinear, non-Gaussian joint state-parameter data assimilation","authors":"J. Maclean, E. Spiller","doi":"10.3934/fods.2021019","DOIUrl":"https://doi.org/10.3934/fods.2021019","url":null,"abstract":"Many recent advances in sequential assimilation of data into nonlinear high-dimensional models are modifications to particle filters which employ efficient searches of a high-dimensional state space. In this work, we present a complementary strategy that combines statistical emulators and particle filters. The emulators are used to learn and offer a computationally cheap approximation to the forward dynamic mapping. This emulator-particle filter (Emu-PF) approach requires a modest number of forward-model runs, but yields well-resolved posterior distributions even in non-Gaussian cases. We explore several modifications to the Emu-PF that utilize mechanisms for dimension reduction to efficiently fit the statistical emulator, and present a series of simulation experiments on an atypical Lorenz-96 system to demonstrate their performance. We conclude with a discussion on how the Emu-PF can be paired with modern particle filtering algorithms.","PeriodicalId":73054,"journal":{"name":"Foundations of data science (Springfield, Mo.)","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48331060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
Estimating linear response statistics using orthogonal polynomials: An rkhs formulation 估计线性响应统计使用正交多项式:一个rkhs公式
Q2 MATHEMATICS, APPLIED Pub Date : 2020-12-08 DOI: 10.3934/fods.2020021
He Zhang, J. Harlim, Xiantao Li
We study the problem of estimating linear response statistics under external perturbations using time series of unperturbed dynamics. Based on the fluctuation-dissipation theory, this problem is reformulated as an unsupervised learning task of estimating a density function. We consider a nonparametric density estimator formulated by the kernel embedding of distributions with "Mercer-type" kernels, constructed based on the classical orthogonal polynomials defined on non-compact domains. While the resulting representation is analogous to Polynomial Chaos Expansion (PCE), the connection to the reproducing kernel Hilbert space (RKHS) theory allows one to establish the uniform convergence of the estimator and to systematically address a practical question of identifying the PCE basis for a consistent estimation. We also provide practical conditions for the well-posedness of not only the estimator but also of the underlying response statistics. Finally, we provide a statistical error bound for the density estimation that accounts for the Monte-Carlo averaging over non-i.i.d time series and the biases due to a finite basis truncation. This error bound provides a means to understand the feasibility as well as limitation of the kernel embedding with Mercer-type kernels. Numerically, we verify the effectiveness of the estimator on two stochastic dynamics with known, yet, non-trivial equilibrium densities.
研究了利用无扰动动力学时间序列估计外部扰动下线性响应统计量的问题。基于涨落耗散理论,将该问题重新表述为一个估计密度函数的无监督学习任务。我们考虑了一个非参数密度估计量,它是基于定义在非紧域上的经典正交多项式,由具有“mercer型”核的分布的核嵌入来表示的。虽然结果表示类似于多项式混沌展开(PCE),但与再现核希尔伯特空间(RKHS)理论的联系允许人们建立估计量的一致收敛性,并系统地解决识别一致估计的PCE基础的实际问题。我们还为估计量和底层响应统计量的适定性提供了实际条件。最后,我们为密度估计提供了一个统计误差界,它解释了非i -i上的蒙特卡罗平均。D时间序列和有限基截断引起的偏差。这个错误界为理解用mercer型核嵌入核的可行性和局限性提供了一种方法。在数值上,我们验证了估计器在两个随机动力学上的有效性,这些随机动力学具有已知的非平凡平衡密度。
{"title":"Estimating linear response statistics using orthogonal polynomials: An rkhs formulation","authors":"He Zhang, J. Harlim, Xiantao Li","doi":"10.3934/fods.2020021","DOIUrl":"https://doi.org/10.3934/fods.2020021","url":null,"abstract":"We study the problem of estimating linear response statistics under external perturbations using time series of unperturbed dynamics. Based on the fluctuation-dissipation theory, this problem is reformulated as an unsupervised learning task of estimating a density function. We consider a nonparametric density estimator formulated by the kernel embedding of distributions with \"Mercer-type\" kernels, constructed based on the classical orthogonal polynomials defined on non-compact domains. While the resulting representation is analogous to Polynomial Chaos Expansion (PCE), the connection to the reproducing kernel Hilbert space (RKHS) theory allows one to establish the uniform convergence of the estimator and to systematically address a practical question of identifying the PCE basis for a consistent estimation. We also provide practical conditions for the well-posedness of not only the estimator but also of the underlying response statistics. Finally, we provide a statistical error bound for the density estimation that accounts for the Monte-Carlo averaging over non-i.i.d time series and the biases due to a finite basis truncation. This error bound provides a means to understand the feasibility as well as limitation of the kernel embedding with Mercer-type kernels. Numerically, we verify the effectiveness of the estimator on two stochastic dynamics with known, yet, non-trivial equilibrium densities.","PeriodicalId":73054,"journal":{"name":"Foundations of data science (Springfield, Mo.)","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48255540","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 8
ANAPT: Additive noise analysis for persistence thresholding 持久性阈值的加性噪声分析
Q2 MATHEMATICS, APPLIED Pub Date : 2020-12-07 DOI: 10.3934/fods.2022005
Audun D. Myers, Firas A. Khasawneh, Brittany Terese Fasy
We introduce a novel method for Additive Noise Analysis for Persistence Thresholding (ANAPT) which separates significant features in the sublevel set persistence diagram of a time series based on a statistics analysis of the persistence of a noise distribution. Specifically, we consider an additive noise model and leverage the statistical analysis to provide a noise cutoff or confidence interval in the persistence diagram for the observed time series. This analysis is done for several common noise models including Gaussian, uniform, exponential, and Rayleigh distributions. ANAPT is computationally efficient, does not require any signal pre-filtering, is widely applicable, and has open-source software available. We demonstrate the functionality of ANAPT with both numerically simulated examples and an experimental data set. Additionally, we provide an efficient begin{document}$ Theta(nlog(n)) $end{document} algorithm for calculating the zero-dimensional sublevel set persistence homology.
We introduce a novel method for Additive Noise Analysis for Persistence Thresholding (ANAPT) which separates significant features in the sublevel set persistence diagram of a time series based on a statistics analysis of the persistence of a noise distribution. Specifically, we consider an additive noise model and leverage the statistical analysis to provide a noise cutoff or confidence interval in the persistence diagram for the observed time series. This analysis is done for several common noise models including Gaussian, uniform, exponential, and Rayleigh distributions. ANAPT is computationally efficient, does not require any signal pre-filtering, is widely applicable, and has open-source software available. We demonstrate the functionality of ANAPT with both numerically simulated examples and an experimental data set. Additionally, we provide an efficient begin{document}$ Theta(nlog(n)) $end{document} algorithm for calculating the zero-dimensional sublevel set persistence homology.
{"title":"ANAPT: Additive noise analysis for persistence thresholding","authors":"Audun D. Myers, Firas A. Khasawneh, Brittany Terese Fasy","doi":"10.3934/fods.2022005","DOIUrl":"https://doi.org/10.3934/fods.2022005","url":null,"abstract":"We introduce a novel method for Additive Noise Analysis for Persistence Thresholding (ANAPT) which separates significant features in the sublevel set persistence diagram of a time series based on a statistics analysis of the persistence of a noise distribution. Specifically, we consider an additive noise model and leverage the statistical analysis to provide a noise cutoff or confidence interval in the persistence diagram for the observed time series. This analysis is done for several common noise models including Gaussian, uniform, exponential, and Rayleigh distributions. ANAPT is computationally efficient, does not require any signal pre-filtering, is widely applicable, and has open-source software available. We demonstrate the functionality of ANAPT with both numerically simulated examples and an experimental data set. Additionally, we provide an efficient begin{document}$ Theta(nlog(n)) $end{document} algorithm for calculating the zero-dimensional sublevel set persistence homology.","PeriodicalId":73054,"journal":{"name":"Foundations of data science (Springfield, Mo.)","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44284181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
Mean field limit of Ensemble Square Root filters - discrete and continuous time 集合平方根滤波器的平均场极限-离散和连续时间
Q2 MATHEMATICS, APPLIED Pub Date : 2020-11-20 DOI: 10.3934/FODS.2021003
Theresa Lange, W. Stannat
Consider the class of Ensemble Square Root filtering algorithms for the numerical approximation of the posterior distribution of nonlinear Markovian signals partially observed with linear observations corrupted with independent measurement noise. We analyze the asymptotic behavior of these algorithms in the large ensemble limit both in discrete and continuous time. We identify limiting mean-field processes on the level of the ensemble members, prove corresponding propagation of chaos results and derive associated convergence rates in terms of the ensemble size. In continuous time we also identify the stochastic partial differential equation driving the distribution of the mean-field process and perform a comparison with the Kushner-Stratonovich equation.
考虑一类集成平方根滤波算法,用于非线性马尔可夫信号部分观测到的后验分布的数值逼近,线性观测被独立测量噪声破坏。我们分析了这些算法在离散时间和连续时间的大集合极限下的渐近行为。我们在集合成员的水平上确定了极限平均场过程,证明了混沌结果的相应传播,并根据集合大小推导了相关的收敛速率。在连续时间条件下,我们还确定了驱动平均场过程分布的随机偏微分方程,并与Kushner-Stratonovich方程进行了比较。
{"title":"Mean field limit of Ensemble Square Root filters - discrete and continuous time","authors":"Theresa Lange, W. Stannat","doi":"10.3934/FODS.2021003","DOIUrl":"https://doi.org/10.3934/FODS.2021003","url":null,"abstract":"Consider the class of Ensemble Square Root filtering algorithms for the numerical approximation of the posterior distribution of nonlinear Markovian signals partially observed with linear observations corrupted with independent measurement noise. We analyze the asymptotic behavior of these algorithms in the large ensemble limit both in discrete and continuous time. We identify limiting mean-field processes on the level of the ensemble members, prove corresponding propagation of chaos results and derive associated convergence rates in terms of the ensemble size. In continuous time we also identify the stochastic partial differential equation driving the distribution of the mean-field process and perform a comparison with the Kushner-Stratonovich equation.","PeriodicalId":73054,"journal":{"name":"Foundations of data science (Springfield, Mo.)","volume":"50 14","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41267351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 15
Feedback particle filter for collective inference 用于集体推理的反馈粒子滤波器
Q2 MATHEMATICS, APPLIED Pub Date : 2020-10-13 DOI: 10.3934/fods.2021018
Jin W. Kim, P. Mehta

The purpose of this paper is to describe the feedback particle filter algorithm for problems where there are a large number (begin{document}$ M $end{document}) of non-interacting agents (targets) with a large number (begin{document}$ M $end{document}) of non-agent specific observations (measurements) that originate from these agents. In its basic form, the problem is characterized by data association uncertainty whereby the association between the observations and agents must be deduced in addition to the agent state. In this paper, the large-begin{document}$ M $end{document} limit is interpreted as a problem of collective inference. This viewpoint is used to derive the equation for the empirical distribution of the hidden agent states. A feedback particle filter (FPF) algorithm for this problem is presented and illustrated via numerical simulations. Results are presented for the Euclidean and the finite state-space cases, both in continuous-time settings. The classical FPF algorithm is shown to be the special case (with begin{document}$ M = 1 $end{document}) of these more general results. The simulations help show that the algorithm well approximates the empirical distribution of the hidden states for large begin{document}$ M $end{document}.

The purpose of this paper is to describe the feedback particle filter algorithm for problems where there are a large number (begin{document}$ M $end{document}) of non-interacting agents (targets) with a large number (begin{document}$ M $end{document}) of non-agent specific observations (measurements) that originate from these agents. In its basic form, the problem is characterized by data association uncertainty whereby the association between the observations and agents must be deduced in addition to the agent state. In this paper, the large-begin{document}$ M $end{document} limit is interpreted as a problem of collective inference. This viewpoint is used to derive the equation for the empirical distribution of the hidden agent states. A feedback particle filter (FPF) algorithm for this problem is presented and illustrated via numerical simulations. Results are presented for the Euclidean and the finite state-space cases, both in continuous-time settings. The classical FPF algorithm is shown to be the special case (with begin{document}$ M = 1 $end{document}) of these more general results. The simulations help show that the algorithm well approximates the empirical distribution of the hidden states for large begin{document}$ M $end{document}.
{"title":"Feedback particle filter for collective inference","authors":"Jin W. Kim, P. Mehta","doi":"10.3934/fods.2021018","DOIUrl":"https://doi.org/10.3934/fods.2021018","url":null,"abstract":"<p style='text-indent:20px;'>The purpose of this paper is to describe the feedback particle filter algorithm for problems where there are a large number (<inline-formula><tex-math id=\"M1\">begin{document}$ M $end{document}</tex-math></inline-formula>) of non-interacting agents (targets) with a large number (<inline-formula><tex-math id=\"M2\">begin{document}$ M $end{document}</tex-math></inline-formula>) of non-agent specific observations (measurements) that originate from these agents. In its basic form, the problem is characterized by data association uncertainty whereby the association between the observations and agents must be deduced in addition to the agent state. In this paper, the large-<inline-formula><tex-math id=\"M3\">begin{document}$ M $end{document}</tex-math></inline-formula> limit is interpreted as a problem of collective inference. This viewpoint is used to derive the equation for the empirical distribution of the hidden agent states. A feedback particle filter (FPF) algorithm for this problem is presented and illustrated via numerical simulations. Results are presented for the Euclidean and the finite state-space cases, both in continuous-time settings. The classical FPF algorithm is shown to be the special case (with <inline-formula><tex-math id=\"M4\">begin{document}$ M = 1 $end{document}</tex-math></inline-formula>) of these more general results. The simulations help show that the algorithm well approximates the empirical distribution of the hidden states for large <inline-formula><tex-math id=\"M5\">begin{document}$ M $end{document}</tex-math></inline-formula>.</p>","PeriodicalId":73054,"journal":{"name":"Foundations of data science (Springfield, Mo.)","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46470057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
期刊
Foundations of data science (Springfield, Mo.)
全部 Acc. Chem. Res. ACS Applied Bio Materials ACS Appl. Electron. Mater. ACS Appl. Energy Mater. ACS Appl. Mater. Interfaces ACS Appl. Nano Mater. ACS Appl. Polym. Mater. ACS BIOMATER-SCI ENG ACS Catal. ACS Cent. Sci. ACS Chem. Biol. ACS Chemical Health & Safety ACS Chem. Neurosci. ACS Comb. Sci. ACS Earth Space Chem. ACS Energy Lett. ACS Infect. Dis. ACS Macro Lett. ACS Mater. Lett. ACS Med. Chem. Lett. ACS Nano ACS Omega ACS Photonics ACS Sens. ACS Sustainable Chem. Eng. ACS Synth. Biol. Anal. Chem. BIOCHEMISTRY-US Bioconjugate Chem. BIOMACROMOLECULES Chem. Res. Toxicol. Chem. Rev. Chem. Mater. CRYST GROWTH DES ENERG FUEL Environ. Sci. Technol. Environ. Sci. Technol. Lett. Eur. J. Inorg. Chem. IND ENG CHEM RES Inorg. Chem. J. Agric. Food. Chem. J. Chem. Eng. Data J. Chem. Educ. J. Chem. Inf. Model. J. Chem. Theory Comput. J. Med. Chem. J. Nat. Prod. J PROTEOME RES J. Am. Chem. Soc. LANGMUIR MACROMOLECULES Mol. Pharmaceutics Nano Lett. Org. Lett. ORG PROCESS RES DEV ORGANOMETALLICS J. Org. Chem. J. Phys. Chem. J. Phys. Chem. A J. Phys. Chem. B J. Phys. Chem. C J. Phys. Chem. Lett. Analyst Anal. Methods Biomater. Sci. Catal. Sci. Technol. Chem. Commun. Chem. Soc. Rev. CHEM EDUC RES PRACT CRYSTENGCOMM Dalton Trans. Energy Environ. Sci. ENVIRON SCI-NANO ENVIRON SCI-PROC IMP ENVIRON SCI-WAT RES Faraday Discuss. Food Funct. Green Chem. Inorg. Chem. Front. Integr. Biol. J. Anal. At. Spectrom. J. Mater. Chem. A J. Mater. Chem. B J. Mater. Chem. C Lab Chip Mater. Chem. Front. Mater. Horiz. MEDCHEMCOMM Metallomics Mol. Biosyst. Mol. Syst. Des. Eng. Nanoscale Nanoscale Horiz. Nat. Prod. Rep. New J. Chem. Org. Biomol. Chem. Org. Chem. Front. PHOTOCH PHOTOBIO SCI PCCP Polym. Chem.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1