
Foundations of data science (Springfield, Mo.): latest articles

A surrogate-based approach to nonlinear, non-Gaussian joint state-parameter data assimilation
Q2 MATHEMATICS, APPLIED Pub Date : 2020-12-08 DOI: 10.3934/fods.2021019
J. Maclean, E. Spiller
Many recent advances in sequential assimilation of data into nonlinear high-dimensional models are modifications to particle filters which employ efficient searches of a high-dimensional state space. In this work, we present a complementary strategy that combines statistical emulators and particle filters. The emulators are used to learn and offer a computationally cheap approximation to the forward dynamic mapping. This emulator-particle filter (Emu-PF) approach requires a modest number of forward-model runs, but yields well-resolved posterior distributions even in non-Gaussian cases. We explore several modifications to the Emu-PF that utilize mechanisms for dimension reduction to efficiently fit the statistical emulator, and present a series of simulation experiments on an atypical Lorenz-96 system to demonstrate their performance. We conclude with a discussion on how the Emu-PF can be paired with modern particle filtering algorithms.
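As a rough illustration of where a cheap surrogate could slot into a particle filter, the sketch below runs one step of a plain bootstrap particle filter on a scalar state. It is not the authors' Emu-PF: the function name `bootstrap_pf_step`, the placeholder `forward` map standing in for an emulator, and the Gaussian observation model are all assumptions made for this example.

```python
import math
import random


def bootstrap_pf_step(particles, forward, y, obs_std, rng):
    """One bootstrap particle filter step for a scalar state (illustrative only)."""
    # 1. Forecast: push every particle through the forward map.
    #    `forward` is a hypothetical stand-in for a cheap emulator of the model.
    forecast = [forward(x) for x in particles]

    # 2. Weight: Gaussian likelihood of the observation y given each particle,
    #    computed in log space and shifted by the maximum for numerical stability.
    log_w = [-0.5 * ((y - x) / obs_std) ** 2 for x in forecast]
    shift = max(log_w)
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    w = [wi / total for wi in w]

    # 3. Resample: multinomial resampling proportional to the weights.
    resampled = rng.choices(forecast, weights=w, k=len(forecast))
    return resampled, w


if __name__ == "__main__":
    rng = random.Random(0)
    prior = [rng.gauss(0.0, 1.0) for _ in range(500)]
    posterior, weights = bootstrap_pf_step(prior, lambda x: 0.9 * x, 1.0, 0.5, rng)
    print(sum(posterior) / len(posterior))
```

In a setup like this, every call to `forward` would be an emulator evaluation rather than a full model run, which is the kind of saving the abstract alludes to.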
Citations: 1
Estimating linear response statistics using orthogonal polynomials: An RKHS formulation
Q2 MATHEMATICS, APPLIED Pub Date : 2020-12-08 DOI: 10.3934/fods.2020021
He Zhang, J. Harlim, Xiantao Li
We study the problem of estimating linear response statistics under external perturbations using time series of unperturbed dynamics. Based on the fluctuation-dissipation theory, this problem is reformulated as an unsupervised learning task of estimating a density function. We consider a nonparametric density estimator formulated by the kernel embedding of distributions with "Mercer-type" kernels, constructed based on the classical orthogonal polynomials defined on non-compact domains. While the resulting representation is analogous to Polynomial Chaos Expansion (PCE), the connection to the reproducing kernel Hilbert space (RKHS) theory allows one to establish the uniform convergence of the estimator and to systematically address a practical question of identifying the PCE basis for a consistent estimation. We also provide practical conditions for the well-posedness of not only the estimator but also of the underlying response statistics. Finally, we provide a statistical error bound for the density estimation that accounts for the Monte-Carlo averaging over non-i.i.d. time series and the biases due to a finite basis truncation. This error bound provides a means to understand the feasibility as well as the limitations of the kernel embedding with Mercer-type kernels. Numerically, we verify the effectiveness of the estimator on two stochastic dynamics with known, yet non-trivial, equilibrium densities.
Citations: 8
ANAPT: Additive noise analysis for persistence thresholding
Q2 MATHEMATICS, APPLIED Pub Date : 2020-12-07 DOI: 10.3934/fods.2022005
Audun D. Myers, Firas A. Khasawneh, Brittany Terese Fasy
We introduce a novel method for Additive Noise Analysis for Persistence Thresholding (ANAPT) which separates significant features in the sublevel set persistence diagram of a time series based on a statistical analysis of the persistence of a noise distribution. Specifically, we consider an additive noise model and leverage the statistical analysis to provide a noise cutoff or confidence interval in the persistence diagram for the observed time series. This analysis is done for several common noise models including Gaussian, uniform, exponential, and Rayleigh distributions. ANAPT is computationally efficient, does not require any signal pre-filtering, is widely applicable, and has open-source software available. We demonstrate the functionality of ANAPT with both numerically simulated examples and an experimental data set. Additionally, we provide an efficient $\Theta(n\log(n))$ algorithm for calculating the zero-dimensional sublevel set persistent homology.
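The zero-dimensional sublevel set persistence of a time series can be computed with a sort followed by union-find merges. The sketch below is a generic textbook-style version, not the ANAPT implementation; the function name `sublevel_persistence_0d` and the choice to drop zero-persistence pairs are assumptions of this example. The sort dominates the cost, so the whole routine runs in O(n log n) time.

```python
import math


def sublevel_persistence_0d(values):
    """Return (birth, death) pairs of 0-dim sublevel set persistence for a 1-D series."""
    n = len(values)
    # Sweep the samples from the lowest value to the highest.
    order = sorted(range(n), key=lambda i: values[i])
    parent = [-1] * n  # -1 marks a sample that is not yet in the sublevel set

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    for i in order:
        parent[i] = i  # a new component is born at values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] != -1:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component with the higher minimum dies at values[i].
                young, old = (ri, rj) if values[ri] > values[rj] else (rj, ri)
                if values[i] > values[young]:
                    pairs.append((values[young], values[i]))
                # Each root stays the index of its component's minimum.
                parent[young] = old
    if n:
        # The component of the global minimum never dies.
        pairs.append((values[order[0]], math.inf))
    return pairs


if __name__ == "__main__":
    print(sublevel_persistence_0d([0, 4, 2, 6, 1]))
```

A thresholding method in the spirit of the abstract would then keep only the pairs whose lifetime (death minus birth) exceeds a noise-dependent cutoff; how that cutoff is derived is the subject of the paper and is not reproduced here.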
Citations: 1
Mean field limit of Ensemble Square Root filters - discrete and continuous time
Q2 MATHEMATICS, APPLIED Pub Date : 2020-11-20 DOI: 10.3934/FODS.2021003
Theresa Lange, W. Stannat
Consider the class of Ensemble Square Root filtering algorithms for the numerical approximation of the posterior distribution of nonlinear Markovian signals partially observed with linear observations corrupted with independent measurement noise. We analyze the asymptotic behavior of these algorithms in the large ensemble limit both in discrete and continuous time. We identify limiting mean-field processes on the level of the ensemble members, prove corresponding propagation of chaos results and derive associated convergence rates in terms of the ensemble size. In continuous time we also identify the stochastic partial differential equation driving the distribution of the mean-field process and perform a comparison with the Kushner-Stratonovich equation.
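For a scalar state that is observed directly, a deterministic ensemble square root update fits in a few lines: the ensemble mean moves by the Kalman gain times the innovation, and the anomalies are rescaled so that the sample variance matches the Kalman posterior variance. This is a minimal sketch under those assumptions (observation operator equal to 1, hypothetical function name `esrf_update_scalar`), not the general class of filters analysed in the paper.

```python
from statistics import mean, variance


def esrf_update_scalar(ensemble, y, obs_var):
    """Deterministic square root analysis step for a scalar state with H = 1."""
    m = mean(ensemble)
    p = variance(ensemble)        # sample variance of the forecast ensemble
    k = p / (p + obs_var)         # scalar Kalman gain
    m_post = m + k * (y - m)      # Kalman update of the ensemble mean
    scale = (1.0 - k) ** 0.5      # shrink anomalies so the variance becomes (1 - k) * p
    return [m_post + scale * (x - m) for x in ensemble]


if __name__ == "__main__":
    analysis = esrf_update_scalar([0.0, 1.0, 2.0, 3.0, 4.0], y=4.0, obs_var=2.5)
    print(mean(analysis), variance(analysis))
```

Unlike a stochastic ensemble Kalman filter, no perturbed observations are drawn, so the update is a deterministic function of the forecast ensemble and the observation.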
Citations: 15
Feedback particle filter for collective inference
Q2 MATHEMATICS, APPLIED Pub Date : 2020-10-13 DOI: 10.3934/fods.2021018
Jin W. Kim, P. Mehta

The purpose of this paper is to describe the feedback particle filter algorithm for problems where there are a large number ($M$) of non-interacting agents (targets) with a large number ($M$) of non-agent specific observations (measurements) that originate from these agents. In its basic form, the problem is characterized by data association uncertainty whereby the association between the observations and agents must be deduced in addition to the agent state. In this paper, the large-$M$ limit is interpreted as a problem of collective inference. This viewpoint is used to derive the equation for the empirical distribution of the hidden agent states. A feedback particle filter (FPF) algorithm for this problem is presented and illustrated via numerical simulations. Results are presented for the Euclidean and the finite state-space cases, both in continuous-time settings. The classical FPF algorithm is shown to be the special case (with $M = 1$) of these more general results. The simulations help show that the algorithm well approximates the empirical distribution of the hidden states for large $M$.

Citations: 4
Multiple hypothesis testing with persistent homology
Q2 MATHEMATICS, APPLIED Pub Date : 2020-10-10 DOI: 10.3934/fods.2022018
Mikael Vejdemo-Johansson, Sayan Mukherjee
Multiple hypothesis testing requires a control procedure: the error probabilities in statistical testing compound when several tests are performed for the same conclusion. A common multiple-testing error rate is the Family-Wise Error Rate (FWER), which measures the probability that any one of the performed tests rejects its null hypothesis erroneously. It is often controlled using Bonferroni’s method or later, more sophisticated approaches, all of which involve replacing the test level α with α/k, reducing it by a factor of the number of simultaneous tests performed. Common paradigms for hypothesis testing in persistent homology are often based on permutation testing; however, increasing the number of permutations to meet a Bonferroni-style threshold can be prohibitively expensive. In this paper we propose a null-model-based approach to testing for acyclicity (i.e., trivial homology), coupled with an FWER control method that does not suffer from these computational costs.
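To see how quickly the error probabilities compound, the short sketch below computes the family-wise error rate for k tests under the simplifying assumption that the tests are independent, and compares it with the rate obtained after a Bonferroni adjustment. The function names are hypothetical and the independence assumption is made only for this illustration; it is not taken from the paper.

```python
def familywise_error_rate(alpha, k):
    """Probability of at least one false rejection among k independent level-alpha tests."""
    return 1.0 - (1.0 - alpha) ** k


def bonferroni_level(alpha, k):
    """Per-test level that keeps the family-wise error rate at or below alpha."""
    return alpha / k


if __name__ == "__main__":
    alpha, k = 0.05, 10
    print(familywise_error_rate(alpha, k))                       # uncorrected
    print(familywise_error_rate(bonferroni_level(alpha, k), k))  # Bonferroni-corrected
```

With ten tests at level 0.05 the uncorrected rate is roughly 0.40, while the corrected per-test level of 0.005 brings it back below 0.05. A permutation test can only resolve p-values as fine as its number of permutations allows, which is why the smaller per-test level drives up the computational cost.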
Citations: 3
Wave-shape oscillatory model for nonstationary periodic time series analysis
Q2 MATHEMATICS, APPLIED Pub Date : 2020-07-13 DOI: 10.3934/FODS.2021009
Yu-Ting Lin, John Malik, Hau‐Tieng Wu
The oscillations observed in many time series, particularly in biomedicine, exhibit morphological variations over time. These morphological variations are caused by intrinsic or extrinsic changes to the state of the generating system, henceforth referred to as dynamics. To model these time series (including and specifically pathophysiological ones) and estimate the underlying dynamics, we provide a novel wave-shape oscillatory model. In this model, time-dependent variations in cycle shape occur along a manifold called the wave-shape manifold. To estimate the wave-shape manifold associated with an oscillatory time series, study the dynamics, and visualize the time-dependent changes along the wave-shape manifold, we apply the well-established diffusion maps (DM) algorithm to the set of all observed oscillations. We provide a theoretical guarantee on the dynamical information recovered by the DM algorithm under the proposed model. Applying the proposed model and algorithm to arterial blood pressure (ABP) signals recorded during general anesthesia leads to the extraction of nociception information. Applying the wave-shape oscillatory model and the DM algorithm to cardiac cycles in the electrocardiogram (ECG) leads to ectopy detection and a new ECG-derived respiratory signal, even when the subject has atrial fibrillation.
Citations: 17
The (homological) persistence of gerrymandering
Q2 MATHEMATICS, APPLIED Pub Date : 2020-07-05 DOI: 10.3934/FODS.2021007
M. Duchin, Tom Needham, Thomas Weighill
We apply persistent homology, the dominant tool from the field of topological data analysis, to study electoral redistricting. Our method combines the geographic information from a political districting plan with election data to produce a persistence diagram. We are then able to visualize and analyze large ensembles of computer-generated districting plans of the type commonly used in modern redistricting research (and court challenges). We set out three applications: zoning a state at each scale of districting, comparing elections, and seeking signals of gerrymandering. Our case studies focus on redistricting in Pennsylvania and North Carolina, two states whose legal challenges to enacted plans have raised considerable public interest in the last few years. To address the question of robustness of the persistence diagrams to perturbations in vote data and in district boundaries, we translate the classical stability theorem of Cohen-Steiner et al. into our setting and find that it can be phrased in a manner that is easy to interpret. We accompany the theoretical bound with an empirical demonstration to illustrate diagram stability in practice.
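For reference, the classical stability theorem is usually stated along the following lines. This is the standard form, given here as background rather than the version adapted in the paper: for suitably tame functions $f$ and $g$ on the same space, the bottleneck distance $d_B$ between their persistence diagrams is bounded by the sup-norm distance between the functions.

$$ d_B\bigl(\mathrm{Dgm}(f), \mathrm{Dgm}(g)\bigr) \le \|f - g\|_\infty $$

Loosely read, a small change in the input function can move the points of the diagram only by a correspondingly small amount.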
Citations: 9
Posterior contraction rates for non-parametric state and drift estimation
Q2 MATHEMATICS, APPLIED Pub Date : 2020-03-20 DOI: 10.3934/fods.2020016
S. Reich, P. Rozdeba
We consider a combined state and drift estimation problem for the linear stochastic heat equation. The infinite-dimensional Bayesian inference problem is formulated in terms of the Kalman-Bucy filter over an extended state space, and its long-time asymptotic properties are studied. Asymptotic posterior contraction rates in the unknown drift function are the main contribution of this paper. Such rates have been studied before for stationary non-parametric Bayesian inverse problems, and here we demonstrate the consistency of our time-dependent formulation with these previous results building upon scale separation and a slow manifold approximation.
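As background, the Kalman-Bucy filter in its textbook finite-dimensional form is sketched below. The symbols $A$, $H$, $Q$, $R$, $m_t$ and $P_t$ are notation assumed for this note, and the paper itself works with an infinite-dimensional extended state space, so this is only an orientation aid. For a linear signal and observation model

$$ dX_t = A X_t\,dt + Q^{1/2}\,dW_t, \qquad dY_t = H X_t\,dt + R^{1/2}\,dV_t, $$

the posterior mean $m_t$ and covariance $P_t$ evolve as

$$ dm_t = A m_t\,dt + P_t H^\top R^{-1}\bigl(dY_t - H m_t\,dt\bigr), \qquad \frac{dP_t}{dt} = A P_t + P_t A^\top + Q - P_t H^\top R^{-1} H P_t. $$

Posterior contraction then concerns how fast the part of $P_t$ associated with the unknown drift shrinks as $t$ grows.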
Citations: 4
Probabilistic learning on manifolds
Q2 MATHEMATICS, APPLIED Pub Date : 2020-02-28 DOI: 10.3934/fods.2020013
Christian Soize, R. Ghanem
This paper presents mathematical results in support of the probabilistic learning on manifolds (PLoM) methodology recently introduced by the authors, which has been used successfully to analyze complex engineering systems. The PLoM considers a given initial dataset consisting of a small number of points in a Euclidean space, which are interpreted as independent realizations of a vector-valued random variable whose non-Gaussian probability measure is unknown but is, a priori, concentrated in an unknown subset of the Euclidean space. The objective is to construct a learned dataset consisting of additional realizations that allow the evaluation of converged statistics. The probability measure estimated from the initial dataset is transported through a linear transformation constructed using a reduced-order diffusion-maps basis. In this paper, it is proven that this transported measure is a marginal distribution of the invariant measure of a reduced-order Ito stochastic differential equation that corresponds to a dissipative Hamiltonian dynamical system. This construction preserves the concentration of the probability measure. This property is shown by analyzing a distance between the random matrix constructed with the PLoM and the matrix representing the initial dataset, as a function of the dimension of the basis. It is further proven that this distance has a minimum for a dimension of the reduced-order diffusion-maps basis that is strictly smaller than the number of points in the initial dataset. Finally, a brief numerical application illustrates the mathematical results.
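To make the "reduced-order diffusion-maps basis" in the abstract more concrete, here is a minimal sketch of how such a basis is commonly computed from a small dataset. It is not the authors' PLoM implementation: the Gaussian kernel, the bandwidth `eps`, the basis dimension `m`, and the function name are assumptions made for illustration.

```python
import numpy as np

def diffusion_maps_basis(X, eps, m):
    """Leading m eigenpairs of the row-normalized Gaussian-kernel matrix.

    X   : (N, n) array whose rows are the N data points
    eps : kernel bandwidth (hypothetical smoothing parameter)
    m   : number of basis vectors to keep, with m <= N
    """
    # Pairwise squared Euclidean distances between the data points.
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against round-off
    K = np.exp(-sq_dists / (4.0 * eps))
    d = K.sum(axis=1)
    # S = D^{-1/2} K D^{-1/2} is symmetric and shares its eigenvalues
    # with the transition matrix P = D^{-1} K.
    S = K / np.sqrt(np.outer(d, d))
    evals, evecs = np.linalg.eigh(S)
    order = np.argsort(evals)[::-1][:m]
    lam = evals[order]
    # Map eigenvectors of S back to right eigenvectors of P.
    psi = evecs[:, order] / np.sqrt(d)[:, None]
    return lam, psi
```

The leading eigenvalue equals 1 and its eigenvector is constant, which gives a quick sanity check; the abstract's result concerns how the choice of `m`, taken strictly smaller than the number of points, affects the distance between the learned and initial datasets.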
Citations: 16