
arXiv - STAT - Machine Learning: Latest Papers

PieClam: A Universal Graph Autoencoder Based on Overlapping Inclusive and Exclusive Communities
Pub Date : 2024-09-18 DOI: arxiv-2409.11618
Daniel Zilberg, Ron Levie
We propose PieClam (Prior Inclusive Exclusive Cluster Affiliation Model): a probabilistic graph model for representing any graph as overlapping generalized communities. Our method can be interpreted as a graph autoencoder: nodes are embedded into a code space by an algorithm that maximizes the log-likelihood of the decoded graph, given the input graph. PieClam is a community affiliation model that extends well-known methods like BigClam in two main ways. First, instead of the decoder being defined only via pairwise interactions between the nodes in the code space, we also incorporate a learned prior on the distribution of nodes in the code space, turning our method into a graph generative model. Second, we generalize the notion of communities by allowing not only sets of nodes with strong connectivity, which we call inclusive communities, but also sets of nodes with strong disconnection, which we call exclusive communities. To model both types of communities, we propose a new type of decoder based on the Lorentz inner product, which we prove to be much more expressive than standard decoders based on standard inner products or norm distances. By introducing a new graph similarity measure, which we call the log cut distance, we show that PieClam is a universal autoencoder, able to uniformly approximately reconstruct any graph. Our method is shown to obtain competitive performance in graph anomaly detection benchmarks.
Citations: 0
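To make the contrast in the PieClam abstract concrete, here is a minimal sketch comparing a BigClam-style decoder, where the edge probability is 1 - exp(-<x, y>) for nonnegative affiliation vectors, with a decoder built on a Lorentz-type inner product. The split of each code into `incl` and `excl` coordinates, the clamping at zero, and the function names are assumptions made for illustration; the abstract does not state the exact decoder, so this is not the authors' formula.

```python
import numpy as np

def bigclam_edge_prob(x, y):
    """BigClam-style decoder: P(edge) = 1 - exp(-<x, y>) for nonnegative affiliations."""
    return 1.0 - np.exp(-np.dot(x, y))

def lorentz_edge_prob(x_incl, x_excl, y_incl, y_excl):
    """Hypothetical Lorentz-type decoder (assumed form, not taken from the paper)."""
    # Lorentz inner product: inclusive coordinates add, exclusive coordinates subtract.
    inner = np.dot(x_incl, y_incl) - np.dot(x_excl, y_excl)
    # Clamp at zero so the result stays a valid probability (an assumption of this sketch).
    return 1.0 - np.exp(-max(inner, 0.0))

# Two nodes that share inclusive communities and also one exclusive community.
u_incl, u_excl = np.array([1.0, 0.5]), np.array([0.8])
v_incl, v_excl = np.array([1.0, 0.5]), np.array([0.8])

p_big = bigclam_edge_prob(u_incl, v_incl)
p_lor = lorentz_edge_prob(u_incl, u_excl, v_incl, v_excl)
print(p_big, p_lor)
```

In this sketch, shared membership in an exclusive community lowers the edge probability relative to the purely inclusive decoder, which is the qualitative behavior the abstract describes as "strong disconnection".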
Recurrent Interpolants for Probabilistic Time Series Prediction
Pub Date : 2024-09-18 DOI: arxiv-2409.11684
Yu Chen, Marin Biloš, Sarthak Mittal, Wei Deng, Kashif Rasul, Anderson Schneider
Sequential models such as recurrent neural networks or transformer-based models have become de facto tools for multivariate time series forecasting in a probabilistic fashion, with applications to a wide range of datasets in areas such as finance, biology, and medicine. Despite their adeptness in capturing dependencies, assessing prediction uncertainty, and efficiency in training, challenges emerge in modeling high-dimensional complex distributions and cross-feature dependencies. To tackle these issues, recent works delve into generative modeling by employing diffusion or flow-based models. Notably, the integration of stochastic differential equations or probability flow successfully extends these methods to probabilistic time series imputation and forecasting. However, scalability issues necessitate a computationally friendly framework for large-scale generative model-based predictions. This work proposes a novel approach that blends the computational efficiency of recurrent neural networks with the high-quality probabilistic modeling of the diffusion model, which addresses these challenges and advances the application of generative models in time series forecasting. Our method relies on the foundation of stochastic interpolants and the extension to a broader conditional generation framework with additional control features, offering insights for future developments in this dynamic field.
Citations: 0
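The "Recurrent Interpolants" abstract builds on stochastic interpolants, which bridge two endpoints with a noisy path that matches each endpoint exactly. The sketch below samples one common choice, x_t = (1 - t) x0 + t x1 + sqrt(2 t (1 - t)) z with z standard normal. The specific coefficients and the reading of x0 and x1 as consecutive observations are assumptions for illustration; the abstract does not specify them.

```python
import numpy as np

def stochastic_interpolant(x0, x1, t, rng):
    """Sample x_t = (1 - t) * x0 + t * x1 + sqrt(2 t (1 - t)) * z with z ~ N(0, I)."""
    z = rng.standard_normal(np.shape(x0))
    # The noise coefficient vanishes at t = 0 and t = 1, so the endpoints are hit exactly.
    return (1.0 - t) * x0 + t * x1 + np.sqrt(2.0 * t * (1.0 - t)) * z

rng = np.random.default_rng(0)
x0 = np.array([0.0, 1.0])    # assumed: the current observation
x1 = np.array([2.0, -1.0])   # assumed: the next observation
path = [stochastic_interpolant(x0, x1, t, rng) for t in np.linspace(0.0, 1.0, 5)]
```

A generative model in this family would learn the drift of such paths; the recurrent conditioning described in the abstract is not shown here.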
Fitting Multilevel Factor Models
Pub Date : 2024-09-18 DOI: arxiv-2409.12067
Tetiana Parshakova, Trevor Hastie, Stephen Boyd
We examine a special case of the multilevel factor model, with covariance given by a multilevel low rank (MLR) matrix \cite{parshakova2023factor}. We develop a novel, fast implementation of the expectation-maximization (EM) algorithm, tailored for multilevel factor models, to maximize the likelihood of the observed data. This method accommodates any hierarchical structure and maintains linear time and storage complexities per iteration. This is achieved through a new efficient technique for computing the inverse of the positive definite MLR matrix. We show that the inverse of an invertible PSD MLR matrix is also an MLR matrix with the same sparsity in factors, and we use the recursive Sherman-Morrison-Woodbury matrix identity to obtain the factors of the inverse. Additionally, we present an algorithm that computes the Cholesky factorization of an expanded matrix with linear time and space complexities, yielding the covariance matrix as its Schur complement. This paper is accompanied by an open-source package that implements the proposed methods.
Citations: 0
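The "Fitting Multilevel Factor Models" abstract relies on the Sherman-Morrison-Woodbury identity. As a minimal single-level illustration (the paper applies the identity recursively across levels, which is not reproduced here), the sketch below solves a linear system with covariance diag(d) + F F^T without ever forming the n x n matrix. The function name and the test problem are hypothetical.

```python
import numpy as np

def woodbury_solve(d, F, b):
    """Solve (diag(d) + F F^T) x = b via Sherman-Morrison-Woodbury.

    Cost is O(n r^2) for F of shape (n, r), instead of O(n^3) for a dense solve.
    """
    Dinv_b = b / d                    # D^{-1} b
    Dinv_F = F / d[:, None]           # D^{-1} F
    small = np.eye(F.shape[1]) + F.T @ Dinv_F   # r x r system: I + F^T D^{-1} F
    return Dinv_b - Dinv_F @ np.linalg.solve(small, F.T @ Dinv_b)

rng = np.random.default_rng(0)
n, r = 200, 5
d = rng.uniform(0.5, 2.0, size=n)     # positive diagonal (idiosyncratic variances)
F = rng.standard_normal((n, r))       # factor loadings
b = rng.standard_normal(n)

x_fast = woodbury_solve(d, F, b)
x_dense = np.linalg.solve(np.diag(d) + F @ F.T, b)   # dense reference for comparison
```

Only an r x r system is solved, which is why the per-iteration cost can stay linear in n when the rank r is fixed.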
Cartan moving frames and the data manifolds
Pub Date : 2024-09-18 DOI: arxiv-2409.12057
Eliot Tron, Rita Fioresi, Nicolas Couellan, Stéphane Puechmorel
The purpose of this paper is to employ the language of Cartan moving frames to study the geometry of the data manifolds and their Riemannian structure, via the data information metric and its curvature at data points. Using this framework and through experiments, explanations on the response of a neural network are given by pointing out the output classes that are easily reachable from a given input. This emphasizes how the proposed mathematical relationship between the output of the network and the geometry of its inputs can be exploited as an explainable artificial intelligence tool.
Citations: 0
Symmetry-Based Structured Matrices for Efficient Approximately Equivariant Networks
Pub Date : 2024-09-18 DOI: arxiv-2409.11772
Ashwin Samudre, Mircea Petrache, Brian D. Nord, Shubhendu Trivedi
There has been much recent interest in designing symmetry-aware neural networks (NNs) exhibiting relaxed equivariance. Such NNs aim to interpolate between being exactly equivariant and being fully flexible, affording consistent performance benefits. In a separate line of work, certain structured parameter matrices -- those with displacement structure, characterized by low displacement rank (LDR) -- have been used to design small-footprint NNs. Displacement structure enables fast function and gradient evaluation, but permits accurate approximations via compression primarily to classical convolutional neural networks (CNNs). In this work, we propose a general framework -- based on a novel construction of symmetry-based structured matrices -- to build approximately equivariant NNs with significantly reduced parameter counts. Our framework integrates the two aforementioned lines of work via the use of so-called Group Matrices (GMs), a forgotten precursor to the modern notion of regular representations of finite groups. GMs allow the design of structured matrices -- resembling LDR matrices -- which generalize the linear operations of a classical CNN from cyclic groups to general finite groups and their homogeneous spaces. We show that GMs can be employed to extend all the elementary operations of CNNs to general discrete groups. Further, the theory of structured matrices based on GMs provides a generalization of LDR theory focussed on matrices with cyclic structure, providing a tool for implementing approximate equivariance for discrete groups. We test GM-based architectures on a variety of tasks in the presence of relaxed symmetry. We report that our framework consistently performs competitively compared to approximately equivariant NNs, and other structured matrix-based compression frameworks, sometimes with one or two orders of magnitude lower parameter count.
Citations: 0
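To illustrate the group matrices mentioned in the "Symmetry-Based Structured Matrices" abstract, the sketch below builds a group matrix from a Cayley table using one common convention, M[g, h] = w[g^{-1} h]. For the cyclic group this reduces to a circulant matrix, which is the linear map of a classical circular convolution. The function name, the index convention, and the requirement that the identity has index 0 are assumptions of this sketch, not details taken from the paper.

```python
import numpy as np

def group_matrix(mul, w):
    """Group matrix M[g, h] = w[g^{-1} h] for a finite group given by its Cayley table.

    mul[g, h] is the index of the product g * h; the identity is assumed to be index 0.
    """
    n = mul.shape[0]
    # inv[g] is the element h with g * h = identity.
    inv = np.array([int(np.where(mul[g] == 0)[0][0]) for g in range(n)])
    return np.array([[w[mul[inv[g], h]] for h in range(n)] for g in range(n)])

n = 4
cyclic = np.add.outer(np.arange(n), np.arange(n)) % n   # Cayley table of Z_4
w = np.array([1.0, 2.0, 3.0, 4.0])                       # one weight per group element
M = group_matrix(cyclic, w)

# For a cyclic group the group matrix is circulant, so it commutes with cyclic shifts.
x = np.array([0.5, -1.0, 2.0, 3.0])
shift_then_apply = M @ np.roll(x, 1)
apply_then_shift = np.roll(M @ x, 1)
```

The matrix has n^2 entries but only n free parameters; swapping in the Cayley table of another finite group gives the analogous weight sharing for that group.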
Learning Unstable Continuous-Time Stochastic Linear Control Systems
Pub Date : 2024-09-17 DOI: arxiv-2409.11327
Reza Sadeghi Hafshejani, Mohamad Kazem Shirani Fradonbeh
We study the problem of system identification for stochastic continuous-time dynamics, based on a single finite-length state trajectory. We present a method for estimating the possibly unstable open-loop matrix by employing properly randomized control inputs. Then, we establish theoretical performance guarantees showing that the estimation error decays with trajectory length, a measure of excitability, and the signal-to-noise ratio, while it grows with dimension. Numerical illustrations that showcase the rates of learning the dynamics will be provided as well. To perform the theoretical analysis, we develop new technical tools that are of independent interest. These include non-asymptotic stochastic bounds for highly non-stationary martingales and generalized laws of iterated logarithms, among others.
Citations: 0
Latent mixed-effect models for high-dimensional longitudinal data
Pub Date : 2024-09-17 DOI: arxiv-2409.11008
Priscilla Ong, Manuel Haußmann, Otto Lönnroth, Harri Lähdesmäki
Modelling longitudinal data is an important yet challenging task. These datasets can be high-dimensional and contain non-linear effects and time-varying covariates. Gaussian process (GP) prior-based variational autoencoders (VAEs) have emerged as a promising approach due to their ability to model time-series data. However, they are costly to train and struggle to fully exploit the rich covariates characteristic of longitudinal data, making them difficult for practitioners to use effectively. In this work, we leverage linear mixed models (LMMs) and amortized variational inference to provide conditional priors for VAEs, and propose LMM-VAE, a scalable, interpretable and identifiable model. We highlight theoretical connections between it and GP-based techniques, providing a unified framework for this class of methods. Our proposal performs competitively compared to existing approaches across simulated and real-world datasets.
Citations: 0
On the generalization ability of coarse-grained molecular dynamics models for non-equilibrium processes
Pub Date : 2024-09-17 DOI: arxiv-2409.11519
Liyao Lyu, Huan Lei
One essential goal of constructing coarse-grained molecular dynamics (CGMD) models is to accurately predict non-equilibrium processes beyond the atomistic scale. While a CG model can be constructed by projecting the full dynamics onto a set of resolved variables, the dynamics of the CG variables can recover the full dynamics only when the conditional distribution of the unresolved variables is close to the one associated with the particular projection operator. In particular, the model's applicability to various non-equilibrium processes is generally unwarranted due to the inconsistency in the conditional distribution. Here, we present a data-driven approach for constructing CGMD models that retain certain generalization ability for non-equilibrium processes. Unlike the conventional CG models based on pre-selected CG variables (e.g., the center of mass), the present CG model seeks a set of auxiliary CG variables based on the time-lagged independent component analysis to minimize the entropy contribution of the unresolved variables. This ensures the distribution of the unresolved variables under a broad range of non-equilibrium conditions approaches the one under equilibrium. Numerical results of a polymer melt system demonstrate the significance of this broadly-overlooked metric for the model's generalization ability, and the effectiveness of the present CG model for predicting the complex viscoelastic responses under various non-equilibrium flows.
Citations: 0
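The coarse-grained molecular dynamics abstract selects auxiliary variables with time-lagged independent component analysis (TICA). The sketch below shows a generic TICA on a toy trajectory: it whitens the data with the instantaneous covariance and then diagonalizes the symmetrized time-lagged covariance, so that the leading components are the slowest-decorrelating linear combinations. The toy signal, the mixing matrix, and the function name are assumptions for illustration; how the paper couples TICA to the CG model is not shown.

```python
import numpy as np

def tica(X, lag, n_components):
    """Time-lagged independent component analysis on a trajectory X of shape (T, d)."""
    X = X - X.mean(axis=0)
    A, B = X[:-lag], X[lag:]
    C0 = (A.T @ A + B.T @ B) / (2 * len(A))   # instantaneous covariance
    Ct = (A.T @ B + B.T @ A) / (2 * len(A))   # symmetrized time-lagged covariance
    # Whiten with C0, then diagonalize the whitened lagged covariance.
    evals, evecs = np.linalg.eigh(C0)
    W = evecs / np.sqrt(evals)                # columns scaled so that W.T @ C0 @ W = I
    lam, V = np.linalg.eigh(W.T @ Ct @ W)
    order = np.argsort(lam)[::-1][:n_components]
    return lam[order], W @ V[:, order]

# Toy trajectory: one slow AR(1) signal and one fast white-noise signal, linearly mixed.
rng = np.random.default_rng(0)
T = 20000
slow = np.zeros(T)
for t in range(1, T):
    slow[t] = 0.95 * slow[t - 1] + rng.standard_normal()
fast = rng.standard_normal(T)
X = np.column_stack([slow, fast]) @ np.array([[1.0, 0.5], [0.3, 1.0]])

lam, components = tica(X, lag=1, n_components=2)
```

The eigenvalues estimate the autocorrelation of each component at the chosen lag, so the slow signal appears first with an eigenvalue near its lag-1 autocorrelation and the white-noise direction appears with an eigenvalue near zero.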
Outlier Detection with Cluster Catch Digraphs
Pub Date : 2024-09-17 DOI: arxiv-2409.11596
Rui Shi, Nedret Billor, Elvan Ceyhan
This paper introduces a novel family of outlier detection algorithms based on Cluster Catch Digraphs (CCDs), specifically tailored to address the challenges of high dimensionality and varying cluster shapes, which deteriorate the performance of most traditional outlier detection methods. We propose the Uniformity-Based CCD with Mutual Catch Graph (U-MCCD), the Uniformity- and Neighbor-Based CCD with Mutual Catch Graph (UN-MCCD), and their shape-adaptive variants (SU-MCCD and SUN-MCCD), which are designed to detect outliers in data sets with arbitrary cluster shapes and high dimensions. We present the advantages and shortcomings of these algorithms and provide the motivation or need to define each particular algorithm. Through comprehensive Monte Carlo simulations, we assess their performance and demonstrate the robustness and effectiveness of our algorithms across various settings and contamination levels. We also illustrate the use of our algorithms on various real-life data sets. The U-MCCD algorithm efficiently identifies outliers while maintaining high true negative rates, and the SU-MCCD algorithm shows substantial improvement in handling non-uniform clusters. Additionally, the UN-MCCD and SUN-MCCD algorithms address the limitations of existing methods in high-dimensional spaces by utilizing Nearest Neighbor Distances (NND) for clustering and outlier detection. Our results indicate that these novel algorithms offer substantial advancements in the accuracy and adaptability of outlier detection, providing a valuable tool for various real-world applications. Keywords: Outlier detection, Graph-based clustering, Cluster catch digraphs, $k$-nearest-neighborhood, Mutual catch graphs, Nearest neighbor distance.
Citations: 0
Towards Gaussian Process for operator learning: an uncertainty aware resolution independent operator learning algorithm for computational mechanics
Pub Date : 2024-09-17 DOI: arxiv-2409.10972
Sawan Kumar, Rajdip Nayek, Souvik Chakraborty
The growing demand for accurate, efficient, and scalable solutions in computational mechanics highlights the need for advanced operator learning algorithms that can efficiently handle large datasets while providing reliable uncertainty quantification. This paper introduces a novel Gaussian Process (GP) based neural operator for solving parametric differential equations. The approach proposed leverages the expressive capability of deterministic neural operators and the uncertainty awareness of conventional GP. In particular, we propose a "neural operator-embedded kernel" wherein the GP kernel is formulated in the latent space learned using a neural operator. Further, we exploit a stochastic dual descent (SDD) algorithm for simultaneously training the neural operator parameters and the GP hyperparameters. Our approach addresses the (a) resolution dependence and (b) cubic complexity of traditional GP models, allowing for input-resolution independence and scalability in high-dimensional and non-linear parametric systems, such as those encountered in computational mechanics. We apply our method to a range of non-linear parametric partial differential equations (PDEs) and demonstrate its superiority in both computational efficiency and accuracy compared to standard GP models and wavelet neural operators. Our experimental results highlight the efficacy of this framework in solving complex PDEs while maintaining robustness in uncertainty estimation, positioning it as a scalable and reliable operator-learning algorithm for computational mechanics.
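To make the idea of a GP kernel formulated in a latent space more concrete, here is a minimal sketch under strong assumptions. A fixed random tanh layer (`feature_map`, hypothetical) stands in for the learned neural operator, and the posterior is computed exactly with a Cholesky factorization. That factorization is the cubic-cost step the abstract says the method avoids; stochastic dual descent is not implemented here.

```python
import numpy as np

def feature_map(x, weights):
    """Hypothetical stand-in for a learned latent map (a fixed tanh layer)."""
    return np.tanh(x @ weights)

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a and b."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq / length_scale ** 2)

def gp_posterior(x_train, y_train, x_test, weights, noise=1e-2):
    """Exact GP posterior mean and variance with the kernel applied to latent features."""
    z_train = feature_map(x_train, weights)
    z_test = feature_map(x_test, weights)
    k_tt = rbf_kernel(z_train, z_train) + noise * np.eye(len(z_train))
    k_st = rbf_kernel(z_test, z_train)
    chol = np.linalg.cholesky(k_tt)  # O(n^3) in the number of training points
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_st @ alpha
    v = np.linalg.solve(chol, k_st.T)
    var = 1.0 - (v ** 2).sum(axis=0)  # the RBF prior variance k(z, z) is 1
    return mean, var

# Toy one-dimensional regression problem (assumption, not from the paper).
rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, size=(20, 1))
y_train = np.sin(3.0 * x_train[:, 0])
weights = rng.normal(size=(1, 4))
x_test = np.linspace(-1.0, 1.0, 5)[:, None]

mean, var = gp_posterior(x_train, y_train, x_test, weights)
```

In a trained model the weights of the latent map would be optimized jointly with the kernel hyperparameters; here they are left random purely to keep the example short.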
Citations: 0