
arXiv - STAT - Machine Learning: Latest Publications

Automated Discovery of Pairwise Interactions from Unstructured Data
Pub Date : 2024-09-11 DOI: arxiv-2409.07594
Zuheng (David) Xu, Moksh Jain, Ali Denton, Shawn Whitfield, Aniket Didolkar, Berton Earnshaw, Jason Hartford
Pairwise interactions between perturbations to a system can provide evidence for the causal dependencies of the system's underlying mechanisms. When observations are low-dimensional, hand-crafted measurements, detecting interactions amounts to simple statistical tests, but it is not obvious how to detect interactions between perturbations affecting latent variables. We derive two interaction tests that are based on pairwise interventions, and show how these tests can be integrated into an active learning pipeline to efficiently discover pairwise interactions between perturbations. We illustrate the value of these tests in the context of biology, where pairwise perturbation experiments are frequently used to reveal interactions that are not observable from any single perturbation. Our tests can be run on unstructured data, such as the pixels in an image, which enables a more general notion of interaction than typical cell viability experiments, and can be run on cheaper experimental assays. We validate on several synthetic and real biological experiments that our tests are able to identify interacting pairs effectively. We evaluate our approach on a real biological experiment where we knocked out 50 pairs of genes and measured the effect with microscopy images. We show that we are able to recover significantly more known biological interactions than random search and standard active learning baselines.
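The abstract does not spell out the form of the two tests; as rough intuition for what an interaction test on unstructured readouts can look like, here is a minimal, hypothetical Python sketch that flags a perturbation pair as interacting when the joint effect on embedding means deviates from the sum of the single-perturbation effects. The additive-effects criterion and all names below are illustrative assumptions, not the paper's derivation.

```python
# Hypothetical non-additivity check on embedding vectors; the paper
# derives two principled interventional tests, which this toy heuristic
# does not reproduce.
import numpy as np

rng = np.random.default_rng(0)

def mean_effect(samples, control):
    """Average shift of perturbed embeddings relative to control."""
    return samples.mean(axis=0) - control.mean(axis=0)

def interaction_score(ctrl, pert_a, pert_b, pert_ab):
    """Distance between the joint effect and the sum of single effects;
    near zero when the two perturbations act additively (no interaction)."""
    ea = mean_effect(pert_a, ctrl)
    eb = mean_effect(pert_b, ctrl)
    eab = mean_effect(pert_ab, ctrl)
    return np.linalg.norm(eab - (ea + eb))

# Synthetic embeddings: single effects add, plus a joint interaction term.
d = 16
ctrl = rng.normal(size=(200, d))
shift_a, shift_b = rng.normal(size=d), rng.normal(size=d)
pert_a = ctrl + shift_a
pert_b = ctrl + shift_b
interaction = rng.normal(size=d)          # nonzero => interacting pair
pert_ab = ctrl + shift_a + shift_b + interaction

print(interaction_score(ctrl, pert_a, pert_b, pert_ab))  # clearly > 0
```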
Citations: 0
Convergence of continuous-time stochastic gradient descent with applications to linear deep neural networks
Pub Date : 2024-09-11 DOI: arxiv-2409.07401
Gabor Lugosi, Eulalia Nualart
We study a continuous-time approximation of the stochastic gradient descent process for minimizing the expected loss in learning problems. The main results establish general sufficient conditions for the convergence, extending the results of Chatterjee (2022) established for (nonstochastic) gradient descent. We show how the main result can be applied to the case of overparametrized linear neural network training.
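For intuition on the object being approximated, here is a small simulation, under illustrative assumptions, of stochastic gradient descent on an overparametrized two-layer linear network; the paper's results concern the continuous-time (gradient flow) limit of this kind of recursion, which this toy does not itself analyze.

```python
# Toy SGD on a two-layer linear network f(x) = w2^T (W1 x); the effective
# weight W1^T w2 should converge to the teacher w_true. Step size and
# iteration count are illustrative choices.
import numpy as np

rng = np.random.default_rng(1)
d = 5
w_true = rng.normal(size=d)

W1 = rng.normal(size=(d, d)) * 0.1   # small initialization
w2 = rng.normal(size=d) * 0.1

def grads(W1, w2, x, y):
    """Gradients of the squared loss 0.5*(w2^T W1 x - y)^2."""
    r = w2 @ (W1 @ x) - y            # residual
    return np.outer(w2, x) * r, (W1 @ x) * r

eta = 0.01
for _ in range(20000):
    x = rng.normal(size=d)
    y = w_true @ x                   # noiseless data stream
    g1, g2 = grads(W1, w2, x, y)
    W1 -= eta * g1
    w2 -= eta * g2

print(np.linalg.norm(W1.T @ w2 - w_true))  # should be small
```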
Citations: 0
Exploring User-level Gradient Inversion with a Diffusion Prior
Pub Date : 2024-09-11 DOI: arxiv-2409.07291
Zhuohang Li, Andrew Lowy, Jing Liu, Toshiaki Koike-Akino, Bradley Malin, Kieran Parsons, Ye Wang
We explore user-level gradient inversion as a new attack surface in distributed learning. We first investigate existing attacks on their ability to make inferences about private information beyond training data reconstruction. Motivated by the low reconstruction quality of existing methods, we propose a novel gradient inversion attack that applies a denoising diffusion model as a strong image prior in order to enhance recovery in the large-batch setting. Unlike traditional attacks, which aim to reconstruct individual samples and suffer at large batch and image sizes, our approach instead aims to recover a representative image that captures the sensitive shared semantic information corresponding to the underlying user. Our experiments with face images demonstrate the ability of our methods to recover realistic facial images along with private user attributes.
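For background, here is a minimal sketch of the classic gradient-matching inversion baseline that attacks of this family build on: the attacker optimizes a dummy input so that its gradient matches the observed one. This is a generic illustration (known label, tiny untrained MLP, no diffusion prior), not the paper's user-level method.

```python
# Gradient-matching inversion on a tiny MLP; a hypothetical baseline only.
import torch

torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Linear(32, 16), torch.nn.ReLU(),
                            torch.nn.Linear(16, 4))
loss_fn = torch.nn.CrossEntropyLoss()

# "Client" computes a gradient on private data; the attacker observes it.
x_true = torch.randn(1, 32)
y_true = torch.tensor([2])            # label assumed known, for simplicity
true_grads = torch.autograd.grad(loss_fn(model(x_true), y_true),
                                 model.parameters())

# Attacker: optimize a dummy input so its gradient matches the observed one.
x_hat = torch.randn(1, 32, requires_grad=True)
opt = torch.optim.Adam([x_hat], lr=0.1)
for _ in range(500):
    opt.zero_grad()
    g = torch.autograd.grad(loss_fn(model(x_hat), y_true),
                            model.parameters(), create_graph=True)
    match = sum(((a - b) ** 2).sum() for a, b in zip(g, true_grads))
    match.backward()
    opt.step()

print(torch.norm(x_hat.detach() - x_true))  # should shrink as gradients align
```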
Citations: 0
Tuning-Free Online Robust Principal Component Analysis through Implicit Regularization
Pub Date : 2024-09-11 DOI: arxiv-2409.07275
Lakshmi Jayalal, Gokularam Muthukrishnan, Sheetal Kalyani
The performance of the standard Online Robust Principal Component Analysis (OR-PCA) technique depends on the optimal tuning of its explicit regularizers, and this tuning is dataset-sensitive. We aim to remove the dependency on these tuning parameters by using implicit regularization. We propose to use the implicit regularization effect of various modified gradient descents to make OR-PCA tuning-free. Our method incorporates three different versions of modified gradient descent that separately but naturally encourage sparsity and low-rank structures in the data. The proposed method performs comparably to or better than the tuned OR-PCA on both simulated and real-world datasets. Tuning-free OR-PCA is more scalable to large datasets, since no dataset-dependent parameter tuning is required.
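As a rough illustration of the implicit-regularization idea (in a batch, not online, setting), here is a toy sketch: gradient descent from small initialization on an overparametrized decomposition M ≈ UVᵀ + (g∘g − h∘h) is implicitly biased toward a low-rank plus sparse split, with no explicit penalties. The parametrization, step sizes, and iteration count below are illustrative assumptions, and recovery is approximate and sensitive to these choices; the paper's three modified gradient descents are not reproduced here.

```python
# Hypothetical implicit-regularization sketch for robust PCA.
import numpy as np

rng = np.random.default_rng(2)
n, r = 50, 3
L_true = rng.normal(size=(n, r)) @ rng.normal(size=(r, n))  # low rank
S_true = np.zeros((n, n))
mask = rng.random((n, n)) < 0.05
S_true[mask] = rng.normal(scale=5.0, size=mask.sum())       # sparse spikes
M = L_true + S_true

scale = 1e-3  # small init drives the implicit low-rank / sparse bias
U = rng.normal(size=(n, n)) * scale
V = rng.normal(size=(n, n)) * scale
g = np.full((n, n), scale)
h = np.full((n, n), scale)

eta_lr, eta_sp = 0.01, 0.005  # separate rates for the two components
for _ in range(5000):
    R = U @ V.T + (g * g - h * h) - M        # residual of the fit
    U, V = U - eta_lr * R @ V, V - eta_lr * R.T @ U
    g, h = g - 2 * eta_sp * R * g, h + 2 * eta_sp * R * h

print(np.linalg.norm(U @ V.T - L_true) / np.linalg.norm(L_true))
print(np.linalg.norm(g * g - h * h - S_true) / np.linalg.norm(S_true))
```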
Citations: 0
Reranking Laws for Language Generation: A Communication-Theoretic Perspective
Pub Date : 2024-09-11 DOI: arxiv-2409.07131
António Farinhas, Haau-Sing Li, André F. T. Martins
To ensure large language models (LLMs) are used safely, one must reduce their propensity to hallucinate or to generate unacceptable answers. A simple and often used strategy is to first let the LLM generate multiple hypotheses and then employ a reranker to choose the best one. In this paper, we draw a parallel between this strategy and the use of redundancy to decrease the error rate in noisy communication channels. We conceptualize the generator as a sender transmitting multiple descriptions of a message through parallel noisy channels. The receiver decodes the message by ranking the (potentially corrupted) descriptions and selecting the one found to be most reliable. We provide conditions under which this protocol is asymptotically error-free (i.e., yields an acceptable answer almost surely), even in scenarios where the reranker is imperfect (governed by Mallows or Zipf-Mandelbrot models) and the channel distributions are statistically dependent. We use our framework to obtain reranking laws, which we validate empirically on two real-world tasks using LLMs: text-to-code generation with DeepSeek-Coder 7B and machine translation of medical data with TowerInstruct 13B.
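Here is a toy Monte Carlo, under simplified assumptions, of the generate-then-rerank protocol: with independent hypotheses and a reranker that mis-scores each one with fixed probability (a much cruder noise model than the Mallows or Zipf-Mandelbrot rerankers in the paper), the chance of selecting an acceptable answer rises with the number of hypotheses.

```python
# Illustrative simulation of generate-then-rerank; probabilities are
# made-up parameters, not values from the paper.
import random

random.seed(0)
p_good = 0.3        # chance a single hypothesis is acceptable
p_flip = 0.2        # reranker mis-scores a hypothesis with this probability

def trial(n):
    hyps = [random.random() < p_good for _ in range(n)]
    # Noisy reranker score: true quality, flipped with probability p_flip.
    scores = [h ^ (random.random() < p_flip) for h in hyps]
    best = max(range(n), key=lambda i: (scores[i], random.random()))
    return hyps[best]

for n in (1, 2, 4, 8, 16, 32):
    acc = sum(trial(n) for _ in range(20000)) / 20000
    print(n, round(acc, 3))   # accuracy grows with the number of hypotheses
```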
Citations: 0
From optimal score matching to optimal sampling
Pub Date : 2024-09-11 DOI: arxiv-2409.07032
Zehao Dou, Subhodh Kotekal, Zhehao Xu, Harrison H. Zhou
The recent, impressive advances in algorithmic generation of high-fidelity image, audio, and video are largely due to great successes in score-based diffusion models. A key implementing step is score matching, that is, the estimation of the score function of the forward diffusion process from training data. As shown in earlier literature, the total variation distance between the law of a sample generated from the trained diffusion model and the ground-truth distribution can be controlled by the score matching risk. Despite the widespread use of score-based diffusion models, basic theoretical questions concerning exact optimal statistical rates for score estimation and its application to density estimation remain open. We establish the sharp minimax rate of score estimation for smooth, compactly supported densities. Formally, given $n$ i.i.d. samples from an unknown $\alpha$-Hölder density $f$ supported on $[-1, 1]$, we prove that the minimax rate of estimating the score function of the diffused distribution $f * \mathcal{N}(0, t)$ with respect to the score matching loss is $\frac{1}{nt^2} \wedge \frac{1}{nt^{3/2}} \wedge \left(t^{\alpha-1} + n^{-2(\alpha-1)/(2\alpha+1)}\right)$ for all $\alpha > 0$ and $t \ge 0$. As a consequence, it is shown that the law $\hat{f}$ of a sample generated from the diffusion model achieves the sharp minimax rate $\mathbb{E}\big(d_{\mathrm{TV}}(\hat{f}, f)^2\big) \lesssim n^{-2\alpha/(2\alpha+1)}$ for all $\alpha > 0$, without any extraneous logarithmic terms, which are prevalent in the literature, and without the need for the early stopping that, to the best of our knowledge, all existing procedures require.
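For intuition about the quantity being estimated, here is a short sketch of the natural plug-in estimator: the score of the Gaussian-smoothed empirical distribution, which has a closed form as a softmax-weighted average. This is only a baseline illustration; the paper's minimax-optimal estimator is not reproduced here.

```python
# Plug-in score estimate for f * N(0, t) from samples of f.
import numpy as np

rng = np.random.default_rng(3)
t = 0.1
xs = rng.uniform(-1, 1, size=500)   # samples from a compactly supported f

def smoothed_score(x, xs, t):
    """Score of the mixture (1/n) sum_i N(x; x_i, t):
    grad log p_t(x) = sum_i w_i(x) (x_i - x) / t with softmax weights
    w_i(x) proportional to exp(-(x - x_i)^2 / (2t))."""
    logw = -(x - xs) ** 2 / (2 * t)
    w = np.exp(logw - logw.max())   # stabilized softmax
    w /= w.sum()
    return np.sum(w * (xs - x)) / t

print(smoothed_score(0.5, xs, t))
```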
Citations: 0
Training-Free Guidance for Discrete Diffusion Models for Molecular Generation
Pub Date : 2024-09-11 DOI: arxiv-2409.07359
Thomas J. Kerby, Kevin R. Moon
Training-free guidance methods for continuous data have seen an explosion of interest because they enable foundation diffusion models to be paired with interchangeable guidance models. Currently, equivalent guidance methods for discrete diffusion models are unknown. We present a framework for applying training-free guidance to discrete data and demonstrate its utility on molecular graph generation tasks using the discrete diffusion model architecture of DiGress. We pair this model with guidance functions that return the proportion of heavy atoms that are a specific atom type and the molecular weight of the heavy atoms, and demonstrate our method's ability to guide the data generation.
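Here is a minimal sketch of the reweighting idea behind training-free guidance on discrete state spaces: multiply the model's per-step categorical transition probabilities by the exponentiated guidance value and renormalize. The toy chain, the uniform stand-in "model", and the guidance function below are hypothetical, not DiGress or the paper's molecular guidance functions.

```python
# Guided sampling on a toy discrete chain via probability reweighting.
import numpy as np

rng = np.random.default_rng(4)
K, steps, lam = 10, 20, 2.0        # states, reverse steps, guidance weight

def guidance(k):
    return -abs(k - 7) / 7.0       # hypothetical property: prefer state 7

x = rng.integers(K)
for _ in range(steps):
    # Stand-in for the learned reverse kernel p(x_{t-1} | x_t); here a
    # uniform distribution that ignores the current state, for simplicity.
    base = np.full(K, 1.0 / K)
    w = base * np.exp(lam * guidance(np.arange(K)))
    w /= w.sum()                   # renormalized guided transition
    x = rng.choice(K, p=w)

print(x)  # samples concentrate near the guided state 7
```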
Citations: 0
Efficient and Unbiased Sampling of Boltzmann Distributions via Consistency Models
Pub Date : 2024-09-11 DOI: arxiv-2409.07323
Fengzhe Zhang, Jiajun He, Laurence I. Midgley, Javier Antorán, José Miguel Hernández-Lobato
Diffusion models have shown promising potential for advancing Boltzmann Generators. However, two critical challenges persist: (1) inherent errors in samples due to model imperfections, and (2) the requirement of hundreds of functional evaluations (NFEs) to achieve high-quality samples. While existing solutions like importance sampling and distillation address these issues separately, they are often incompatible, as most distillation models lack the necessary density information for importance sampling. This paper introduces a novel sampling method that effectively combines Consistency Models (CMs) with importance sampling. We evaluate our approach on both synthetic energy functions and equivariant n-body particle systems. Our method produces unbiased samples using only 6-25 NFEs while achieving a comparable Effective Sample Size (ESS) to Denoising Diffusion Probabilistic Models (DDPMs) that require approximately 100 NFEs.
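For the importance-sampling half of the recipe, here is a minimal numpy sketch under illustrative assumptions: given a proposal with known density (a Gaussian below, standing in for a few-NFE consistency-model sampler with tractable density, which is what the paper supplies), self-normalized importance weights against an unnormalized Boltzmann target yield consistent estimates and an effective sample size diagnostic.

```python
# Self-normalized importance sampling against a toy Boltzmann target.
import numpy as np

rng = np.random.default_rng(5)

def energy(x):                        # toy double-well energy
    return (x ** 2 - 1.0) ** 2

# Proposal q with known density; a stand-in for the learned sampler.
mu, sigma = 0.0, 1.5
xs = rng.normal(mu, sigma, size=10000)
logq = -0.5 * ((xs - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))
logw = -energy(xs) - logq             # unnormalized log importance weights
w = np.exp(logw - logw.max())         # stabilized
w /= w.sum()

ess = 1.0 / np.sum(w ** 2)            # effective sample size
print(ess)
print(np.sum(w * xs ** 2))            # self-normalized estimate of E[x^2]
```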
Citations: 0
Manifold Learning via Foliations and Knowledge Transfer
Pub Date : 2024-09-11 DOI: arxiv-2409.07412
E. Tron, E. Fioresi
Understanding how real data is distributed in high-dimensional spaces is the key to many tasks in machine learning. We want to provide a natural geometric structure on the space of data, employing a deep ReLU neural network trained as a classifier. Through the data information matrix (DIM), a variation of the Fisher information matrix, the model will discern a singular foliation structure on the space of data. We show that the singular points of such a foliation are contained in a measure zero set, and that a local regular foliation exists almost everywhere. Experiments show that the data is correlated with leaves of such a foliation. Moreover, we show the potential of our approach for knowledge transfer by analyzing the spectrum of the DIM to measure distances between datasets.
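Below is a small sketch of a Fisher-information-style matrix in input space, used as a stand-in for the DIM (whose exact definition in the paper may differ): summing p_c · s_c s_cᵀ over classes, with s_c the input gradient of the class log-probability, gives a matrix whose near-null directions are candidates for the local leaf of the foliation.

```python
# Input-space Fisher information of a classifier, as a DIM stand-in.
import torch

torch.manual_seed(6)
net = torch.nn.Sequential(torch.nn.Linear(8, 32), torch.nn.ReLU(),
                          torch.nn.Linear(32, 3))

def input_fisher(x):
    """sum_c p_c(x) * s_c s_c^T with s_c = grad_x log p_c(x)."""
    x = x.clone().requires_grad_(True)
    logp = torch.log_softmax(net(x), dim=-1)
    p = logp.exp().detach()
    D = torch.zeros(x.numel(), x.numel())
    for c in range(logp.numel()):
        s, = torch.autograd.grad(logp[c], x, retain_graph=True)
        D += p[c] * torch.outer(s, s)
    return D

D = input_fisher(torch.randn(8))
# Rank is at most (classes - 1), so most eigenvalues are ~0: the
# corresponding eigenvectors span directions the classifier ignores.
print(torch.linalg.eigvalsh(D))
```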
Citations: 0
k-MLE, k-Bregman, k-VARs: Theory, Convergence, Computation
Pub Date : 2024-09-11 DOI: arxiv-2409.06938
Zuogong Yue, Victor Solo
We develop hard clustering based on likelihood rather than distance and prove convergence. We also provide simulations and real data examples.
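Here is a minimal sketch of hard clustering by likelihood rather than distance, under the illustrative assumption of isotropic Gaussian components with a per-cluster variance: points are assigned to the cluster under which they are most likely (so a wide cluster can claim points a narrow one is closer to), then each cluster's parameters are refit by maximum likelihood. The paper's algorithms and convergence proof cover a broader family than this toy.

```python
# Hard likelihood clustering (k-MLE flavor) with Gaussian components.
import numpy as np

rng = np.random.default_rng(7)
X = np.vstack([rng.normal(0, 0.5, (100, 2)), rng.normal(3, 2.0, (100, 2))])
k = 2
mu = X[rng.choice(len(X), k, replace=False)]   # random initial centers
var = np.ones(k)                               # per-cluster variances

for _ in range(50):
    # Hard assignment by per-cluster Gaussian log-likelihood (not distance).
    d2 = ((X[:, None, :] - mu[None]) ** 2).sum(-1)
    loglik = -0.5 * d2 / var - X.shape[1] / 2 * np.log(var)
    z = loglik.argmax(1)
    # Maximum-likelihood refit of each cluster's mean and variance.
    for j in range(k):
        pts = X[z == j]
        if len(pts):
            mu[j] = pts.mean(0)
            var[j] = ((pts - mu[j]) ** 2).mean() + 1e-9

print(mu)
print(var)
```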
Citations: 0