
Latest articles from Proceedings of the ... International Conference on Machine Learning. International Conference on Machine Learning

Differential Privacy, Linguistic Fairness, and Training Data Influence: Impossibility and Possibility Theorems for Multilingual Language Models
Phillip Rust, Anders Søgaard
Language models such as mBERT, XLM-R, and BLOOM aim to achieve multilingual generalization or compression to facilitate transfer to a large number of (potentially unseen) languages. However, these models should ideally also be private, linguistically fair, and transparent, by relating their predictions to training data. Can these requirements be simultaneously satisfied? We show that multilingual compression and linguistic fairness are compatible with differential privacy, but that differential privacy is at odds with training data influence sparsity, an objective for transparency. We further present a series of experiments on two common NLP tasks and evaluate multilingual compression and training data influence sparsity under different privacy guarantees, exploring these trade-offs in more detail. Our results suggest that we need to develop ways to jointly optimize for these objectives in order to find practical trade-offs.
DOI: https://doi.org/10.48550/arXiv.2308.08774 | Published: 2023-08-17 | Pages: 29354-29387
Citations: 0
Ske2Grid: Skeleton-to-Grid Representation Learning for Action Recognition
Dongqi Cai, Yangyuxuan Kang, Anbang Yao, Yurong Chen
This paper presents Ske2Grid, a new representation learning framework for improved skeleton-based action recognition. In Ske2Grid, we define a regular convolution operation upon a novel grid representation of human skeleton, which is a compact image-like grid patch constructed and learned through three novel designs. Specifically, we propose a graph-node index transform (GIT) to construct a regular grid patch through assigning the nodes in the skeleton graph one by one to the desired grid cells. To ensure that GIT is a bijection and enrich the expressiveness of the grid representation, an up-sampling transform (UPT) is learned to interpolate the skeleton graph nodes for filling the grid patch to the full. To resolve the problem when the one-step UPT is aggressive and further exploit the representation capability of the grid patch with increasing spatial size, a progressive learning strategy (PLS) is proposed which decouples the UPT into multiple steps and aligns them to multiple paired GITs through a compact cascaded design learned progressively. We construct networks upon prevailing graph convolution networks and conduct experiments on six mainstream skeleton-based action recognition datasets. Experiments show that our Ske2Grid significantly outperforms existing GCN-based solutions under different benchmark settings, without bells and whistles. Code and models are available at https://github.com/OSVAI/Ske2Grid
DOI: https://doi.org/10.48550/arXiv.2308.07571 | Published: 2023-08-15 | Pages: 3431-3441
Citations: 0
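The Ske2Grid abstract above describes two steps: an up-sampling transform (UPT) that interpolates the skeleton nodes until there are as many as grid cells, and a graph-node index transform (GIT) that assigns each node to exactly one cell. The following is only a rough sketch of that idea, not the authors' implementation: the function name `skeleton_to_grid`, the fixed `upsample` matrix (learned in the paper), and the hand-picked `cell_index` permutation are all assumptions made for illustration.

```python
import numpy as np

def skeleton_to_grid(features, upsample, cell_index, height, width):
    """Map skeleton node features onto an image-like grid patch.

    features:   (N, C) array, one feature vector per skeleton node.
    upsample:   (H*W, N) array standing in for the UPT (hypothetical, fixed here).
    cell_index: length H*W array; node i goes to flat grid cell cell_index[i].
    """
    n_cells = height * width
    cell_index = np.asarray(cell_index)
    # A bijection onto the grid cells means cell_index is a permutation.
    if not np.array_equal(np.sort(cell_index), np.arange(n_cells)):
        raise ValueError("cell_index must be a permutation of the grid cells")
    upsampled = upsample @ features            # (H*W, C) interpolated nodes
    grid_flat = np.empty_like(upsampled)
    grid_flat[cell_index] = upsampled          # place node i in its cell
    return grid_flat.reshape(height, width, -1)

# Toy example: 3 nodes with 2 features each, filled into a 2x2 grid.
features = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
upsample = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0],
                     [0.5, 0.5, 0.0]])         # 4th node averages nodes 0 and 1
cell_index = np.array([3, 0, 2, 1])
grid = skeleton_to_grid(features, upsample, cell_index, 2, 2)
```

Once the nodes sit on a regular grid, an ordinary 2D convolution can be applied to `grid`, which is the motivation the abstract gives for the representation.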
Probabilistic Imputation for Time-series Classification with Missing Data
Seunghyun Kim, Hyunsung Kim, Eunggu Yun, Hwa-Kyung Lee, Jaehun Lee, Juho Lee
Multivariate time series data for real-world applications typically contain a significant amount of missing values. The dominant approach for classification with such missing values is to impute them heuristically with specific values (zero, mean, values of adjacent time-steps) or learnable parameters. However, these simple strategies do not take the data generative process into account, and more importantly, do not effectively capture the uncertainty in prediction due to the multiple possibilities for the missing values. In this paper, we propose a novel probabilistic framework for classification of multivariate time series data with missing values. Our model consists of two parts: a deep generative model for missing value imputation and a classifier. Extending the existing deep generative models to better capture structures of time-series data, our deep generative model part is trained to impute the missing values in multiple plausible ways, effectively modeling the uncertainty of the imputation. The classifier part takes the time series data along with the imputed missing values and classifies signals, and is trained to capture the predictive uncertainty due to the multiple possibilities of imputations. Importantly, we show that naively combining the generative model and the classifier could result in trivial solutions where the generative model does not produce meaningful imputations. To resolve this, we present a novel regularization technique that encourages the model to produce useful imputation values that help classification. Through extensive experiments on real-world time series data with missing values, we demonstrate the effectiveness of our method.
DOI: https://doi.org/10.48550/arXiv.2308.06738 | Published: 2023-08-13 | Pages: 16654-16667
Citations: 0
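The abstract of the imputation paper above names three heuristic baselines it argues against: filling missing values with zero, with the mean, or with the value of an adjacent time step. A minimal sketch of those baselines follows, assuming missing entries are encoded as NaN in a (time, feature) array; the function names are illustrative and do not come from the paper.

```python
import numpy as np

def impute_zero(x):
    """Replace missing entries (NaN) with zero."""
    return np.where(np.isnan(x), 0.0, x)

def impute_mean(x):
    """Replace missing entries with the per-feature mean of observed values."""
    col_mean = np.nanmean(x, axis=0)
    return np.where(np.isnan(x), col_mean, x)

def impute_forward(x):
    """Carry the last observed value forward in time, per feature.

    A value missing at the very first time step stays NaN.
    """
    out = x.copy()
    for t in range(1, out.shape[0]):
        missing = np.isnan(out[t])
        out[t, missing] = out[t - 1, missing]
    return out

# Toy series: 3 time steps, 2 features, two missing entries.
series = np.array([[1.0, 10.0],
                   [np.nan, 20.0],
                   [3.0, np.nan]])
```

Each function returns a single filled-in array, so a downstream classifier sees one fixed guess per missing entry. That is the limitation the abstract points to: none of these baselines expresses that several imputations may be plausible.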
Decoding Layer Saliency in Language Transformers
Elizabeth M. Hou, Greg Castañón
In this paper, we introduce a strategy for identifying textual saliency in large-scale language models applied to classification tasks. In visual networks where saliency is more well-studied, saliency is naturally localized through the convolutional layers of the network; however, the same is not true in modern transformer-stack networks used to process natural language. We adapt gradient-based saliency methods for these networks, propose a method for evaluating the degree of semantic coherence of each layer, and demonstrate consistent improvement over numerous other methods for textual saliency on multiple benchmark classification datasets. Our approach requires no additional training or access to labelled data, and is comparatively very computationally efficient.
DOI: https://doi.org/10.48550/arXiv.2308.05219 | Published: 2023-08-09 | Pages: 13285-13308
Citations: 0
Do You Remember? Overcoming Catastrophic Forgetting for Fake Audio Detection
Xiaohui Zhang, Jiangyan Yi, J. Tao, Chenglong Wang, Chuyuan Zhang
Current fake audio detection algorithms have achieved promising performance on most datasets. However, their performance may be significantly degraded when dealing with audio from a different dataset. The orthogonal weight modification method for overcoming catastrophic forgetting does not consider the similarity of genuine audio across different datasets. To overcome this limitation, we propose a continual learning algorithm for fake audio detection to overcome catastrophic forgetting, called Regularized Adaptive Weight Modification (RAWM). When fine-tuning a detection network, our approach adaptively computes the direction of weight modification according to the ratio of genuine utterances and fake utterances. The adaptive modification direction ensures the network can effectively detect fake audio on the new dataset while preserving its knowledge of the old model, thus mitigating catastrophic forgetting. In addition, genuine audio collected from quite different acoustic conditions may skew the feature distribution, so we introduce a regularization constraint to force the network to remember the old distribution in this regard. Our method can easily be generalized to related fields, like speech emotion recognition. We also evaluate our approach across multiple datasets and obtain a significant performance improvement on cross-dataset experiments.
DOI: https://doi.org/10.48550/arXiv.2308.03300 | Published: 2023-08-07 | Pages: 41819-41831
Citations: 2
SDDM: Score-Decomposed Diffusion Models on Manifolds for Unpaired Image-to-Image Translation
Shikun Sun, Longhui Wei, Junliang Xing, Jia Jia, Qi Tian
Recent score-based diffusion models (SBDMs) show promising results in unpaired image-to-image translation (I2I). However, existing methods, either energy-based or statistically-based, provide no explicit form of the interfered intermediate generative distributions. This work presents a new score-decomposed diffusion model (SDDM) on manifolds to explicitly optimize the tangled distributions during image generation. SDDM derives manifolds to make the distributions of adjacent time steps separable and decomposes the score function or energy guidance into an image "denoising" part and a content "refinement" part. To refine the image at the same noise level, we equalize the refinement parts of the score function and energy guidance, which permits multi-objective optimization on the manifold. We also leverage the block adaptive instance normalization module to construct manifolds with lower dimensions but still concentrated with the perturbed reference image. SDDM outperforms existing SBDM-based methods with far fewer diffusion steps on several I2I benchmarks.
DOI: https://doi.org/10.48550/arXiv.2308.02154 | Published: 2023-08-04 | Pages: 33115-33134
Citations: 1
Variance Control for Distributional Reinforcement Learning
Qi Kuang, Zhoufan Zhu, Liwen Zhang, Fan Zhou
Although distributional reinforcement learning (DRL) has been widely examined in the past few years, very few studies investigate the validity of the obtained Q-function estimator in the distributional setting. To fully understand how the approximation errors of the Q-function affect the whole training process, we carry out an error analysis and theoretically show how to reduce both the bias and the variance of the error terms. With this new understanding, we construct a new estimator, Quantiled Expansion Mean (QEM), and introduce a new DRL algorithm (QEMRL) from the statistical perspective. We extensively evaluate our QEMRL algorithm on a variety of Atari and MuJoCo benchmark tasks and demonstrate that QEMRL achieves significant improvement over baseline algorithms in terms of sample efficiency and convergence performance.
DOI: https://doi.org/10.48550/arXiv.2307.16152 | Published: 2023-07-30 | Pages: 17874-17895
Citations: 0
Learning to Design Analog Circuits to Meet Threshold Specifications
Dmitrii Krylov, Pooya Khajeh, Junhan Ouyang, Thomas Reeves, Tongkai Liu, Hiba Ajmal, Hamidreza Aghasi, Roy Fox
Automated design of analog and radio-frequency circuits using supervised or reinforcement learning from simulation data has recently been studied as an alternative to manual expert design. It is straightforward for a design agent to learn an inverse function from desired performance metrics to circuit parameters. However, it is more common for a user to have threshold performance criteria rather than an exact target vector of feasible performance measures. In this work, we propose a method for generating from simulation data a dataset on which a system can be trained via supervised learning to design circuits to meet threshold specifications. We moreover perform the most extensive evaluation of automated analog circuit design to date, including experiments on a significantly more diverse set of circuits than in prior work, covering linear, nonlinear, and autonomous circuit configurations, and show that our method consistently reaches a success rate better than 90% at 5% error margin, while also improving data efficiency by upward of an order of magnitude. A demo of this system is available at circuits.streamlit.app
DOI: https://doi.org/10.48550/arXiv.2307.13861 | Published: 2023-07-25 | Pages: 17858-17873
Citations: 0
The Optimal Approximation Factors in Misspecified Off-Policy Value Function Estimation
P. Amortila, Nan Jiang, Csaba Szepesvari
Theoretical guarantees in reinforcement learning (RL) are known to suffer multiplicative blow-up factors with respect to the misspecification error of function approximation. Yet the nature of such approximation factors, especially their optimal form in a given learning problem, is poorly understood. In this paper we study this question in linear off-policy value function estimation, where many open questions remain. We study the approximation factor in a broad spectrum of settings, such as with the weighted $L_2$-norm (where the weighting is the offline state distribution), the $L_\infty$ norm, the presence vs. absence of state aliasing, and full vs. partial coverage of the state space. We establish the optimal asymptotic approximation factors (up to constants) for all of these settings. In particular, our bounds identify two instance-dependent factors for the $L_2(\mu)$ norm and only one for the $L_\infty$ norm, which are shown to dictate the hardness of off-policy evaluation under misspecification.
DOI: https://doi.org/10.48550/arXiv.2307.13332 | Published: 2023-07-25 | Pages: 768-790
Citations: 0
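The idea of an approximation factor in the abstract above can be written out as a rough sketch. The symbols below are illustrative assumptions rather than notation confirmed by this listing: $v^\pi$ is the target value function, $\phi$ the feature map of the linear class, $\hat v$ the estimate in the limit of infinite data, $\varepsilon$ the misspecification error, and $\alpha$ the approximation factor, all measured in a norm $\|\cdot\|$ such as $L_2(\mu)$ or $L_\infty$.

$$
\varepsilon \;=\; \inf_{\theta} \,\bigl\| \phi^\top \theta - v^\pi \bigr\|,
\qquad
\bigl\| \hat v - v^\pi \bigr\| \;\le\; \alpha \cdot \varepsilon .
$$

Under this reading, the abstract's "optimal approximation factor" is the smallest $\alpha$ that any estimator can guarantee in a given setting, up to constants.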
Predicting Ordinary Differential Equations with Transformers
Soren Becker, M. Klein, Alexander Neitz, Giambattista Parascandolo, Niki Kilbertus
We develop a transformer-based sequence-to-sequence model that recovers scalar ordinary differential equations (ODEs) in symbolic form from irregularly sampled and noisy observations of a single solution trajectory. We demonstrate in extensive empirical evaluations that our model performs better than or on par with existing methods in terms of accurate recovery across various settings. Moreover, our method is efficiently scalable: after one-time pretraining on a large set of ODEs, we can infer the governing law of a new observed solution in a few forward passes of the model.
Published 2023-07-24 in Proceedings of the ... International Conference on Machine Learning, volume 27 1, pages 1978-2002. DOI: https://doi.org/10.48550/arXiv.2307.12617
Citations: 3
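To make the input setting of "Predicting Ordinary Differential Equations with Transformers" concrete, here is a minimal sketch of what "irregularly sampled and noisy observations of a single solution trajectory" could look like. It is not the authors' data pipeline: the toy ODE dy/dt = -y, the uniform random sampling times, the Gaussian noise level, and the function name `sample_noisy_trajectory` are all assumptions chosen for illustration.

```python
import numpy as np


def sample_noisy_trajectory(n_points=50, t_max=5.0, noise_std=0.05, seed=0):
    """Irregular, noisy samples of the solution of dy/dt = -y with y(0) = 1.

    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    # Irregular sampling: sorted time points drawn uniformly from [0, t_max).
    t = np.sort(rng.uniform(0.0, t_max, size=n_points))
    # Closed-form solution of the toy ODE at those time points.
    y_clean = np.exp(-t)
    # Additive Gaussian observation noise.
    y_noisy = y_clean + rng.normal(0.0, noise_std, size=n_points)
    return t, y_clean, y_noisy


t, y_clean, y_noisy = sample_noisy_trajectory()
```

A model of the kind described in the abstract would take pairs like `(t, y_noisy)` as input and be asked to output a symbolic expression for the right-hand side, which is `-y` in this toy case.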