
Machine Learning: latest articles

Reversible jump attack to textual classifiers with modification reduction
IF 7.5, Zone 3 (Computer Science), Q1 Computer Science, Pub Date: 2024-04-22, DOI: 10.1007/s10994-024-06539-6
Mingze Ni, Zhensu Sun, Wei Liu

Recent studies on adversarial examples expose vulnerabilities of natural language processing models. Existing techniques for generating adversarial examples are typically driven by deterministic hierarchical rules that are agnostic to the optimal adversarial examples, a strategy that often results in adversarial samples with a suboptimal balance between magnitudes of changes and attack successes. To this end, in this research we propose two algorithms, Reversible Jump Attack (RJA) and Metropolis–Hasting Modification Reduction (MMR), to generate highly effective adversarial examples and to improve the imperceptibility of the examples, respectively. RJA utilizes a novel randomization mechanism to enlarge the search space and efficiently adapts to a number of perturbed words for adversarial examples. With these generated adversarial examples, MMR applies the Metropolis–Hastings sampler to enhance the imperceptibility of adversarial examples. Extensive experiments demonstrate that RJA-MMR outperforms current state-of-the-art methods in attack performance, imperceptibility, fluency and grammar correctness.
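
For readers unfamiliar with the sampler the abstract mentions, the following is a minimal sketch of a generic Metropolis–Hastings loop with a symmetric proposal. It is not the authors' RJA or MMR procedure: the function names (`metropolis_hastings`, `log_standard_normal`, `random_walk`) and the toy Gaussian target are assumptions chosen only to illustrate the accept/reject step.

```python
import math
import random


def metropolis_hastings(log_target, propose, x0, n_steps, rng):
    """Generic Metropolis-Hastings sampler (symmetric proposal assumed).

    log_target: unnormalized log-density of the target distribution.
    propose:    draws a candidate state from the current state.
    """
    x = x0
    samples = []
    for _ in range(n_steps):
        candidate = propose(x, rng)
        # Accept with probability min(1, target(candidate) / target(x)).
        log_ratio = log_target(candidate) - log_target(x)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = candidate
        # A rejected candidate means the current state is recorded again.
        samples.append(x)
    return samples


def log_standard_normal(x):
    """Toy target (illustrative only): standard normal, up to a constant."""
    return -0.5 * x * x


def random_walk(x, rng):
    """Symmetric Gaussian random-walk proposal (illustrative only)."""
    return x + rng.gauss(0.0, 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    draws = metropolis_hastings(log_standard_normal, random_walk, 0.0, 20000, rng)
    print(sum(draws) / len(draws))  # sample mean, expected to be close to 0
```

In a text-attack setting the state would presumably be a candidate sentence and the target would score attack success and similarity to the original, but those details are not given in the abstract and are not modeled here.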

Citations: 0
Coresets for kernel clustering
IF 7.5, Zone 3 (Computer Science), Q1 Computer Science, Pub Date: 2024-04-22, DOI: 10.1007/s10994-024-06540-z
Shaofeng H. -C. Jiang, Robert Krauthgamer, Jianing Lou, Yubo Zhang

We devise coresets for kernel \(k\)-Means with a general kernel, and use them to obtain new, more efficient, algorithms. Kernel \(k\)-Means has superior clustering capability compared to classical \(k\)-Means, particularly when clusters are non-linearly separable, but it also introduces significant computational challenges. We address this computational issue by constructing a coreset, which is a reduced dataset that accurately preserves the clustering costs. Our main result is a coreset for kernel \(k\)-Means that works for a general kernel and has size \(\mathrm{poly}(k\epsilon^{-1})\). Our new coreset both generalizes and greatly improves all previous results; moreover, it can be constructed in time near-linear in \(n\). This result immediately implies new algorithms for kernel \(k\)-Means, such as a \((1+\epsilon)\)-approximation in time near-linear in \(n\), and a streaming algorithm using space and update time \(\mathrm{poly}(k\epsilon^{-1}\log n)\). We validate our coreset on various datasets with different kernels. Our coreset performs consistently well, achieving small errors while using very few points. We show that our coresets can speed up kernel k-Means++ (the kernelized version of the widely used k-Means++ algorithm), and we further use this faster kernel k-Means++ for spectral clustering. In both applications, we achieve significant speedup and a better asymptotic growth while the error is comparable to baselines that do not use coresets.
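
To make "a reduced dataset that preserves the clustering costs" concrete, here is a small sketch of how a weighted kernel k-Means cost can be evaluated using only kernel calls. Several simplifications are assumptions of this sketch, not claims about the paper: centers are restricted to feature-space images of input points, and `uniform_weighted_sample` is a naive uniform-sampling baseline, not the coreset construction the authors propose. All function names are hypothetical.

```python
import math
import random


def linear_kernel(x, y):
    """Plain dot product; with this kernel the cost is the usual k-Means cost."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel with bandwidth parameter gamma."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def kernel_kmeans_cost(points, weights, centers, kernel):
    """Weighted sum of squared feature-space distances to the nearest center.

    Uses ||phi(x) - phi(c)||^2 = k(x, x) - 2 k(x, c) + k(c, c), so the
    feature map phi never has to be computed explicitly.
    """
    total = 0.0
    for x, w in zip(points, weights):
        total += w * min(
            kernel(x, x) - 2.0 * kernel(x, c) + kernel(c, c) for c in centers
        )
    return total


def uniform_weighted_sample(points, m, rng):
    """Naive baseline: m uniform draws with replacement, each weighted n / m."""
    n = len(points)
    sample = [points[rng.randrange(n)] for _ in range(m)]
    return sample, [n / m] * m


if __name__ == "__main__":
    rng = random.Random(0)
    data = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(2000)]
    centers = [data[0], data[1]]
    full = kernel_kmeans_cost(data, [1.0] * len(data), centers, rbf_kernel)
    subset, w = uniform_weighted_sample(data, 200, rng)
    approx = kernel_kmeans_cost(subset, w, centers, rbf_kernel)
    print(full, approx)  # the weighted subset cost estimates the full cost
```

A uniform sample only gives an unbiased estimate for a fixed set of centers; a coreset in the sense of the abstract must approximate the cost for every choice of centers simultaneously, which is what makes its construction non-trivial.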

Citations: 0
From MNIST to ImageNet and back: benchmarking continual curriculum learning
IF 7.5, Zone 3 (Computer Science), Q1 Computer Science, Pub Date: 2024-04-22, DOI: 10.1007/s10994-024-06524-z
Kamil Faber, Dominik Zurek, Marcin Pietron, Nathalie Japkowicz, Antonio Vergari, Roberto Corizzo

Continual learning (CL) is one of the most promising trends in recent machine learning research. Its goal is to go beyond classical assumptions in machine learning and develop models and learning strategies that present high robustness in dynamic environments. This goal is realized by designing strategies that simultaneously foster the incorporation of new knowledge while avoiding forgetting past knowledge. The landscape of CL research is fragmented into several learning evaluation protocols, comprising different learning tasks, datasets, and evaluation metrics. Additionally, the benchmarks adopted so far are still distant from the complexity of real-world scenarios, and are usually tailored to highlight capabilities specific to certain strategies. In such a landscape, it is hard to clearly and objectively assess models and strategies. In this work, we fill this gap for CL on image data by introducing two novel CL benchmarks that involve multiple heterogeneous tasks from six image datasets, with varying levels of complexity and quality. Our aim is to fairly evaluate current state-of-the-art CL strategies on a common ground that is closer to complex real-world scenarios. We additionally structure our benchmarks so that tasks are presented in increasing and decreasing order of complexity—according to a curriculum—in order to evaluate if current CL models are able to exploit structure across tasks. We devote particular emphasis to providing the CL community with a rigorous and reproducible evaluation protocol for measuring the ability of a model to generalize and not to forget while learning. Furthermore, we provide an extensive experimental evaluation showing that popular CL strategies, when challenged with our proposed benchmarks, yield sub-par performance, high levels of forgetting, and present a limited ability to effectively leverage curriculum task ordering. 
We believe that these results highlight the need for rigorous comparisons in future CL works as well as pave the way to design new CL strategies that are able to deal with more complex scenarios.

Citations: 0
A survey on interpretable reinforcement learning
IF 7.5, Zone 3 (Computer Science), Q1 Computer Science, Pub Date: 2024-04-19, DOI: 10.1007/s10994-024-06543-w
Claire Glanois, Paul Weng, Matthieu Zimmer, Dong Li, Tianpei Yang, Jianye Hao, Wulong Liu

Although deep reinforcement learning has become a promising machine learning approach for sequential decision-making problems, it is still not mature enough for high-stake domains such as autonomous driving or medical applications. In such contexts, a learned policy needs for instance to be interpretable, so that it can be inspected before any deployment (e.g., for safety and verifiability reasons). This survey provides an overview of various approaches to achieve higher interpretability in reinforcement learning (RL). To that aim, we distinguish interpretability (as an intrinsic property of a model) and explainability (as a post-hoc operation) and discuss them in the context of RL with an emphasis on the former notion. In particular, we argue that interpretable RL may embrace different facets: interpretable inputs, interpretable (transition/reward) models, and interpretable decision-making. Based on this scheme, we summarize and analyze recent work related to interpretable RL with an emphasis on papers published in the past 10 years. We also discuss briefly some related research areas and point to some potential promising research directions, notably related to the recent development of foundation models (e.g., large language models, RL from human feedback).

Citations: 0
PolieDRO: a novel classification and regression framework with non-parametric data-driven regularization
IF 7.5, Zone 3 (Computer Science), Q1 Computer Science, Pub Date: 2024-04-15, DOI: 10.1007/s10994-024-06544-9
Tomás Gutierrez, Davi Valladão, Bernardo K. Pagnoncelli

PolieDRO is a novel analytics framework for classification and regression that harnesses the power and flexibility of data-driven distributionally robust optimization (DRO) to circumvent the need for regularization hyperparameters. Recent literature shows that traditional machine learning methods such as SVM and (square-root) LASSO can be written as Wasserstein-based DRO problems. Inspired by those results we propose a hyperparameter-free ambiguity set that explores the polyhedral structure of data-driven convex hulls, generating computationally tractable regression and classification methods for any convex loss function. Numerical results based on 100 real-world databases and an extensive experiment with synthetically generated data show that our methods consistently outperform their traditional counterparts.

Citations: 0
Did we personalize? Assessing personalization by an online reinforcement learning algorithm using resampling
IF 7.5, Zone 3 (Computer Science), Q1 Computer Science, Pub Date: 2024-04-10, DOI: 10.1007/s10994-024-06526-x
Susobhan Ghosh, Raphael Kim, Prasidh Chhabria, Raaz Dwivedi, Predrag Klasnja, Peng Liao, Kelly Zhang, Susan Murphy

There is a growing interest in using reinforcement learning (RL) to personalize sequences of treatments in digital health to support users in adopting healthier behaviors. Such sequential decision-making problems involve decisions about when to treat and how to treat based on the user’s context (e.g., prior activity level, location, etc.). Online RL is a promising data-driven approach for this problem as it learns based on each user’s historical responses and uses that knowledge to personalize these decisions. However, to decide whether the RL algorithm should be included in an “optimized” intervention for real-world deployment, we must assess the data evidence indicating that the RL algorithm is actually personalizing the treatments to its users. Due to the stochasticity in the RL algorithm, one may get a false impression that it is learning in certain states and using this learning to provide specific treatments. We use a working definition of personalization and introduce a resampling-based methodology for investigating whether the personalization exhibited by the RL algorithm is an artifact of the RL algorithm stochasticity. We illustrate our methodology with a case study by analyzing the data from a physical activity clinical trial called HeartSteps, which included the use of an online RL algorithm. We demonstrate how our approach enhances data-driven truth-in-advertising of algorithm personalization both across all users as well as within specific users in the study.

Citations: 0
Exposing and explaining fake news on-the-fly
IF 7.5, Zone 3 (Computer Science), Q1 Computer Science, Pub Date: 2024-04-10, DOI: 10.1007/s10994-024-06527-w
Francisco de Arriba-Pérez, Silvia García-Méndez, Fátima Leal, Benedita Malheiro, Juan Carlos Burguillo

Social media platforms enable the rapid dissemination and consumption of information. However, users instantly consume such content regardless of the reliability of the shared data. Consequently, this crowdsourcing model is exposed to manipulation. This work contributes an explainable, online classification method to recognize fake news in real time. The proposed method combines both unsupervised and supervised Machine Learning approaches with online created lexica. The profiling is built from creator-, content- and context-based features using Natural Language Processing techniques. The explainable classification mechanism displays in a dashboard the features selected for classification and the prediction confidence. The performance of the proposed solution has been validated with real data sets from Twitter, and the results attain 80% accuracy and macro F-measure. This proposal is the first to jointly provide data stream processing, profiling, classification and explainability. Ultimately, the proposed early detection, isolation and explanation of fake news contribute to increasing the quality and trustworthiness of social media content.
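
The two reported metrics can be computed as in the sketch below. It assumes the macro F-measure is the unweighted mean of per-class F1 scores (the abstract does not state the value of beta), and the labels in the example are made up for illustration; they are not data from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 if the class never occurs.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    truth = ["fake", "fake", "real", "real"]
    predicted = ["fake", "real", "real", "real"]
    print(accuracy(truth, predicted))  # 0.75
    print(macro_f1(truth, predicted))  # mean of 2/3 ("fake") and 4/5 ("real")
```

Macro averaging gives each class equal weight regardless of how often it occurs, which matters when fake and genuine items are imbalanced in a stream.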

Citations: 0
Utilizing reinforcement learning for de novo drug design
IF 7.5, Zone 3 (Computer Science), Q1 Computer Science, Pub Date: 2024-04-08, DOI: 10.1007/s10994-024-06519-w
Hampus Gummesson Svensson, Christian Tyrchan, Ola Engkvist, Morteza Haghir Chehreghani

Deep learning-based approaches for generating novel drug molecules with specific properties have gained a lot of interest in the last few years. Recent studies have demonstrated promising performance for string-based generation of novel molecules utilizing reinforcement learning. In this paper, we develop a unified framework for using reinforcement learning for de novo drug design, wherein we systematically study various on- and off-policy reinforcement learning algorithms and replay buffers to learn an RNN-based policy to generate novel molecules predicted to be active against the dopamine receptor DRD2. Our findings suggest that it is advantageous to use at least both top-scoring and low-scoring molecules for updating the policy when structural diversity is essential. Using all generated molecules at an iteration seems to enhance performance stability for on-policy algorithms. In addition, when replaying high, intermediate, and low-scoring molecules, off-policy algorithms display the potential of improving the structural diversity and number of active molecules generated, but possibly at the cost of a longer exploration phase. Our work provides an open-source framework enabling researchers to investigate various reinforcement learning methods for de novo drug design.
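
One way to picture the finding about replaying both top-scoring and low-scoring molecules is the selection step sketched below. This is an illustrative assumption, not the paper's framework: `select_for_replay`, the SMILES strings, and the scores are all hypothetical, and the actual replay buffers and policy updates studied by the authors are not described in the abstract.

```python
def select_for_replay(scored, k):
    """Pick the k highest- and k lowest-scoring items from (item, score) pairs.

    Returns the items ordered from highest to lowest score. If fewer than
    2 * k + 1 items exist, everything is returned once (no duplicates).
    """
    if k <= 0:
        return []
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    if len(ranked) <= 2 * k:
        return ranked
    return ranked[:k] + ranked[-k:]


if __name__ == "__main__":
    # Hypothetical (SMILES string, predicted activity score) pairs.
    batch = [("CCO", 0.9), ("CCN", 0.1), ("CCC", 0.5), ("CCCl", 0.7), ("CCBr", 0.3)]
    for smiles, score in select_for_replay(batch, 2):
        print(smiles, score)
```

Keeping low-scoring examples alongside high-scoring ones gives the update a contrast signal, which is consistent with the abstract's remark about structural diversity, though the exact mechanism is specific to each algorithm the authors compare.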

Citations: 0
Multi-consensus decentralized primal-dual fixed point algorithm for distributed learning
IF 7.5 Zone 3 Computer Science Q1 Computer Science Pub Date: 2024-04-08 DOI: 10.1007/s10994-024-06537-8
Kejie Tang, Weidong Liu, Xiaojun Mao

Decentralized distributed learning has recently attracted significant attention in many applications in machine learning and signal processing. To solve a decentralized optimization with regularization, we propose a Multi-consensus Decentralized Primal-Dual Fixed Point (MD-PDFP) algorithm. We apply multiple consensus steps with the gradient tracking technique to extend the primal-dual fixed point method over a network. The communication complexities of our procedure are given under certain conditions. Moreover, we show that our algorithm is consistent under general conditions and enjoys global linear convergence under strong convexity. With some particular choices of regularizations, our algorithm can be applied to decentralized machine learning applications. Finally, several numerical experiments and real data analyses are conducted to demonstrate the effectiveness of the proposed algorithm.
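
The "multiple consensus steps" idea mentioned in the abstract can be sketched as repeated neighbour averaging over a network. The example below is an assumption-laden illustration only: it shows plain consensus averaging with a hypothetical doubly stochastic mixing matrix `W` on a 4-node ring, and it does not implement MD-PDFP itself (there is no gradient tracking or primal-dual update here).

```python
from typing import List

# Hypothetical helper: apply `rounds` consensus (mixing) steps x <- W x.
def consensus_rounds(W: List[List[float]], x: List[float], rounds: int) -> List[float]:
    n = len(x)
    for _ in range(rounds):
        # Each node replaces its value with a weighted average of its
        # own value and its neighbours' values.
        x = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x


# Assumed mixing matrix for a 4-node ring: weight 0.5 on self, 0.25 on each
# neighbour. Rows and columns both sum to 1, so the network average is preserved.
W = [
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
]
x0 = [1.0, 2.0, 3.0, 4.0]  # illustrative local values held by the four nodes

after_one = consensus_rounds(W, x0, rounds=1)
after_five = consensus_rounds(W, x0, rounds=5)

mean = sum(x0) / len(x0)
spread_one = max(abs(v - mean) for v in after_one)
spread_five = max(abs(v - mean) for v in after_five)
```

Running more consensus rounds per iteration brings the nodes' values closer to the network average, at the price of extra communication, which is the trade-off the abstract's communication-complexity results concern.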

Citations: 0
Learning explanatory logical rules in non-linear domains: a neuro-symbolic approach
IF 7.5 Zone 3 Computer Science Q1 Computer Science Pub Date: 2024-04-08 DOI: 10.1007/s10994-024-06538-7
Andreas Bueff, Vaishak Belle

Deep neural networks, despite their capabilities, are constrained by the need for large-scale training data, and often fall short in generalisation and interpretability. Inductive logic programming (ILP) presents an intriguing solution with its data-efficient learning of first-order logic rules. However, ILP grapples with challenges, notably the handling of non-linearity in continuous domains. With the ascent of neuro-symbolic ILP, there’s a drive to mitigate these challenges, synergising deep learning with relational ILP models to enhance interpretability and create logical decision boundaries. In this research, we introduce a neuro-symbolic ILP framework, grounded on differentiable Neural Logic networks, tailored for non-linear rule extraction in mixed discrete-continuous spaces. Our methodology consists of a neuro-symbolic approach, emphasising the extraction of non-linear functions from mixed domain data. Our preliminary findings showcase our architecture’s capability to identify non-linear functions from continuous data, offering a new perspective in neural-symbolic research and underlining the adaptability of ILP-based frameworks for regression challenges in continuous scenarios.

Citations: 0