
Latest publications in Machine Learning

Rule learning by modularity
IF 7.5 CAS Region 3 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-07-03 DOI: 10.1007/s10994-024-06556-5
Albert Nössig, Tobias Hell, Georg Moser

In this paper, we present a modular methodology that combines state-of-the-art methods in (stochastic) machine learning with well-established methods in inductive logic programming (ILP) and rule induction to provide efficient and scalable algorithms for the classification of vast data sets. By construction, these classifications are based on the synthesis of simple rules, thus providing direct explanations of the obtained classifications. Apart from evaluating our approach on the common large-scale data sets MNIST, Fashion-MNIST and IMDB, we present novel results on explainable classifications of dental bills. The latter case study stems from an industrial collaboration with Allianz Private Krankenversicherung, an insurance company offering diverse services in Germany.
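The rule-induction half of this modular pipeline lends itself to a compact illustration. Below is a minimal, hypothetical sketch: a one-feature threshold search standing in for the ILP and rule-induction methods the paper actually combines with stochastic learners (the toy data and the `induce_rule` helper are invented for illustration):

```python
def induce_rule(examples, labels):
    """Find the single (feature, threshold) rule x[f] >= t with best accuracy."""
    n_features = len(examples[0])
    best = (0, 0.0, 0.0)  # (feature index, threshold, accuracy)
    for f in range(n_features):
        for threshold in sorted({x[f] for x in examples}):
            preds = [1 if x[f] >= threshold else 0 for x in examples]
            acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
            if acc > best[2]:
                best = (f, threshold, acc)
    return best

# Toy data: the label is 1 iff the second feature is at least 5.
X = [(1.0, 2.0), (2.0, 7.0), (3.0, 5.0), (4.0, 1.0)]
y = [0, 1, 1, 0]
feature, threshold, accuracy = induce_rule(X, y)
print(feature, threshold, accuracy)  # → 1 5.0 1.0
```

The returned rule ("feature 1 ≥ 5.0") directly explains every prediction, which is the kind of transparency the abstract claims for the synthesized rules.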

Citations: 0
PROUD: PaRetO-gUided diffusion model for multi-objective generation
IF 7.5 CAS Region 3 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-07-02 DOI: 10.1007/s10994-024-06575-2
Yinghua Yao, Yuangang Pan, Jing Li, Ivor Tsang, Xin Yao

Recent advancements in the realm of deep generative models focus on generating samples that satisfy multiple desired properties. However, prevalent approaches optimize these property functions independently, thus omitting the trade-offs among them. In addition, the property optimization is often improperly integrated into the generative models, resulting in an unnecessary compromise on generation quality (i.e., the quality of generated samples). To address these issues, we formulate a constrained optimization problem. It seeks to optimize generation quality while ensuring that generated samples reside at the Pareto front of multiple property objectives. Such a formulation enables the generation of samples that cannot be further improved simultaneously on the conflicting property functions and preserves good quality of generated samples. Building upon this formulation, we introduce the ParetO-gUided Diffusion model (PROUD), wherein the gradients in the denoising process are dynamically adjusted to enhance generation quality while the generated samples adhere to Pareto optimality. Experimental evaluations on image generation and protein generation tasks demonstrate that our PROUD consistently maintains superior generation quality while approaching Pareto optimality across multiple property functions compared to various baselines.
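For intuition about the Pareto front this formulation targets, the non-dominated set of a toy batch of property scores can be computed directly. This is a generic non-dominated filter (maximization on every objective), not the PROUD training procedure itself:

```python
def pareto_front(points):
    """Return indices of non-dominated points, maximizing every objective."""
    front = []
    for i, p in enumerate(points):
        dominated = any(
            all(qk >= pk for qk, pk in zip(q, p))
            and any(qk > pk for qk, pk in zip(q, p))
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front

# Toy property scores for four generated samples, two conflicting objectives.
scores = [(0.9, 0.2), (0.5, 0.5), (0.2, 0.9), (0.4, 0.4)]
print(pareto_front(scores))  # → [0, 1, 2]
```

Sample 3 is dominated by sample 1 on both objectives; the remaining three cannot be improved on one property without worsening another, which is exactly the trade-off the abstract describes.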

Citations: 0
Secure and fast asynchronous Vertical Federated Learning via cascaded hybrid optimization
IF 7.5 CAS Region 3 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-06-27 DOI: 10.1007/s10994-024-06541-y
Ganyu Wang, Qingsong Zhang, Xiang Li, Boyu Wang, Bin Gu, Charles X. Ling

Vertical Federated Learning (VFL) is gaining increasing attention due to its ability to enable multiple parties to collaboratively train a privacy-preserving model using vertically partitioned data. Recent research has highlighted the advantages of using zeroth-order optimization (ZOO) in developing practical VFL algorithms. However, a significant drawback of ZOO-based VFL is its slow convergence rate, which limits its applicability in handling large modern models. To address this issue, we propose a cascaded hybrid optimization method for VFL. In this method, the downstream models (clients) are trained using ZOO to ensure privacy and prevent the sharing of internal information. Simultaneously, the upstream model (server) is updated locally using first-order optimization, which significantly improves the convergence rate. This approach allows for the training of large models without compromising privacy and security. We theoretically prove that our VFL method achieves faster convergence compared to ZOO-based VFL because the convergence rate of our framework is not limited by the size of the server model, making it effective for training large models. Extensive experiments demonstrate that our method achieves faster convergence than ZOO-based VFL while maintaining an equivalent level of privacy protection. Additionally, we demonstrate the feasibility of training large models using our method.
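The zeroth-order side of the scheme can be sketched generically. Below is a standard two-point ZOO gradient estimator with Gaussian directions, a stand-in for the client-side updates; the function, step size, and sample counts are illustrative, not the paper's implementation:

```python
import random

def zoo_grad(f, x, mu=1e-5, n_dirs=2000, seed=0):
    """Two-point zeroth-order (ZOO) gradient estimate of f at x:
    average the directional slope along random Gaussian directions."""
    rng = random.Random(seed)
    d = len(x)
    grad = [0.0] * d
    for _ in range(n_dirs):
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        x_plus = [xi + mu * ui for xi, ui in zip(x, u)]
        x_minus = [xi - mu * ui for xi, ui in zip(x, u)]
        slope = (f(x_plus) - f(x_minus)) / (2.0 * mu)  # ≈ ∇f(x)·u
        for k in range(d):
            grad[k] += slope * u[k] / n_dirs
    return grad

# For f(v) = Σ v_i², the true gradient at (1, -2) is (2, -4).
estimate = zoo_grad(lambda v: sum(vi * vi for vi in v), [1.0, -2.0])
print(estimate)
```

Only function evaluations of `f` are needed, which is why ZOO lets clients avoid exposing internal gradients; the many evaluations per step also illustrate the slow convergence the paper's server-side first-order updates are meant to offset.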

Citations: 0
Evidential uncertainty sampling strategies for active learning
IF 7.5 CAS Region 3 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-06-27 DOI: 10.1007/s10994-024-06567-2
Arthur Hoarau, Vincent Lemaire, Yolande Le Gall, Jean-Christophe Dubois, Arnaud Martin

Recent studies in active learning, particularly in uncertainty sampling, have focused on the decomposition of model uncertainty into reducible and irreducible uncertainties. In this paper, the aim is to simplify the computational process while eliminating the dependence on observations. Crucially, the inherent uncertainty in the labels, i.e. the uncertainty of the oracles, is taken into account. Two strategies are proposed: sampling by Klir uncertainty, which tackles the exploration–exploitation dilemma, and sampling by evidential epistemic uncertainty, which extends the concept of reducible uncertainty within the evidential framework. Both rely on the theory of belief functions. Experimental results in active learning demonstrate that our proposed method can outperform uncertainty sampling.
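As background for the reducible/irreducible split the abstract mentions, a common (non-evidential) baseline decomposes an ensemble's predictive entropy into an aleatoric part plus an epistemic part. The sketch below illustrates that baseline view only, not the belief-function strategies proposed in the paper:

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def uncertainty_decomposition(member_preds):
    """Split an ensemble's predictive uncertainty into an aleatoric
    (irreducible) part and an epistemic (reducible) part."""
    n = len(member_preds)
    mean = [sum(col) / n for col in zip(*member_preds)]
    total = entropy(mean)                                  # predictive entropy
    aleatoric = sum(entropy(p) for p in member_preds) / n  # expected entropy
    epistemic = total - aleatoric                          # mutual information
    return total, aleatoric, epistemic

# Two members that disagree confidently: uncertainty is mostly epistemic,
# so an uncertainty-based active learner should query this example.
total, aleatoric, epistemic = uncertainty_decomposition([[0.99, 0.01], [0.01, 0.99]])
print(round(total, 3), round(aleatoric, 3), round(epistemic, 3))
```

Sampling by epistemic uncertainty prioritizes examples whose uncertainty the model could actually reduce with more labels, which is the intuition the evidential strategies build on.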

Citations: 0
Sample complexity of variance-reduced policy gradient: weaker assumptions and lower bounds
IF 7.5 CAS Region 3 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-06-27 DOI: 10.1007/s10994-024-06573-4
Gabor Paczolay, Matteo Papini, Alberto Maria Metelli, Istvan Harmati, Marcello Restelli

Several variance-reduced versions of REINFORCE based on importance sampling achieve an improved O(ε⁻³) sample complexity to find an ε-stationary point, under an unrealistic assumption on the variance of the importance weights. In this paper, we propose the Defensive Policy Gradient (DEF-PG) algorithm, based on defensive importance sampling, achieving the same result without any assumption on the variance of the importance weights. We also show that this is not improvable by establishing a matching Ω(ε⁻³) lower bound, and that REINFORCE with its O(ε⁻⁴) sample complexity is actually optimal under weaker assumptions on the policy class. Numerical simulations show promising results for the proposed technique compared to similar algorithms based on vanilla importance sampling.
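The defensive importance sampling idea can be made concrete: drawing from a mixture that always puts mass α on the target density caps the importance weight at 1/α, which is what removes the variance assumption. A minimal sketch of the weight alone (the density values and α are illustrative, not the DEF-PG algorithm):

```python
def defensive_weight(p_target, p_proposal, alpha=0.1):
    """Importance weight under a defensive mixture proposal
    m(x) = alpha * p(x) + (1 - alpha) * q(x).
    The weight p(x) / m(x) is bounded above by 1 / alpha, so its
    variance is finite with no assumption on the proposal q."""
    return p_target / (alpha * p_target + (1 - alpha) * p_proposal)

# Even where the proposal q has almost no mass, the weight stays below 1/alpha = 10.
print(defensive_weight(p_target=0.5, p_proposal=1e-9))
# Where p and q agree, the weight is 1, as with plain importance sampling.
print(defensive_weight(p_target=0.3, p_proposal=0.3))
```

Plain importance sampling would give a weight of 0.5 / 1e-9 = 5 × 10⁸ in the first case, which is the kind of unbounded variance the "unrealistic assumption" in prior work rules out by fiat.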

Citations: 0
Quantitative Gaussian approximation of randomly initialized deep neural networks
IF 7.5 CAS Region 3 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-06-25 DOI: 10.1007/s10994-024-06578-z
Andrea Basteri, Dario Trevisan

Given any deep fully connected neural network, initialized with random Gaussian parameters, we bound from above the quadratic Wasserstein distance between its output distribution and a suitable Gaussian process. Our explicit inequalities indicate how the hidden and output layer sizes affect the Gaussian behaviour of the network and quantitatively recover the distributional convergence results in the wide limit, i.e., when all the hidden layer sizes become large.
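For context, the quadratic Wasserstein distance used in the bound has a well-known closed form between two Gaussian measures (a standard fact, not a result of the paper):

```latex
W_2^2\bigl(\mathcal{N}(m_1,\Sigma_1),\,\mathcal{N}(m_2,\Sigma_2)\bigr)
  = \lVert m_1 - m_2 \rVert^2
  + \operatorname{tr}\!\Bigl(\Sigma_1 + \Sigma_2
  - 2\bigl(\Sigma_2^{1/2}\,\Sigma_1\,\Sigma_2^{1/2}\bigr)^{1/2}\Bigr)
```

The distance therefore vanishes exactly when the means and covariances coincide, which is why it is a natural way to quantify how close a finite-width network's output law is to its Gaussian-process limit.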

Citations: 0
Discrete-time graph neural networks for transaction prediction in Web3 social platforms
IF 7.5 CAS Region 3 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-06-25 DOI: 10.1007/s10994-024-06579-y
Manuel Dileo, Matteo Zignani

In Web3 social platforms, i.e. social web applications that rely on blockchain technology to support their functionalities, interactions among users are usually multimodal, from common social interactions such as following, liking, or posting, to specific relations given by crypto-token transfers facilitated by the blockchain. In this dynamic and intertwined networked context, modeled as a financial network, our main goals are (i) to predict whether a pair of users will be involved in a financial transaction, i.e. the transaction prediction task, even using textual information produced by users, and (ii) to verify whether performance may be enhanced by textual content. To address the above issues, we compared current snapshot-based temporal graph learning methods and developed T3GNN, a solution based on state-of-the-art temporal graph neural networks’ design, which integrates fine-tuned sentence embeddings, a simple yet effective graph-augmentation strategy for representing content, and historical negative sampling. We evaluated models in a Web3 context by leveraging a novel high-resolution temporal dataset, collected from one of the most used Web3 social platforms, which spans more than one year of financial interactions as well as published textual content. The experimental evaluation has shown that T3GNN consistently achieved the best performance over time and for most of the snapshots. Furthermore, through an extensive analysis of the performance of our model, we show that, despite the graph structure being crucial for making predictions, textual content contains useful information for forecasting transactions, highlighting an interplay between users’ interests and economic relationships in Web3 platforms. Finally, the evaluation has also highlighted the importance of adopting sampling methods alternative to random negative sampling when dealing with prediction tasks on temporal networks.
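The discrete-time ("snapshot-based") representation this comparison relies on amounts to bucketing timestamped edges into fixed windows. A minimal sketch, where the window size and the hypothetical transfer events are invented for illustration:

```python
def build_snapshots(edges, window):
    """Bucket timestamped edges (u, v, t) into discrete-time snapshots
    of fixed width `window`, returned in chronological order."""
    buckets = {}
    for u, v, t in edges:
        buckets.setdefault(t // window, []).append((u, v))
    return [buckets[k] for k in sorted(buckets)]

# Hypothetical transfer events: (sender, receiver, timestamp).
events = [("a", "b", 3), ("b", "c", 11), ("a", "c", 14), ("c", "d", 27)]
print(build_snapshots(events, window=10))
# → [[('a', 'b')], [('b', 'c'), ('a', 'c')], [('c', 'd')]]
```

A discrete-time GNN then encodes each snapshot graph separately and models the sequence of embeddings, in contrast to continuous-time methods that process each event individually.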

Citations: 0
Kalt: generating adversarial explainable Chinese legal texts
IF 7.5 CAS Region 3 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-06-21 DOI: 10.1007/s10994-024-06572-5
Yunting Zhang, Shang Li, Lin Ye, Hongli Zhang, Zhe Chen, Binxing Fang

Deep neural networks (DNNs) are vulnerable to adversarial examples (AEs), which are well-designed input samples with imperceptible perturbations. Existing methods generate AEs to evaluate the robustness of DNN-based natural language processing models. However, the AE attack performance significantly degrades in some verticals, such as law, due to overlooking essential domain knowledge. To generate explainable Chinese legal adversarial texts, we introduce legal knowledge and propose a novel black-box approach, knowledge-aware law tricker (KALT), in the framework of adversarial text generation based on word importance. Firstly, we invent a legal knowledge extraction method based on KeyBERT. The knowledge contains unique features from each category and shared features among different categories. Additionally, we design two perturbation strategies, Strengthen Similar Label and Weaken Original Label, to selectively perturb the two types of features, which can significantly reduce the classification accuracy of the target model. These two perturbation strategies can be regarded as components, which can be conveniently integrated into any perturbation method to enhance attack performance. Furthermore, we propose a strong hybrid perturbation method to introduce perturbation into the original texts. The perturbation method combines seven representative perturbation methods for Chinese. Finally, we design a formula to calculate interpretability scores, quantifying the interpretability of adversarial text generation methods. Experimental results demonstrate that KALT can effectively generate explainable Chinese legal adversarial texts that can be misclassified with high confidence and achieve excellent attack performance against the powerful Chinese BERT.
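Word-importance-driven adversarial text generation, the framework KALT builds on, typically scores each word by the confidence drop when it is deleted and perturbs the highest-scoring words first. A minimal, hypothetical sketch (the scorer is a toy stand-in for a real classifier):

```python
def word_importance(words, score):
    """Rank words by the drop in a classifier's confidence when each
    word is deleted (largest drop = most important to perturb)."""
    base = score(words)
    return sorted(
        ((base - score(words[:i] + words[i + 1:]), w) for i, w in enumerate(words)),
        reverse=True,
    )

# Toy scorer: the classifier's confidence hinges on the word "breach".
score = lambda ws: 0.9 if "breach" in ws else 0.3
ranked = word_importance(["the", "contract", "breach", "claim"], score)
print(ranked[0][1])  # → breach
```

An attack then spends its perturbation budget on the top-ranked words; KALT's contribution is steering those perturbations with extracted legal-domain features rather than generic substitutions.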

Citations: 0
Improving interpretability via regularization of neural activation sensitivity
IF 7.5 CAS Region 3 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-06-19 DOI: 10.1007/s10994-024-06549-4
Ofir Moshe, Gil Fidel, Ron Bitton, Asaf Shabtai

State-of-the-art deep neural networks (DNNs) are highly effective at tackling many real-world tasks. However, their widespread adoption in mission-critical contexts is limited by two major weaknesses: their susceptibility to adversarial attacks and their opaqueness. The former raises concerns about DNNs’ security and generalization in real-world conditions, while the latter directly impacts interpretability. The lack of interpretability diminishes user trust, as it is challenging to have confidence in a model’s decision when its reasoning is not aligned with human perspectives. In this research, we (1) examine the effect of adversarial robustness on interpretability, and (2) present a novel approach for improving DNNs’ interpretability based on the regularization of neural activation sensitivity. We compare the interpretability of models trained using our method to that of standard models and models trained using state-of-the-art adversarial robustness techniques. Our results show that adversarially robust models are superior to standard models, and that models trained using our proposed method are even better than adversarially robust models in terms of interpretability. (Code is provided in the supplementary material.)
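Sensitivity regularization of this kind generally means penalizing how strongly a network's activations react to small input changes. The sketch below estimates a squared input-gradient norm by finite differences and would be added to the training loss as a penalty term; it is a generic illustration under that assumption, not the paper's regularizer:

```python
def sensitivity_penalty(f, x, eps=1e-4):
    """Finite-difference estimate of ||∂f/∂x||², usable as a
    regularization term: loss = task_loss + lam * sensitivity_penalty(...)."""
    grad_sq = 0.0
    for i in range(len(x)):
        x_plus = list(x); x_plus[i] += eps
        x_minus = list(x); x_minus[i] -= eps
        grad_sq += ((f(x_plus) - f(x_minus)) / (2 * eps)) ** 2
    return grad_sq

# For the linear map f(x) = 3*x0 + 0.5*x1 the true squared gradient norm
# is 3² + 0.5² = 9.25 everywhere.
f = lambda v: 3 * v[0] + 0.5 * v[1]
print(sensitivity_penalty(f, [1.0, 2.0]))
```

In practice the gradient would come from automatic differentiation rather than finite differences; driving this penalty down makes activations respond more smoothly to inputs, which is the mechanism the paper links to both robustness and interpretability.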

Citations: 0
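The abstract above regularizes the sensitivity of neural activations to input perturbations. As a minimal illustration of the quantity being penalized (not the paper's actual implementation; the single ReLU layer, weights, and loss weighting below are assumptions for the sketch), the penalty for one layer can be computed as the squared norm of the Jacobian of the activations with respect to the input:

```python
import numpy as np

def activation_sensitivity_penalty(W, x):
    """Squared Frobenius norm of the Jacobian of a = relu(W @ x) w.r.t. x.

    For a ReLU layer the Jacobian is diag(1[W @ x > 0]) @ W, i.e. the rows
    of W belonging to active units; penalizing its norm discourages
    activations that react sharply to small input perturbations.
    """
    pre = W @ x
    gate = (pre > 0).astype(float)   # 1 for active ReLU units, 0 otherwise
    jac = gate[:, None] * W          # jac[i, j] = d a_i / d x_j
    return float(np.sum(jac ** 2))

# hypothetical toy usage: total = task_loss + lam * sensitivity penalty
W = np.array([[1.0, 0.0],
              [0.0, 2.0]])
x = np.array([1.0, -1.0])
penalty = activation_sensitivity_penalty(W, x)  # only unit 0 active -> 1.0
```

In a deep network the same term would typically be obtained via automatic differentiation and added to the training loss with a regularization weight.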
REFUEL: rule extraction for imbalanced neural node classification
IF 7.5 | CAS Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-06-19 | DOI: 10.1007/s10994-024-06569-0
Marco Markwald, Elena Demidova

Imbalanced graph node classification is a highly relevant and challenging problem in many real-world applications. The inherent data scarcity, a central characteristic of this task, substantially limits the performance of neural classification models driven solely by data. Given the limited instances of relevant nodes and the complexity of graph structures, current methods fail to capture the distinct characteristics of node attributes and graph patterns within the underrepresented classes. In this article, we propose REFUEL, a novel approach for highly imbalanced node classification problems in graphs. Whereas symbolic and neural methods have complementary strengths and weaknesses when applied to such problems, REFUEL combines the power of symbolic and neural learning in a novel neural rule-extraction architecture. REFUEL captures the class semantics in automatically extracted rule vectors. REFUEL then augments the graph nodes with the extracted rule vectors and adopts a Graph Attention Network-based neural node embedding, enhancing the downstream neural node representation. Our evaluation confirms the effectiveness of the proposed REFUEL approach on three real-world datasets with different minority class sizes. REFUEL achieves at least a 4 percentage point improvement in precision on minority classes comprising 1.5–2% of the data, compared to the baselines.

Citations: 0
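The abstract above augments graph nodes with vectors of automatically extracted rules before feeding them to a Graph Attention Network. As a hedged sketch of that augmentation step only (the two hand-written rules below are hypothetical stand-ins for REFUEL's learned rules, and the GNN stage is omitted), a node's feature vector can be extended with a binary rule-activation vector:

```python
import numpy as np

# Hypothetical stand-ins for automatically extracted rules: each rule is
# a predicate over a node's attribute vector.
RULES = [
    lambda attrs: attrs[0] > 0.5,             # e.g. "first attribute is high"
    lambda attrs: attrs[1] + attrs[2] < 0.0,  # e.g. "combined signal is negative"
]

def augment_with_rule_vector(attrs, rules):
    """Concatenate a binary rule-activation vector onto the node features,
    mirroring the idea of feeding extracted rule vectors to the GNN."""
    rule_vec = np.array([1.0 if rule(attrs) else 0.0 for rule in rules])
    return np.concatenate([attrs, rule_vec])

x = np.array([0.9, -0.3, -0.2])
x_aug = augment_with_rule_vector(x, RULES)  # -> [0.9, -0.3, -0.2, 1.0, 1.0]
```

The augmented vectors would then serve as input features for the attention-based node embedding, letting the downstream network exploit the class semantics captured by the rules.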