Anna Emilie J. Wedenborg, Michael Alexander Harborg, Andreas Bigom, Oliver Elmgreen, Marcus Presutti, Andreas Råskov, Fumiko Kano Glückstad, Mikkel Schmidt, Morten Mørup
This paper introduces a novel framework for Archetypal Analysis (AA) tailored to ordinal data, particularly from questionnaires. Unlike existing methods, the proposed method, Ordinal Archetypal Analysis (OAA), bypasses the two-step process of transforming ordinal data into continuous scales and operates directly on the ordinal data. We extend traditional AA methods to handle the subjective nature of questionnaire-based data, acknowledging individual differences in scale perception. We introduce the Response Bias Ordinal Archetypal Analysis (RBOAA), which learns individualized scales for each subject during optimization. The effectiveness of these methods is demonstrated on synthetic data and the European Social Survey dataset, highlighting their potential to provide deeper insights into human behavior and perception. The study underscores the importance of considering response bias in cross-national research and offers a principled approach to analyzing ordinal data through Archetypal Analysis.
{"title":"Modeling Human Responses by Ordinal Archetypal Analysis","authors":"Anna Emilie J. Wedenborg, Michael Alexander Harborg, Andreas Bigom, Oliver Elmgreen, Marcus Presutti, Andreas Råskov, Fumiko Kano Glückstad, Mikkel Schmidt, Morten Mørup","doi":"arxiv-2409.07934","DOIUrl":"https://doi.org/arxiv-2409.07934","url":null,"abstract":"This paper introduces a novel framework for Archetypal Analysis (AA) tailored\u0000to ordinal data, particularly from questionnaires. Unlike existing methods, the\u0000proposed method, Ordinal Archetypal Analysis (OAA), bypasses the two-step\u0000process of transforming ordinal data into continuous scales and operates\u0000directly on the ordinal data. We extend traditional AA methods to handle the\u0000subjective nature of questionnaire-based data, acknowledging individual\u0000differences in scale perception. We introduce the Response Bias Ordinal\u0000Archetypal Analysis (RBOAA), which learns individualized scales for each\u0000subject during optimization. The effectiveness of these methods is demonstrated\u0000on synthetic data and the European Social Survey dataset, highlighting their\u0000potential to provide deeper insights into human behavior and perception. The\u0000study underscores the importance of considering response bias in cross-national\u0000research and offers a principled approach to analyzing ordinal data through\u0000Archetypal Analysis.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142180612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Song Hao, Wentao Fu, Xuanze Chen, Chengxiang Jin, Jiajun Zhou, Shanqing Yu, Qi Xuan
Traditional anomalous traffic detection methods rely on single-view analysis, which has clear limitations when dealing with complex attacks and encrypted communications. To address this, we propose a Multi-view Feature Fusion (MuFF) method for network anomaly traffic detection. MuFF models the temporal and interactive relationships of packets in network traffic from temporal and interactive viewpoints respectively, learning a set of features for each view; these features are then fused for anomaly detection. Extensive experiments on six real traffic datasets show that MuFF performs excellently in network anomalous traffic detection, compensating for the shortcomings of detection from a single perspective.
{"title":"Network Anomaly Traffic Detection via Multi-view Feature Fusion","authors":"Song Hao, Wentao Fu, Xuanze Chen, Chengxiang Jin, Jiajun Zhou, Shanqing Yu, Qi Xuan","doi":"arxiv-2409.08020","DOIUrl":"https://doi.org/arxiv-2409.08020","url":null,"abstract":"Traditional anomalous traffic detection methods are based on single-view\u0000analysis, which has obvious limitations in dealing with complex attacks and\u0000encrypted communications. In this regard, we propose a Multi-view Feature\u0000Fusion (MuFF) method for network anomaly traffic detection. MuFF models the\u0000temporal and interactive relationships of packets in network traffic based on\u0000the temporal and interactive viewpoints respectively. It learns temporal and\u0000interactive features. These features are then fused from different perspectives\u0000for anomaly traffic detection. Extensive experiments on six real traffic\u0000datasets show that MuFF has excellent performance in network anomalous traffic\u0000detection, which makes up for the shortcomings of detection under a single\u0000perspective.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142180606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lorenzo Loconte, Antonio Mari, Gennaro Gala, Robert Peharz, Cassio de Campos, Erik Quaeghebeur, Gennaro Vessio, Antonio Vergari
This paper establishes a rigorous connection between circuit representations and tensor factorizations, two seemingly distinct yet fundamentally related areas. By connecting these fields, we highlight a series of opportunities that can benefit both communities. Our work generalizes popular tensor factorizations within the circuit language, and unifies various circuit learning algorithms under a single, generalized hierarchical factorization framework. Specifically, we introduce a modular "Lego block" approach to build tensorized circuit architectures. This, in turn, allows us to systematically construct and explore various circuit and tensor factorization models while maintaining tractability. This connection not only clarifies similarities and differences in existing models, but also enables the development of a comprehensive pipeline for building and optimizing new circuit/tensor factorization architectures. We show the effectiveness of our framework through extensive empirical evaluations, and highlight new research opportunities for tensor factorizations in probabilistic modeling.
{"title":"What is the Relationship between Tensor Factorizations and Circuits (and How Can We Exploit it)?","authors":"Lorenzo Loconte, Antonio Mari, Gennaro Gala, Robert Peharz, Cassio de Campos, Erik Quaeghebeur, Gennaro Vessio, Antonio Vergari","doi":"arxiv-2409.07953","DOIUrl":"https://doi.org/arxiv-2409.07953","url":null,"abstract":"This paper establishes a rigorous connection between circuit representations\u0000and tensor factorizations, two seemingly distinct yet fundamentally related\u0000areas. By connecting these fields, we highlight a series of opportunities that\u0000can benefit both communities. Our work generalizes popular tensor\u0000factorizations within the circuit language, and unifies various circuit\u0000learning algorithms under a single, generalized hierarchical factorization\u0000framework. Specifically, we introduce a modular \"Lego block\" approach to build\u0000tensorized circuit architectures. This, in turn, allows us to systematically\u0000construct and explore various circuit and tensor factorization models while\u0000maintaining tractability. This connection not only clarifies similarities and\u0000differences in existing models, but also enables the development of a\u0000comprehensive pipeline for building and optimizing new circuit/tensor\u0000factorization architectures. We show the effectiveness of our framework through\u0000extensive empirical evaluations, and highlight new research opportunities for\u0000tensor factorizations in probabilistic modeling.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142180609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Geigh Zollicoffer, Minh Vu, Ben Nebgen, Juan Castorena, Boian Alexandrov, Manish Bhattarai
This work presents an information-theoretic examination of diffusion-based purification methods, the state-of-the-art adversarial defenses that utilize diffusion models to remove malicious perturbations from adversarial examples. By theoretically characterizing the inherent purification errors associated with Markov-based diffusion purification, we introduce LoRID, a novel Low-Rank Iterative Diffusion purification method designed to remove adversarial perturbations with low intrinsic purification error. LoRID centers around a multi-stage purification process that leverages multiple rounds of diffusion-denoising loops at the early time-steps of the diffusion models, together with Tucker decomposition, an extension of matrix factorization, to remove adversarial noise in high-noise regimes. Consequently, LoRID increases the effective diffusion time-steps and withstands strong adversarial attacks, achieving superior robustness on the CIFAR-10/100, CelebA-HQ, and ImageNet datasets under both white-box and black-box settings.
{"title":"LoRID: Low-Rank Iterative Diffusion for Adversarial Purification","authors":"Geigh Zollicoffer, Minh Vu, Ben Nebgen, Juan Castorena, Boian Alexandrov, Manish Bhattarai","doi":"arxiv-2409.08255","DOIUrl":"https://doi.org/arxiv-2409.08255","url":null,"abstract":"This work presents an information-theoretic examination of diffusion-based\u0000purification methods, the state-of-the-art adversarial defenses that utilize\u0000diffusion models to remove malicious perturbations in adversarial examples. By\u0000theoretically characterizing the inherent purification errors associated with\u0000the Markov-based diffusion purifications, we introduce LoRID, a novel Low-Rank\u0000Iterative Diffusion purification method designed to remove adversarial\u0000perturbation with low intrinsic purification errors. LoRID centers around a\u0000multi-stage purification process that leverages multiple rounds of\u0000diffusion-denoising loops at the early time-steps of the diffusion models, and\u0000the integration of Tucker decomposition, an extension of matrix factorization,\u0000to remove adversarial noise at high-noise regimes. Consequently, LoRID\u0000increases the effective diffusion time-steps and overcomes strong adversarial\u0000attacks, achieving superior robustness performance in CIFAR-10/100, CelebA-HQ,\u0000and ImageNet datasets under both white-box and black-box settings.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142180602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhenhao Zhao, Minhong Zhu, Chen Wang, Sijia Wang, Jiqiang Zhang, Li Chen, Weiran Cai
Graph Contrastive Learning (GCL) seeks to learn nodal or graph representations that contain maximal consistent information from graph-structured data. While node-level contrasting modes dominate, some efforts have begun to explore consistency across different scales; yet these tend to lose consistent information and to be contaminated by disturbing features. Here, we introduce MUX-GCL, a novel cross-scale contrastive learning paradigm that utilizes multiplex representations as effective patches. While this learning mode minimizes contaminating noise, a commensurate contrasting strategy using positional affinities further avoids information loss by correcting false negative pairs across scales. Extensive downstream experiments demonstrate that MUX-GCL yields multiple state-of-the-art results on public datasets. Our theoretical analysis further guarantees that the new objective function is a stricter lower bound on the mutual information between raw input features and output embeddings, which rationalizes this paradigm. Code is available at https://github.com/MUX-GCL/Code.
{"title":"Multiplex Graph Contrastive Learning with Soft Negatives","authors":"Zhenhao Zhao, Minhong Zhu, Chen Wang, Sijia Wang, Jiqiang Zhang, Li Chen, Weiran Cai","doi":"arxiv-2409.08010","DOIUrl":"https://doi.org/arxiv-2409.08010","url":null,"abstract":"Graph Contrastive Learning (GCL) seeks to learn nodal or graph\u0000representations that contain maximal consistent information from\u0000graph-structured data. While node-level contrasting modes are dominating, some\u0000efforts commence to explore consistency across different scales. Yet, they tend\u0000to lose consistent information and be contaminated by disturbing features.\u0000Here, we introduce MUX-GCL, a novel cross-scale contrastive learning paradigm\u0000that utilizes multiplex representations as effective patches. While this\u0000learning mode minimizes contaminating noises, a commensurate contrasting\u0000strategy using positional affinities further avoids information loss by\u0000correcting false negative pairs across scales. Extensive downstream experiments\u0000demonstrate that MUX-GCL yields multiple state-of-the-art results on public\u0000datasets. Our theoretical analysis further guarantees the new objective\u0000function as a stricter lower bound of mutual information of raw input features\u0000and output embeddings, which rationalizes this paradigm. Code is available at\u0000https://github.com/MUX-GCL/Code.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142180608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kaizhe Fan, Quanjun Li
Graph representation learning has emerged as a powerful tool for preserving graph topology when mapping nodes to vector representations, enabling various downstream tasks such as node classification and community detection. However, most current graph neural network models face the challenge of requiring extensive labeled data, which limits their practical applicability in real-world scenarios where labeled data is scarce. To address this challenge, researchers have explored Graph Contrastive Learning (GCL), which leverages enhanced graph data and contrastive learning techniques. While promising, existing GCL methods often struggle with effectively capturing both local and global graph structures, and with balancing the trade-off between node-level and graph-level representations. In this work, we propose Graph Representation Embedding Enhanced via Multidimensional Contrastive Learning (GRE2-MDCL). Our model introduces a novel triple network architecture with a multi-head attention GNN as the core. GRE2-MDCL first globally and locally augments the input graph using SVD and LAGNN techniques. It then constructs a multidimensional contrastive loss, incorporating cross-network, cross-view, and neighbor contrast, to optimize the model. Extensive experiments on the benchmark datasets Cora, Citeseer, and PubMed demonstrate that GRE2-MDCL achieves state-of-the-art performance, with average accuracies of 82.5%, 72.5%, and 81.6%, respectively. Visualizations further show tighter intra-cluster aggregation and clearer inter-cluster boundaries, highlighting the effectiveness of our framework in improving upon baseline GCL models.
{"title":"GRE^2-MDCL: Graph Representation Embedding Enhanced via Multidimensional Contrastive Learning","authors":"Kaizhe Fan, Quanjun Li","doi":"arxiv-2409.07725","DOIUrl":"https://doi.org/arxiv-2409.07725","url":null,"abstract":"Graph representation learning has emerged as a powerful tool for preserving\u0000graph topology when mapping nodes to vector representations, enabling various\u0000downstream tasks such as node classification and community detection. However,\u0000most current graph neural network models face the challenge of requiring\u0000extensive labeled data, which limits their practical applicability in\u0000real-world scenarios where labeled data is scarce. To address this challenge,\u0000researchers have explored Graph Contrastive Learning (GCL), which leverages\u0000enhanced graph data and contrastive learning techniques. While promising,\u0000existing GCL methods often struggle with effectively capturing both local and\u0000global graph structures, and balancing the trade-off between nodelevel and\u0000graph-level representations. In this work, we propose Graph Representation\u0000Embedding Enhanced via Multidimensional Contrastive Learning (GRE2-MDCL). Our\u0000model introduces a novel triple network architecture with a multi-head\u0000attention GNN as the core. GRE2-MDCL first globally and locally augments the\u0000input graph using SVD and LAGNN techniques. It then constructs a\u0000multidimensional contrastive loss, incorporating cross-network, cross-view, and\u0000neighbor contrast, to optimize the model. Extensive experiments on benchmark\u0000datasets Cora, Citeseer, and PubMed demonstrate that GRE2-MDCL achieves\u0000state-of-the-art performance, with average accuracies of 82.5%, 72.5%, and\u000081.6% respectively. Visualizations further show tighter intra-cluster\u0000aggregation and clearer inter-cluster boundaries, highlighting the\u0000effectiveness of our framework in improving upon baseline GCL models.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142180631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhizheng Lai, Yufei Zhou, Peijia Zheng, Lin Chen
The recently proposed Kolmogorov-Arnold Networks (KANs) offer enhanced interpretability and greater model expressiveness. However, KANs also present challenges related to privacy leakage during inference. Homomorphic encryption (HE) facilitates privacy-preserving inference for deep learning models, enabling resource-limited users to benefit from deep learning services while ensuring data security. Yet, the complex structure of KANs, incorporating nonlinear elements like the SiLU activation function and B-spline functions, renders existing privacy-preserving inference techniques inadequate. To address this issue, we propose an accurate and efficient privacy-preserving inference scheme tailored for KANs. Our approach introduces a task-specific polynomial approximation for the SiLU activation function, dynamically adjusting the approximation range to ensure high accuracy on real-world datasets. Additionally, we develop an efficient method for computing B-spline functions within the HE domain, leveraging techniques such as repeat packing, lazy combination, and comparison functions. We evaluate the effectiveness of our privacy-preserving KAN inference scheme on both symbolic formula evaluation and image classification. The experimental results show that our model achieves accuracy comparable to plaintext KANs across various datasets and outperforms plaintext MLPs. Additionally, on the CIFAR-10 dataset, our scheme achieves a more than 7x speedup in inference latency compared to the naive method.
{"title":"Efficient Privacy-Preserving KAN Inference Using Homomorphic Encryption","authors":"Zhizheng Lai, Yufei Zhou, Peijia Zheng, Lin Chen","doi":"arxiv-2409.07751","DOIUrl":"https://doi.org/arxiv-2409.07751","url":null,"abstract":"The recently proposed Kolmogorov-Arnold Networks (KANs) offer enhanced\u0000interpretability and greater model expressiveness. However, KANs also present\u0000challenges related to privacy leakage during inference. Homomorphic encryption\u0000(HE) facilitates privacy-preserving inference for deep learning models,\u0000enabling resource-limited users to benefit from deep learning services while\u0000ensuring data security. Yet, the complex structure of KANs, incorporating\u0000nonlinear elements like the SiLU activation function and B-spline functions,\u0000renders existing privacy-preserving inference techniques inadequate. To address\u0000this issue, we propose an accurate and efficient privacy-preserving inference\u0000scheme tailored for KANs. Our approach introduces a task-specific polynomial\u0000approximation for the SiLU activation function, dynamically adjusting the\u0000approximation range to ensure high accuracy on real-world datasets.\u0000Additionally, we develop an efficient method for computing B-spline functions\u0000within the HE domain, leveraging techniques such as repeat packing, lazy\u0000combination, and comparison functions. We evaluate the effectiveness of our\u0000privacy-preserving KAN inference scheme on both symbolic formula evaluation and\u0000image classification. The experimental results show that our model achieves\u0000accuracy comparable to plaintext KANs across various datasets and outperforms\u0000plaintext MLPs. Additionally, on the CIFAR-10 dataset, our inference latency\u0000achieves over 7 times speedup compared to the naive method.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142223714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sheng Shen, Rabih Younes
This paper introduces Kolmogorov-Arnold Networks (KAN) as an enhancement to the traditional linear probing method in transfer learning. Linear probing, often applied to the final layer of pre-trained models, is limited by its inability to model complex relationships in data. To address this, we propose substituting the linear probing layer with KAN, which leverages spline-based representations to approximate intricate functions. In this study, we integrate KAN with a ResNet-50 model pre-trained on ImageNet and evaluate its performance on the CIFAR-10 dataset. We perform a systematic hyperparameter search, focusing on grid size and spline degree (k), to optimize KAN's flexibility and accuracy. Our results demonstrate that KAN consistently outperforms traditional linear probing, achieving significant improvements in accuracy and generalization across a range of configurations. These findings indicate that KAN offers a more powerful and adaptable alternative to conventional linear probing techniques in transfer learning.
{"title":"Reimagining Linear Probing: Kolmogorov-Arnold Networks in Transfer Learning","authors":"Sheng Shen, Rabih Younes","doi":"arxiv-2409.07763","DOIUrl":"https://doi.org/arxiv-2409.07763","url":null,"abstract":"This paper introduces Kolmogorov-Arnold Networks (KAN) as an enhancement to\u0000the traditional linear probing method in transfer learning. Linear probing,\u0000often applied to the final layer of pre-trained models, is limited by its\u0000inability to model complex relationships in data. To address this, we propose\u0000substituting the linear probing layer with KAN, which leverages spline-based\u0000representations to approximate intricate functions. In this study, we integrate\u0000KAN with a ResNet-50 model pre-trained on ImageNet and evaluate its performance\u0000on the CIFAR-10 dataset. We perform a systematic hyperparameter search,\u0000focusing on grid size and spline degree (k), to optimize KAN's flexibility and\u0000accuracy. Our results demonstrate that KAN consistently outperforms traditional\u0000linear probing, achieving significant improvements in accuracy and\u0000generalization across a range of configurations. These findings indicate that\u0000KAN offers a more powerful and adaptable alternative to conventional linear\u0000probing techniques in transfer learning.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142180617","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Supratim Das, Mahdie Rafie, Paula Kammer, Søren T. Skou, Dorte T. Grønne, Ewa M. Roos, André Hajek, Hans-Helmut König, Md Shihab Ullah, Niklas Probul, Jan Baumbach, Linda Baumbach
Background: Patient-reported survey data are used to train prognostic models aimed at improving healthcare. However, such data are typically distributed across multiple centers and, for privacy reasons, cannot easily be centralized in one data repository. Models trained locally are less accurate, robust, and generalizable. We present and apply privacy-preserving federated machine learning techniques for prognostic model building, where local survey data never leave the legally safe harbors of the medical centers. Methods: We used centralized, local, and federated learning techniques on two healthcare datasets (GLA:D data from the five health regions of Denmark and international SHARE data from 27 countries) to predict two different health outcomes. We compared linear regression, random forest regression, and random forest classification models trained on local data with those trained on the entire data in a centralized and in a federated fashion. Results: On GLA:D data, federated linear regression (R2 0.34, RMSE 18.2) and federated random forest regression (R2 0.34, RMSE 18.3) models outperform their local counterparts (R2 0.32, RMSE 18.6 and R2 0.30, RMSE 18.8, respectively) with statistical significance. We also found that centralized models (R2 0.34, RMSE 18.2 and R2 0.32, RMSE 18.5, respectively) did not perform significantly better than the federated models. On SHARE, the federated model (AC 0.78, AUROC 0.71) and centralized model (AC 0.84, AUROC 0.66) perform significantly better than the local models (AC 0.74, AUROC 0.69). Conclusion: Federated learning enables the training of prognostic models from multi-center surveys without compromising privacy and with minimal or no loss of model performance.
{"title":"Privacy-preserving federated prediction of pain intensity change based on multi-center survey data","authors":"Supratim Das, Mahdie Rafie, Paula Kammer, Søren T. Skou, Dorte T. Grønne, Ewa M. Roos, André Hajek, Hans-Helmut König, Md Shihab Ullaha, Niklas Probul, Jan Baumbacha, Linda Baumbach","doi":"arxiv-2409.07997","DOIUrl":"https://doi.org/arxiv-2409.07997","url":null,"abstract":"Background: Patient-reported survey data are used to train prognostic models\u0000aimed at improving healthcare. However, such data are typically available\u0000multi-centric and, for privacy reasons, cannot easily be centralized in one\u0000data repository. Models trained locally are less accurate, robust, and\u0000generalizable. We present and apply privacy-preserving federated machine\u0000learning techniques for prognostic model building, where local survey data\u0000never leaves the legally safe harbors of the medical centers. Methods: We used\u0000centralized, local, and federated learning techniques on two healthcare\u0000datasets (GLA:D data from the five health regions of Denmark and international\u0000SHARE data of 27 countries) to predict two different health outcomes. We\u0000compared linear regression, random forest regression, and random forest\u0000classification models trained on local data with those trained on the entire\u0000data in a centralized and in a federated fashion. Results: In GLA:D data,\u0000federated linear regression (R2 0.34, RMSE 18.2) and federated random forest\u0000regression (R2 0.34, RMSE 18.3) models outperform their local counterparts\u0000(i.e., R2 0.32, RMSE 18.6, R2 0.30, RMSE 18.8) with statistical significance.\u0000We also found that centralized models (R2 0.34, RMSE 18.2, R2 0.32, RMSE 18.5,\u0000respectively) did not perform significantly better than the federated models.\u0000In SHARE, the federated model (AC 0.78, AUROC: 0.71) and centralized model (AC\u00000.84, AUROC: 0.66) perform significantly better than the local models (AC:\u00000.74, AUROC: 0.69). Conclusion: Federated learning enables the training of\u0000prognostic models from multi-center surveys without compromising privacy and\u0000with only minimal or no compromise regarding model performance.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142180610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhifeng Hu, Chong Han, Wolfgang Gerstacker, Ian F. Akyildiz
Terahertz (THz) space communications (Tera-SpaceCom) is envisioned as a promising technology to enable various space science and communication applications. The Tera-SpaceCom realm mainly consists of THz sensing for space exploration, data centers in space providing cloud services for space exploration tasks, and a low earth orbit (LEO) mega-constellation relaying these tasks to ground stations (GSs) or data centers via THz links. Moreover, to reduce the computational burden on data centers as well as resource consumption and latency in the relaying process, the LEO mega-constellation provides satellite edge computing (SEC) services to directly compute space exploration tasks without relaying them to data centers. The LEO satellites that receive space exploration tasks offload (i.e., distribute) partial tasks to their neighboring LEO satellites, to further reduce their computational burden. However, efficient joint communication resource allocation and computing task offloading for the Tera-SpaceCom SEC network is an NP-hard mixed-integer nonlinear programming problem (MINLP), due to the discrete nature of space exploration tasks and sub-arrays as well as the continuous nature of transmit power. To tackle this challenge, a graph neural network (GNN)-deep reinforcement learning (DRL)-based joint resource allocation and task offloading (GRANT) algorithm is proposed, targeting long-term resource efficiency (RE). In particular, the GNNs learn relationships among satellites from their connectivity information, and multi-agent and multi-task mechanisms cooperatively train the task offloading and resource allocation policies. Compared with benchmark solutions, GRANT not only achieves the highest RE with relatively low latency but also requires the fewest trainable parameters and the shortest running time.
{"title":"Tera-SpaceCom: GNN-based Deep Reinforcement Learning for Joint Resource Allocation and Task Offloading in TeraHertz Band Space Networks","authors":"Zhifeng Hu, Chong Han, Wolfgang Gerstacker, Ian F. Akyildiz","doi":"arxiv-2409.07911","DOIUrl":"https://doi.org/arxiv-2409.07911","url":null,"abstract":"Terahertz (THz) space communications (Tera-SpaceCom) is envisioned as a\u0000promising technology to enable various space science and communication\u0000applications. Mainly, the realm of Tera-SpaceCom consists of THz sensing for\u0000space exploration, data centers in space providing cloud services for space\u0000exploration tasks, and a low earth orbit (LEO) mega-constellation relaying\u0000these tasks to ground stations (GSs) or data centers via THz links. Moreover,\u0000to reduce the computational burden on data centers as well as resource\u0000consumption and latency in the relaying process, the LEO mega-constellation\u0000provides satellite edge computing (SEC) services to directly compute space\u0000exploration tasks without relaying these tasks to data centers. The LEO\u0000satellites that receive space exploration tasks offload (i.e., distribute)\u0000partial tasks to their neighboring LEO satellites, to further reduce their\u0000computational burden. However, efficient joint communication resource\u0000allocation and computing task offloading for the Tera-SpaceCom SEC network is\u0000an NP-hard mixed-integer nonlinear programming problem (MINLP), due to the\u0000discrete nature of space exploration tasks and sub-arrays as well as the\u0000continuous nature of transmit power. To tackle this challenge, a graph neural\u0000network (GNN)-deep reinforcement learning (DRL)-based joint resource allocation\u0000and task offloading (GRANT) algorithm is proposed with the target of long-term\u0000resource efficiency (RE). Particularly, GNNs learn relationships among\u0000different satellites from their connectivity information. Furthermore,\u0000multi-agent and multi-task mechanisms cooperatively train task offloading and\u0000resource allocation. Compared with benchmark solutions, GRANT not only achieves\u0000the highest RE with relatively low latency, but realizes the fewest trainable\u0000parameters and the shortest running time.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142180613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}