
arXiv - CS - Machine Learning: Latest Publications

DEMAU: Decompose, Explore, Model and Analyse Uncertainties
Pub Date : 2024-09-12 DOI: arxiv-2409.08105
Arthur Hoarau, Vincent Lemaire
Recent research in machine learning has given rise to a flourishing literature on the quantification and decomposition of model uncertainty. This information can be very useful during interactions with the learner, such as in active learning or adaptive learning, and especially in uncertainty sampling. To allow a simple representation of these total, epistemic (reducible) and aleatoric (irreducible) uncertainties, we offer DEMAU, an open-source educational, exploratory and analytical tool that allows users to visualize and explore several types of uncertainty for classification models in machine learning.
Citations: 0
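For context, the standard entropy-based decomposition behind tools like DEMAU splits total predictive uncertainty over an ensemble into an aleatoric and an epistemic part. The sketch below is a minimal illustration of that textbook decomposition, not DEMAU's own code; the ensemble shape and toy data are assumptions.

```python
import numpy as np

def decompose_uncertainty(probs):
    """Entropy-based uncertainty decomposition.

    probs: array of shape (n_members, n_samples, n_classes) holding the
    class probabilities predicted by each ensemble member.
    Returns (total, aleatoric, epistemic), each of shape (n_samples,).
    """
    eps = 1e-12
    mean_p = probs.mean(axis=0)                                   # (n_samples, n_classes)
    total = -(mean_p * np.log(mean_p + eps)).sum(axis=-1)         # entropy of the mean
    member_entropy = -(probs * np.log(probs + eps)).sum(axis=-1)  # (n_members, n_samples)
    aleatoric = member_entropy.mean(axis=0)                       # mean of the entropies
    epistemic = total - aleatoric                                 # mutual information
    return total, aleatoric, epistemic

# Toy usage: 5 ensemble members, 3 samples, 4 classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(5, 3, 4))
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
print(decompose_uncertainty(probs))
```

The epistemic term here is the mutual information between the prediction and the ensemble member; it vanishes when all members agree.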
XMOL: Explainable Multi-property Optimization of Molecules
Pub Date : 2024-09-12 DOI: arxiv-2409.07786
Aye Phyu Phyu Aung, Jay Chaudhary, Ji Wei Yoon, Senthilnath Jayavelu
Molecular optimization is a key challenge in the drug discovery and materials science domains, involving the design of molecules with desired properties. Existing methods focus predominantly on single-property optimization, necessitating repetitive runs to target multiple properties, which is inefficient and computationally expensive. Moreover, these methods often lack transparency, making it difficult for researchers to understand and control the optimization process. To address these issues, we propose a novel framework, Explainable Multi-property Optimization of Molecules (XMOL), to optimize multiple molecular properties simultaneously while incorporating explainability. Our approach builds on state-of-the-art geometric diffusion models, extending them to multi-property optimization through the introduction of spectral normalization and enhanced molecular constraints for stabilized training. Additionally, we integrate interpretive and explainable techniques throughout the optimization process. We evaluated XMOL on a real-world molecular dataset, QM9, demonstrating its effectiveness in both single-property and multi-property optimization while offering interpretable results, paving the way for more efficient and reliable molecular design.
Citations: 0
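The abstract credits spectral normalization and a joint multi-property objective for stabilized training. The sketch below shows what that combination typically looks like in PyTorch; the head architecture, embedding dimension, and property count are illustrative assumptions, not XMOL's actual design.

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

class MultiPropertyHead(nn.Module):
    """Hypothetical property-prediction head. Each linear layer is wrapped
    in spectral normalization, which constrains its largest singular value
    and tends to stabilize training."""
    def __init__(self, dim, n_properties):
        super().__init__()
        self.net = nn.Sequential(
            spectral_norm(nn.Linear(dim, dim)),
            nn.SiLU(),
            spectral_norm(nn.Linear(dim, n_properties)),
        )

    def forward(self, h):
        return self.net(h)  # (batch, n_properties)

head = MultiPropertyHead(dim=128, n_properties=3)  # e.g., gap, dipole, energy
h = torch.randn(16, 128)                           # molecule embeddings (assumed)
targets = torch.randn(16, 3)
# Optimize all properties jointly instead of one run per property.
loss = nn.functional.mse_loss(head(h), targets)
loss.backward()
```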
BLens: Contrastive Captioning of Binary Functions using Ensemble Embedding
Pub Date : 2024-09-12 DOI: arxiv-2409.07889
Tristan Benoit, Yunru Wang, Moritz Dannehl, Johannes Kinder
Function names can greatly aid human reverse engineers, which has spurred development of machine learning-based approaches to predicting function names in stripped binaries. Much current work in this area now uses transformers, applying a metaphor of machine translation from code to function names. Still, function naming models face challenges in generalizing to projects completely unrelated to the training set. In this paper, we take a completely new approach by transferring advances in automated image captioning to the domain of binary reverse engineering, such that different parts of a binary function can be associated with parts of its name. We propose BLens, which combines multiple binary function embeddings into a new ensemble representation, aligns it with the name representation latent space via a contrastive learning approach, and generates function names with a transformer architecture tailored for function names. In our experiments, we demonstrate that BLens significantly outperforms the state of the art. In the usual setting of splitting per binary, we achieve an $F_1$ score of 0.77 compared to 0.67. Moreover, in the cross-project setting, which emphasizes generalizability, we achieve an $F_1$ score of 0.46 compared to 0.29.
Citations: 0
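The contrastive alignment step described above can be illustrated with a symmetric InfoNCE-style loss that pulls each function embedding toward its own name embedding and away from the others in the batch. This is a minimal sketch under that assumption; BLens's ensemble embedding and name decoder are not reproduced here.

```python
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(func_emb, name_emb, temperature=0.07):
    """CLIP-style symmetric InfoNCE between function and name embeddings.

    func_emb, name_emb: (batch, dim) tensors where row i of each tensor
    comes from the same binary function.
    """
    f = F.normalize(func_emb, dim=-1)
    n = F.normalize(name_emb, dim=-1)
    logits = f @ n.t() / temperature  # (batch, batch) similarity matrix
    labels = torch.arange(f.size(0))  # matching pairs lie on the diagonal
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.t(), labels))

# Toy usage with random 256-dimensional embeddings.
loss = contrastive_alignment_loss(torch.randn(8, 256), torch.randn(8, 256))
```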
Alignment with Preference Optimization Is All You Need for LLM Safety
Pub Date : 2024-09-12 DOI: arxiv-2409.07772
Reda Alami, Ali Khalifa Almansoori, Ahmed Alzubaidi, Mohamed El Amine Seddik, Mugariya Farooq, Hakim Hacid
We demonstrate that preference optimization methods can effectively enhance LLM safety. Applying various alignment techniques to the Falcon 11B model using safety datasets, we achieve a significant boost in global safety score (from 57.64% to 99.90%) as measured by LlamaGuard 3 8B, competing with state-of-the-art models. On toxicity benchmarks, average scores in adversarial settings dropped from over 0.6 to less than 0.07. However, this safety improvement comes at the cost of reduced general capabilities, particularly in math, suggesting a trade-off. We identify noise contrastive alignment (Safe-NCA) as an optimal method for balancing safety and performance. Our study ultimately shows that alignment techniques can be sufficient for building safe and robust models.
Citations: 0
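The abstract does not detail Safe-NCA, but the preference-optimization losses it benchmarks share a common shape: reward the model for ranking a safe response above an unsafe one relative to a frozen reference model. Below is a hedged sketch of a DPO-style variant of that idea, with all log-probabilities assumed precomputed.

```python
import torch
import torch.nn.functional as F

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Direct-preference-optimization loss on (safe, unsafe) response pairs.

    Each argument is a (batch,) tensor of summed token log-probabilities:
    logp_* under the policy being trained, ref_* under a frozen reference.
    """
    margin = (logp_chosen - ref_chosen) - (logp_rejected - ref_rejected)
    return -F.logsigmoid(beta * margin).mean()

# Toy usage with fake log-probabilities.
args = (torch.randn(4), torch.randn(4), torch.randn(4), torch.randn(4))
print(dpo_loss(*args))
```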
Large Language Models are Pattern Matchers: Editing Semi-Structured and Structured Documents with ChatGPT
Pub Date : 2024-09-12 DOI: arxiv-2409.07732
Irene Weber
Large Language Models (LLMs) offer numerous applications, the full extent of which is not yet understood. This paper investigates if LLMs can be applied for editing structured and semi-structured documents with minimal effort. Using a qualitative research approach, we conduct two case studies with ChatGPT and thoroughly analyze the results. Our experiments indicate that LLMs can effectively edit structured and semi-structured documents when provided with basic, straightforward prompts. ChatGPT demonstrates a strong ability to recognize and process the structure of annotated documents. This suggests that explicitly structuring tasks and data in prompts might enhance an LLM's ability to understand and solve tasks. Furthermore, the experiments also reveal impressive pattern matching skills in ChatGPT. This observation deserves further investigation, as it may contribute to understanding the processes leading to hallucinations in LLMs.
Citations: 0
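As a concrete illustration of the paper's central observation (explicitly structured prompts help the model recognize and preserve document structure), the snippet below builds a minimal structured-editing request. The document format and instruction wording are invented for illustration; any chat-completion client can send the resulting messages.

```python
# A minimal, hypothetical prompt for structured-document editing.
# Key idea from the paper: stating the structure explicitly in the
# prompt helps the model recognize and preserve it.
document = """\
[meta]
title = Quarterly Report
owner = jdoe

[body]
Q1 revenue grew 4%.
"""

messages = [
    {"role": "system",
     "content": "You edit INI-style documents. Preserve all section headers "
                "and keys; change only the values you are asked to change."},
    {"role": "user",
     "content": f"Set owner to 'asmith' and leave everything else untouched:\n\n{document}"},
]
# Send `messages` with any chat-completion client (client-specific code omitted).
```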
Improve Machine Learning carbon footprint using Nvidia GPU and Mixed Precision training for classification algorithms
Pub Date : 2024-09-12 DOI: arxiv-2409.07853
Andrew Antonopoulos
This study was part of my dissertation for my master's degree and compares the power consumption of the default floating point (32-bit) and Nvidia mixed precision (16-bit and 32-bit) while training a classification ML model. A custom PC with specific hardware was built to perform the experiments, and different ML hyper-parameters, such as batch size, neurons, and epochs, were chosen to build Deep Neural Networks (DNN). Additionally, various software was used during the experiments to collect the power consumption data in Watts from the Graphics Processing Unit (GPU), Central Processing Unit (CPU), and Random Access Memory (RAM), and manually from a wattmeter connected to the wall. A benchmarking test with default hyper-parameter values for the DNN was used as a reference, while the experiments used a combination of different settings. The results were recorded in Excel, and descriptive statistics were chosen to calculate the mean between the groups and compare them using graphs and tables. The outcome was positive when using mixed precision combined with specific hyper-parameters. Compared to the benchmarking, the optimisation for the classification reduced the power consumption by 7 to 11 Watts. Similarly, the carbon footprint is reduced because the calculation uses the same power consumption data. Still, care is required when configuring hyper-parameters because it can negatively affect hardware performance. This research also required inferential statistics, specifically ANOVA and T-test, to compare the relationship between the means, and the tests indicated no statistical significance of the relationship between the benchmarking and the experiments. However, a more extensive implementation with a cluster of GPUs can increase the sample size significantly, as this is an essential factor and can change the outcome of the statistical analysis.
Citations: 0
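The Nvidia mixed-precision setup measured here corresponds to what PyTorch exposes as automatic mixed precision. The following is a minimal training-step sketch assuming a CUDA device; the model, optimizer, and data are placeholders.

```python
import torch
import torch.nn as nn

device = "cuda"  # assumes an Nvidia GPU is available
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10)).to(device)
optimizer = torch.optim.Adam(model.parameters())
scaler = torch.cuda.amp.GradScaler()  # scales the loss to avoid fp16 underflow

def train_step(x, y):
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():  # run ops in fp16/fp32 as appropriate
        loss = nn.functional.cross_entropy(model(x), y)
    scaler.scale(loss).backward()    # backprop on the scaled loss
    scaler.step(optimizer)           # unscale gradients, then update
    scaler.update()                  # adapt the scale factor
    return loss.item()
```

Running the same step without the `autocast` context and scaler gives the default 32-bit baseline the study compares against.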
FedHide: Federated Learning by Hiding in the Neighbors
Pub Date : 2024-09-12 DOI: arxiv-2409.07808
Hyunsin Park, Sungrack Yun
We propose a prototype-based federated learning method designed for embedding networks in classification or verification tasks. Our focus is on scenarios where each client has data from a single class. The main challenge is to develop an embedding network that can distinguish between different classes while adhering to privacy constraints. Sharing true class prototypes with the server or other clients could potentially compromise sensitive information. To tackle this issue, we propose a proxy class prototype that will be shared among clients instead of the true class prototype. Our approach generates proxy class prototypes by linearly combining them with their nearest neighbors. This technique conceals the true class prototype while enabling clients to learn discriminative embedding networks. We compare our method to alternative techniques, such as adding random Gaussian noise and using random selection with cosine similarity constraints. Furthermore, we evaluate the robustness of our approach against gradient inversion attacks and introduce a measure for prototype leakage. This measure quantifies the extent of private information revealed when sharing the proposed proxy class prototype. Moreover, we provide a theoretical analysis of the convergence properties of our approach. Our proposed method for federated learning from scratch demonstrates its effectiveness through empirical results on three benchmark datasets: CIFAR-100, VoxCeleb1, and VGGFace2.
Citations: 0
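The core construction described above, blending the true class prototype with its nearest neighbors before sharing, is simple to sketch. The mixing weight, neighbor count, and unit normalization below are assumptions rather than the paper's exact settings.

```python
import numpy as np

def proxy_prototype(true_proto, other_protos, k=3, alpha=0.5):
    """Blend a true class prototype with its k nearest neighbors.

    true_proto: (dim,) prototype the client must not reveal.
    other_protos: (n, dim) candidate prototypes to hide among.
    alpha: weight kept on the true prototype (assumed value).
    """
    dists = np.linalg.norm(other_protos - true_proto, axis=1)
    nearest = other_protos[np.argsort(dists)[:k]]  # k closest candidates
    proxy = alpha * true_proto + (1 - alpha) * nearest.mean(axis=0)
    return proxy / np.linalg.norm(proxy)  # unit-normalize for cosine-based training

rng = np.random.default_rng(0)
proxy = proxy_prototype(rng.normal(size=16), rng.normal(size=(50, 16)))
```

Smaller alpha hides the true prototype better at the cost of a less faithful class representative, which is the privacy-utility trade-off the paper's leakage measure quantifies.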
A framework for measuring the training efficiency of a neural architecture
Pub Date : 2024-09-12 DOI: arxiv-2409.07925
Eduardo Cueto-Mendoza, John D. Kelleher
Measuring efficiency in neural network system development is an open research problem. This paper presents an experimental framework to measure the training efficiency of a neural architecture. To demonstrate our approach, we analyze the training efficiency of Convolutional Neural Networks and Bayesian equivalents on the MNIST and CIFAR-10 tasks. Our results show that training efficiency decays as training progresses and varies across different stopping criteria for a given neural model and learning task. We also find a non-linear relationship between training stopping criteria, training efficiency, and model size. Furthermore, we illustrate the potential confounding effects of overtraining on measuring the training efficiency of a neural architecture. Regarding relative training efficiency across different architectures, our results indicate that CNNs are more efficient than BCNNs on both datasets. More generally, as a learning task becomes more complex, the relative difference in training efficiency between different architectures becomes more pronounced.
Citations: 0
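The abstract does not give the framework's exact efficiency measure, so the sketch below only illustrates the general idea: performance gained per unit of cumulative training cost, evaluated at a stopping criterion. The cost unit and target threshold are assumptions.

```python
def training_efficiency(accuracy_per_epoch, cost_per_epoch, target=0.95):
    """Toy efficiency measure: accuracy per unit of cumulative cost,
    evaluated once a stopping criterion (target accuracy) is reached.
    Returns None if the criterion is never met.
    """
    total_cost = 0.0
    for acc, cost in zip(accuracy_per_epoch, cost_per_epoch):
        total_cost += cost
        if acc >= target:
            return acc / total_cost
    return None

# Two models reaching the same target with different cost profiles.
print(training_efficiency([0.80, 0.93, 0.96], [1.0, 1.0, 1.0]))  # cheaper run
print(training_efficiency([0.70, 0.90, 0.96], [2.0, 2.0, 2.0]))  # costlier run
```

Under a measure of this shape, efficiency naturally decays as training progresses, since late epochs add cost while accuracy gains shrink, consistent with the trend the paper reports.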
Graph Laplacian-based Bayesian Multi-fidelity Modeling
Pub Date : 2024-09-12 DOI: arxiv-2409.08211
Orazio Pinti, Jeremy M. Budd, Franca Hoffmann, Assad A. Oberai
We present a novel probabilistic approach for generating multi-fidelity data while accounting for errors inherent in both low- and high-fidelity data. In this approach, a graph Laplacian constructed from the low-fidelity data is used to define a multivariate Gaussian prior density for the coordinates of the true data points. In addition, a few high-fidelity data points are used to construct a conjugate likelihood term. Thereafter, Bayes' rule is applied to derive an explicit expression for the posterior density, which is also multivariate Gaussian. The maximum a posteriori (MAP) estimate of this density is selected to be the optimal multi-fidelity estimate. It is shown that the MAP estimate and the covariance of the posterior density can be determined through the solution of linear systems of equations. Thereafter, two methods, one based on spectral truncation and another based on a low-rank approximation, are developed to solve these equations efficiently. The multi-fidelity approach is tested on a variety of problems in solid and fluid mechanics with data that represents vectors of quantities of interest and discretized spatial fields in one and two dimensions. The results demonstrate that by utilizing a small fraction of high-fidelity data, the multi-fidelity approach can significantly improve the accuracy of a large collection of low-fidelity data points.
Citations: 0
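The MAP estimate described above reduces to a single linear solve. Below is a minimal numerical sketch assuming a zero-mean Gaussian prior whose precision is the graph Laplacian plus a small ridge term (for invertibility), and a Gaussian likelihood on a few observed high-fidelity entries; the toy chain graph stands in for a Laplacian built from real low-fidelity data.

```python
import numpy as np

def map_multifidelity(L, obs_idx, y_obs, sigma2=0.01, tau=1e-3):
    """MAP estimate under prior N(0, (L + tau*I)^{-1}) and Gaussian
    observations y_obs = x[obs_idx] + noise with variance sigma2.

    L: (n, n) graph Laplacian built from the low-fidelity data.
    """
    n = L.shape[0]
    H = np.zeros((len(obs_idx), n))
    H[np.arange(len(obs_idx)), obs_idx] = 1.0   # selects high-fidelity points
    A = L + tau * np.eye(n) + H.T @ H / sigma2  # posterior precision
    b = H.T @ y_obs / sigma2
    return np.linalg.solve(A, b)                # posterior mean = MAP estimate

# Toy chain graph with 5 nodes and two high-fidelity observations.
W = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)  # chain adjacency
L = np.diag(W.sum(axis=1)) - W                        # graph Laplacian
print(map_multifidelity(L, obs_idx=[0, 4], y_obs=np.array([1.0, -1.0])))
```

The Laplacian prior penalizes estimates that vary sharply between neighboring low-fidelity points, which is how the few high-fidelity observations propagate corrections across the whole dataset.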
FPMT: Enhanced Semi-Supervised Model for Traffic Incident Detection
Pub Date : 2024-09-12 DOI: arxiv-2409.07839
Xinying Lu, Jianli Xiao
For traffic incident detection, the acquisition of data and labels is notably resource-intensive, rendering semi-supervised traffic incident detection both a formidable and consequential challenge. Thus, this paper focuses on traffic incident detection using a semi-supervised learning approach. It proposes a semi-supervised learning model named FPMT within the framework of MixText. The data augmentation module introduces Generative Adversarial Networks to balance and expand the dataset. During the mix-up process in the hidden space, it employs a probabilistic pseudo-mixing mechanism to enhance regularization and elevate model precision. In terms of training strategy, it initiates with unsupervised training on all data, followed by supervised fine-tuning on a subset of labeled data, ultimately completing the goal of semi-supervised training. Through empirical validation on four authentic datasets, our FPMT model exhibits outstanding performance across various metrics. Particularly noteworthy is its robust performance even in scenarios with low label rates.
Citations: 0
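FPMT builds on MixText, whose hidden-space mix-up is easy to sketch: hidden states and (pseudo-)labels are interpolated with a Beta-distributed coefficient, as below. The layer choice and Beta parameter are assumptions, and FPMT's probabilistic pseudo-mixing mechanism is not reproduced here.

```python
import torch

def hidden_mixup(h_a, h_b, y_a, y_b, alpha=0.4):
    """Interpolate hidden representations and their (pseudo-)labels.

    h_*: (batch, dim) hidden states from some intermediate encoder layer.
    y_*: (batch, n_classes) one-hot or soft pseudo-labels.
    """
    lam = torch.distributions.Beta(alpha, alpha).sample()
    h_mix = lam * h_a + (1 - lam) * h_b  # mix in the hidden space
    y_mix = lam * y_a + (1 - lam) * y_b  # mix the supervision the same way
    return h_mix, y_mix

# Toy usage with random hidden states and soft labels.
h_mix, y_mix = hidden_mixup(torch.randn(8, 768), torch.randn(8, 768),
                            torch.rand(8, 4), torch.rand(8, 4))
```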