
arXiv - CS - Machine Learning: Latest Papers

Constraint Guided AutoEncoders for Joint Optimization of Condition Indicator Estimation and Anomaly Detection in Machine Condition Monitoring
Pub Date : 2024-09-18 DOI: arxiv-2409.11807
Maarten Meire, Quinten Van Baelen, Ted Ooijevaar, Peter Karsmakers
The main goal of machine condition monitoring is, as the name implies, to monitor the condition of industrial applications. The objective of this monitoring can mainly be split into two problems: a diagnostic problem, where normal data should be distinguished from anomalous data, otherwise called Anomaly Detection (AD), and a prognostic problem, where the aim is to predict the evolution of a Condition Indicator (CI) that reflects the condition of an asset throughout its lifetime. When considering machine condition monitoring, it is expected that this CI shows monotonic behavior, as the condition of a machine gradually degrades over time. This work proposes an extension to Constraint Guided AutoEncoders (CGAE), a robust AD method, that enables building a single model that can be used for both AD and CI estimation. To improve CI estimation, the extension incorporates a constraint that forces the model to produce monotonically increasing CI predictions over time. Experimental results indicate that the proposed algorithm performs similarly to, or slightly better than, CGAE with regard to AD, while improving the monotonic behavior of the CI.
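One common way to encourage monotonically increasing predictions is to penalize every decrease between consecutive time steps. The sketch below illustrates only that general idea. The function name `monotonicity_penalty` and the hinge form of the penalty are assumptions made for illustration; the abstract above does not say how the authors formulate their constraint.

```python
from typing import Sequence


def monotonicity_penalty(ci: Sequence[float]) -> float:
    """Sum of all decreases between consecutive CI predictions.

    Hypothetical helper, not taken from the paper. Returns 0.0 when the
    sequence never decreases, and grows with the size of each drop.
    """
    return sum((max(0.0, prev - nxt) for prev, nxt in zip(ci, ci[1:])), 0.0)


# A rising sequence is not penalized; a sequence with a drop is.
print(monotonicity_penalty([0.1, 0.2, 0.3]))
print(monotonicity_penalty([0.3, 0.1, 0.2]))
```

A term of this kind could be added to a reconstruction loss so that a single model is trained for both goals, but how the terms are weighted is a design choice this sketch leaves open.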
Citations: 0
Topological Deep Learning with State-Space Models: A Mamba Approach for Simplicial Complexes
Pub Date : 2024-09-18 DOI: arxiv-2409.12033
Marco Montagna, Simone Scardapane, Lev Telyatnikov
Graph Neural Networks based on the message-passing (MP) mechanism are a dominant approach for handling graph-structured data. However, they are inherently limited to modeling only pairwise interactions, making it difficult to explicitly capture the complexity of systems with $n$-body relations. To address this, topological deep learning has emerged as a promising field for studying and modeling higher-order interactions using various topological domains, such as simplicial and cellular complexes. While these new domains provide powerful representations, they introduce new challenges, such as effectively modeling the interactions among higher-order structures through higher-order MP. Meanwhile, structured state-space sequence models have proven to be effective for sequence modeling and have recently been adapted for graph data by encoding the neighborhood of a node as a sequence, thereby avoiding the MP mechanism. In this work, we propose a novel architecture designed to operate with simplicial complexes, utilizing the Mamba state-space model as its backbone. Our approach generates sequences for the nodes based on the neighboring cells, enabling direct communication between all higher-order structures, regardless of their rank. We extensively validate our model, demonstrating that it achieves competitive performance compared to state-of-the-art models developed for simplicial complexes.
Citations: 0
An Efficient Model-Agnostic Approach for Uncertainty Estimation in Data-Restricted Pedometric Applications
Pub Date : 2024-09-18 DOI: arxiv-2409.11985
Viacheslav Barkov, Jonas Schmidinger, Robin Gebbers, Martin Atzmueller
This paper introduces a model-agnostic approach designed to enhance uncertainty estimation in the predictive modeling of soil properties, a crucial factor for advancing pedometrics and the practice of digital soil mapping. To address the typical challenge of data scarcity in soil studies, we present an improved technique for uncertainty estimation. This method is based on the transformation of regression tasks into classification problems, which not only allows for the production of reliable uncertainty estimates but also enables the application of established machine learning algorithms with competitive performance that have not yet been utilized in pedometrics. Empirical results from datasets collected from two German agricultural fields showcase the practical application of the proposed methodology. Our results and findings suggest that the proposed approach has the potential to provide better uncertainty estimation than the models commonly used in pedometrics.
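The regression-to-classification idea described above can be sketched as follows: continuous targets are discretized into bins, any probabilistic classifier is trained on the bin labels, and its class probabilities are turned back into a point estimate plus a spread. Everything here is an assumption for illustration, since the abstract does not specify the binning scheme or how probabilities are summarized: equal-width bins, bin centres as representative values, and the helper names `equal_width_edges`, `to_class`, and `mean_and_std` are all hypothetical.

```python
from typing import List, Sequence, Tuple


def equal_width_edges(y: Sequence[float], n_bins: int) -> List[float]:
    """Return n_bins + 1 equally spaced edges covering the range of y.

    Assumes max(y) > min(y); equal-width binning is an illustrative choice.
    """
    lo, hi = min(y), max(y)
    step = (hi - lo) / n_bins
    return [lo + i * step for i in range(n_bins + 1)]


def to_class(value: float, edges: Sequence[float]) -> int:
    """Map a continuous target to the index of the bin that contains it."""
    n_bins = len(edges) - 1
    for i in range(n_bins):
        if value < edges[i + 1]:
            return i
    return n_bins - 1  # the maximum value falls into the last bin


def mean_and_std(probs: Sequence[float], edges: Sequence[float]) -> Tuple[float, float]:
    """Summarize predicted class probabilities as (mean, standard deviation),
    using each bin's centre as its representative value."""
    centres = [(a + b) / 2 for a, b in zip(edges, edges[1:])]
    mean = sum(p * c for p, c in zip(probs, centres))
    var = sum(p * (c - mean) ** 2 for p, c in zip(probs, centres))
    return mean, var ** 0.5


edges = equal_width_edges([0.0, 10.0], n_bins=2)   # [0.0, 5.0, 10.0]
labels = [to_class(v, edges) for v in (2.0, 5.0, 10.0)]
print(labels)
# Probabilities would come from a trained classifier; these are made up.
print(mean_and_std([0.5, 0.5], edges))
```

The spread of the predicted class distribution then serves as the uncertainty estimate: a prediction concentrated in one bin yields a small standard deviation, while probability mass split across distant bins yields a large one.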
Citations: 0
Extended Deep Submodular Functions
Pub Date : 2024-09-18 DOI: arxiv-2409.12053
Seyed Mohammad Hosseini, Arash Jamshid, Seyed Mahdi Noormousavi, Mahdi Jafari Siavoshani, Naeimeh Omidvar
We introduce a novel category of set functions called Extended Deep Submodular Functions (EDSFs), which are neural network-representable. EDSFs serve as an extension of Deep Submodular Functions (DSFs), inheriting crucial properties from DSFs while addressing innate limitations. It is known that DSFs can represent only a limited subset of submodular functions. In contrast, through an analysis of polymatroid properties, we establish that EDSFs possess the capability to represent all monotone submodular functions, a notable enhancement compared to DSFs. Furthermore, our findings demonstrate that EDSFs can represent any monotone set function, indicating that the family of EDSFs is equivalent to the family of all monotone set functions. Additionally, we prove that EDSFs maintain the concavity inherent in DSFs when the components of the input vector are non-negative real numbers, an essential feature in certain combinatorial optimization problems. Through extensive experiments, we illustrate that EDSFs exhibit significantly lower empirical generalization error than DSFs in the learning of coverage functions. This suggests that EDSFs present a promising advancement in the representation and learning of set functions with improved generalization capabilities.
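For readers unfamiliar with the terms, the sketch below checks the standard definitions of monotonicity and submodularity (diminishing returns) by brute force on a tiny ground set. It does not implement DSFs or EDSFs. The toy `SENSORS` data and the helper names are hypothetical and chosen only to show a coverage function, which is monotone and submodular, next to a function that is monotone but not submodular.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Iterable, List

# Hypothetical toy data: the locations covered by each sensor.
SENSORS = {"s1": {1, 2}, "s2": {2, 3}, "s3": {3, 4, 5}}


def coverage(selected: Iterable[str]) -> int:
    """Number of distinct locations covered by the selected sensors."""
    covered = set()
    for name in selected:
        covered |= SENSORS[name]
    return len(covered)


def subsets(ground: Iterable[str]) -> List[FrozenSet[str]]:
    """All subsets of the ground set (exponential, so tiny sets only)."""
    items = list(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_monotone(f: Callable, ground: Iterable[str]) -> bool:
    """f(A) <= f(B) whenever A is a subset of B."""
    all_sets = subsets(ground)
    return all(f(a) <= f(b) for a in all_sets for b in all_sets if a <= b)


def is_submodular(f: Callable, ground: Iterable[str]) -> bool:
    """Diminishing returns: for A subset of B and x outside B,
    f(A + x) - f(A) >= f(B + x) - f(B)."""
    ground_set = frozenset(ground)
    all_sets = subsets(ground_set)
    for a in all_sets:
        for b in all_sets:
            if not a <= b:
                continue
            for x in ground_set - b:
                if f(a | {x}) - f(a) < f(b | {x}) - f(b):
                    return False
    return True


def squared_size(selected: Iterable[str]) -> int:
    """Monotone, but gains grow with set size, so not submodular."""
    return len(list(selected)) ** 2


print(is_monotone(coverage, SENSORS), is_submodular(coverage, SENSORS))
print(is_monotone(squared_size, SENSORS), is_submodular(squared_size, SENSORS))
```

The second function is an example of a monotone set function that lies outside the submodular family, which is the kind of function the abstract says EDSFs can also represent.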
Citations: 0
Edge-Based Graph Component Pooling
Pub Date : 2024-09-18 DOI: arxiv-2409.11856
T. Snelleman, B. M. Renting, H. H. Hoos, J. N. van Rijn
Graph-structured data naturally occurs in many research fields, such as chemistry and sociology. The relational information contained therein can be leveraged to statistically model graph properties through geometrical deep learning. Graph neural networks employ techniques, such as message-passing layers, to propagate local features through a graph. However, message-passing layers can be computationally expensive when dealing with large and sparse graphs. Graph pooling operators offer the possibility of removing or merging nodes in such graphs, thus lowering computational costs. However, pooling operators that remove nodes cause data loss, and pooling operators that merge nodes are often computationally expensive. We propose a pooling operator that merges nodes so as not to cause data loss but is also conceptually simple and computationally inexpensive. We empirically demonstrate that the proposed pooling operator performs statistically significantly better than edge pool on four popular benchmark datasets while reducing time complexity and the number of trainable parameters by 70.6% on average. Compared to another maximally powerful method, the Graph Isomorphism Network, we show that we outperform it on two popular benchmark datasets while reducing the number of learnable parameters on average by 60.9%.
Citations: 0
Unsupervised Domain Adaptation Via Data Pruning
Pub Date : 2024-09-18 DOI: arxiv-2409.12076
Andrea Napoli, Paul White
The removal of carefully selected examples from training data has recently emerged as an effective way of improving the robustness of machine learning models. However, the best way to select these examples remains an open question. In this paper, we consider the problem from the perspective of unsupervised domain adaptation (UDA). We propose AdaPrune, a method for UDA whereby training examples are removed to attempt to align the training distribution to that of the target data. By adopting the maximum mean discrepancy (MMD) as the criterion for alignment, the problem can be neatly formulated and solved as an integer quadratic program. We evaluate our approach on a real-world domain shift task of bioacoustic event detection. As a method for UDA, we show that AdaPrune outperforms related techniques, and is complementary to other UDA algorithms such as CORAL. Our analysis of the relationship between the MMD and model accuracy, along with t-SNE plots, validates the proposed method as a principled and well-founded way of performing data pruning.
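To make the alignment criterion concrete, the sketch below computes a biased estimate of the squared MMD between two small samples and shows that removing an outlying training point lowers it. This only evaluates the criterion; it does not reproduce AdaPrune or its integer quadratic program. The RBF kernel, the `gamma` value, and the toy data are assumptions for illustration.

```python
import math
from typing import Sequence

Point = Sequence[float]


def rbf(x: Point, y: Point, gamma: float = 1.0) -> float:
    """Gaussian (RBF) kernel; the kernel choice is an assumption here."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_kernel(xs: Sequence[Point], ys: Sequence[Point], gamma: float) -> float:
    """Average kernel value over all pairs (x, y)."""
    total = sum(rbf(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(source: Sequence[Point], target: Sequence[Point],
                gamma: float = 1.0) -> float:
    """Biased (V-statistic) estimate of the squared MMD."""
    return (mean_kernel(source, source, gamma)
            - 2.0 * mean_kernel(source, target, gamma)
            + mean_kernel(target, target, gamma))


# Hypothetical 1-D data: the training set contains one point far from the target.
train = [(0.0,), (0.2,), (4.0,)]
target = [(0.1,), (0.3,)]
pruned = train[:2]  # drop the outlying example

print(mmd_squared(train, target))
print(mmd_squared(pruned, target))  # smaller: pruning improved alignment
```

Choosing which examples to drop so that this quantity is minimized is a combinatorial problem, which is why the abstract describes solving it as an integer quadratic program rather than by inspection.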
Citations: 0
Multi-Grid Graph Neural Networks with Self-Attention for Computational Mechanics
Pub Date : 2024-09-18 DOI: arxiv-2409.11899
Paul Garnier, Jonathan Viquerat, Elie Hachem
Advancements in finite element methods have become essential in various disciplines, and in particular for Computational Fluid Dynamics (CFD), driving research efforts for improved precision and efficiency. While Convolutional Neural Networks (CNNs) have found success in CFD by mapping meshes into images, recent attention has turned to leveraging Graph Neural Networks (GNNs) for direct mesh processing. This paper introduces a novel model merging Self-Attention with Message Passing in GNNs, achieving a 15% reduction in RMSE on the well-known flow-past-a-cylinder benchmark. Furthermore, a dynamic mesh pruning technique based on Self-Attention is proposed, which leads to a robust GNN-based multigrid approach, also reducing RMSE by 15%. Additionally, a new self-supervised training method based on BERT is presented, resulting in a 25% RMSE reduction. The paper includes an ablation study, and the model outperforms state-of-the-art models on several challenging datasets, promising advancements similar to those recently achieved in natural language and image processing. Finally, the paper introduces a dataset with meshes larger than existing ones by at least an order of magnitude. Code and datasets will be released at https://github.com/DonsetPG/multigrid-gnn.
Citations: 0
Towards Interpretable End-Stage Renal Disease (ESRD) Prediction: Utilizing Administrative Claims Data with Explainable AI Techniques
Pub Date : 2024-09-18 DOI: arxiv-2409.12087
Yubo Li, Saba Al-Sayouri, Rema Padman
This study explores the potential of utilizing administrative claims data, combined with advanced machine learning and deep learning techniques, to predict the progression of Chronic Kidney Disease (CKD) to End-Stage Renal Disease (ESRD). We analyze a comprehensive, 10-year dataset provided by a major health insurance organization to develop prediction models for multiple observation windows using traditional machine learning methods such as Random Forest and XGBoost as well as deep learning approaches such as Long Short-Term Memory (LSTM) networks. Our findings demonstrate that the LSTM model, particularly with a 24-month observation window, exhibits superior performance in predicting ESRD progression, outperforming existing models in the literature. We further apply SHapley Additive exPlanations (SHAP) analysis to enhance interpretability, providing insights into the impact of individual features on predictions at the individual patient level. This study underscores the value of leveraging administrative claims data for CKD management and predicting ESRD progression.
Citations: 0
Enhancing Semi-Supervised Learning via Representative and Diverse Sample Selection
Pub Date : 2024-09-18 DOI: arxiv-2409.11653
Qian Shao, Jiangrui Kang, Qiyuan Chen, Zepeng Li, Hongxia Xu, Yiwen Cao, Jiajuan Liang, Jian Wu
Semi-Supervised Learning (SSL) has become a preferred paradigm in many deep learning tasks, as it reduces the need for human labor. Previous studies primarily focus on effectively utilizing the labeled and unlabeled data to improve performance. However, we observe that how samples are selected for labeling also significantly impacts performance, particularly under extremely low-budget settings. The sample selection task in SSL has long been under-explored. To fill this gap, we propose a Representative and Diverse Sample Selection approach (RDSS). By adopting a modified Frank-Wolfe algorithm to minimize a novel criterion, $\alpha$-Maximum Mean Discrepancy ($\alpha$-MMD), RDSS samples a representative and diverse subset for annotation from the unlabeled data. We demonstrate that minimizing $\alpha$-MMD enhances the generalization ability of low-budget learning. Experimental results show that RDSS consistently improves the performance of several popular SSL frameworks and outperforms the state-of-the-art sample selection approaches used in Active Learning (AL) and Semi-Supervised Active Learning (SSAL), even with constrained annotation budgets.
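The trade-off between representativeness and diversity can be illustrated with kernel herding, a standard greedy scheme that is closely related to Frank-Wolfe on an MMD objective. The sketch below is not the RDSS algorithm: it uses the plain MMD idea rather than the $\alpha$-MMD criterion, and the RBF kernel, the 1-D toy pool, and the name `herding_select` are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def rbf(x: float, y: float, gamma: float = 1.0) -> float:
    """Gaussian (RBF) kernel on scalars; the kernel choice is an assumption."""
    return math.exp(-gamma * (x - y) ** 2)


def herding_select(pool: Sequence[float], budget: int,
                   gamma: float = 1.0) -> List[int]:
    """Greedily pick up to `budget` indices from a 1-D pool via kernel herding."""
    n = len(pool)
    # Representativeness: average similarity of each candidate to the pool.
    mean_sim = [sum(rbf(x, y, gamma) for y in pool) / n for x in pool]
    selected: List[int] = []
    for _ in range(min(budget, n)):
        best, best_score = -1, -math.inf
        for i in range(n):
            if i in selected:
                continue
            # Diversity: subtract similarity to the samples already chosen.
            redundancy = sum(rbf(pool[i], pool[j], gamma) for j in selected)
            score = mean_sim[i] - redundancy / (len(selected) + 1)
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    return selected


# Hypothetical unlabeled pool with two well-separated clusters.
pool = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
print(sorted(herding_select(pool, budget=2)))  # one central point per cluster
```

With a budget of two, the first pick is the most central point of one cluster, and the redundancy term then steers the second pick to the other cluster, which is the behavior a low-budget labeling scheme wants.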
Citations: 0
Location based Probabilistic Load Forecasting of EV Charging Sites: Deep Transfer Learning with Multi-Quantile Temporal Convolutional Network
Pub Date : 2024-09-18 DOI: arxiv-2409.11862
Mohammad Wazed Ali (Intelligent Embedded Systems), Asif bin Mustafa (School of CIT, Technical University of Munich, Munich, Germany), Md. Aukerul Moin Shuvo (Dept. of Computer Science and Engineering, Rajshahi University of Engg. & Technology, Rajshahi, Bangladesh), Bernhard Sick (Intelligent Embedded Systems)
Electrification of vehicles is a potential way of reducing fossil fuel usage and thus lessening environmental pollution. Electric Vehicles (EVs) of various types for different transport modes (including air, water, and land) are evolving. Moreover, different EV user groups (commuters, commercial or domestic users, drivers) may use different charging infrastructures (public, private, home, and workplace) at various times. Therefore, usage patterns and energy demand are very stochastic. Characterizing and forecasting the charging demand of these diverse EV usage profiles is essential in preventing power outages. Previously developed data-driven load models are limited to specific use cases and locations. None of these models are simultaneously adaptive enough to transfer knowledge of day-ahead forecasting among EV charging sites of diverse locations, trained with limited data, and cost-effective. This article presents a location-based load forecasting of EV charging sites using a deep Multi-Quantile Temporal Convolutional Network (MQ-TCN) to overcome the limitations of earlier models. We conducted our experiments on data from four charging sites, namely Caltech, JPL, Office-1, and NREL, which have diverse EV user types like students, full-time and part-time employees, random visitors, etc. With a Prediction Interval Coverage Probability (PICP) score of 93.62%, our proposed deep MQ-TCN model exhibited a remarkable 28.93% improvement over the XGBoost model for a day-ahead load forecasting at the JPL charging site. By transferring knowledge with the inductive Transfer Learning (TL) approach, the MQ-TCN model achieved a 96.88% PICP score for the load forecasting task at the NREL site using only two weeks of data.
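For readers unfamiliar with the metric, PICP is commonly computed as the fraction of observations that fall inside the predicted interval. Multi-quantile models are commonly trained with the pinball (quantile) loss. The sketch below shows both under those standard definitions. The function names and toy numbers are hypothetical, and the abstract does not state which loss or interval bounds the authors actually used.

```python
def picp(y_true, lower, upper):
    """Prediction Interval Coverage Probability.

    Fraction of observed values that lie within [lower, upper].
    """
    if not y_true or not (len(y_true) == len(lower) == len(upper)):
        raise ValueError("inputs must be non-empty and of equal length")
    covered = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return covered / len(y_true)


def pinball_loss(y_true, y_pred, quantile):
    """Mean pinball (quantile) loss for a single quantile level in (0, 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += max(quantile * diff, (quantile - 1.0) * diff)
    return total / len(y_true)


# Toy day-ahead loads (made-up values) with lower/upper quantile forecasts.
observed = [1.0, 2.0, 3.0, 4.0]
lower_q = [0.0, 0.0, 4.0, 3.0]
upper_q = [2.0, 3.0, 5.0, 5.0]
coverage = picp(observed, lower_q, upper_q)
```

In this toy case three of the four observations fall inside their intervals, so the coverage is 0.75. A reported PICP of 93.62% would mean that about 94 of every 100 observed loads fell inside the predicted interval.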
Citations: 0