Machine learning and knowledge discovery in databases : European Conference, ECML PKDD ... : proceedings. ECML PKDD (Conference): latest articles

Modeling Graphs Beyond Hyperbolic: Graph Neural Networks in Symmetric Positive Definite Matrices
Weichen Zhao, Federico López, J. M. Riestenberg, M. Strube, Diaaeldin Taha, Steve J. Trettel
Recent research has shown that alignment between the structure of graph data and the geometry of an embedding space is crucial for learning high-quality representations of the data. The uniform geometry of Euclidean and hyperbolic spaces allows for representing graphs with uniform geometric and topological features, such as grids and hierarchies, with minimal distortion. However, real-world graph data is characterized by multiple types of geometric and topological features, necessitating more sophisticated geometric embedding spaces. In this work, we utilize the Riemannian symmetric space of symmetric positive definite matrices (SPD) to construct graph neural networks that can robustly handle complex graphs. To do this, we develop an innovative library that leverages the SPD gyrocalculus tools \cite{lopez2021gyroSPD} to implement the building blocks of five popular graph neural networks in SPD. Experimental results demonstrate that our graph neural networks in SPD substantially outperform their counterparts in Euclidean and hyperbolic spaces, as well as the Cartesian product thereof, on complex graphs for node and graph classification tasks. We release the library and datasets at https://github.com/andyweizhao/SPD4GNNs.
DOI: https://doi.org/10.48550/arXiv.2306.14064 (published 2023-06-24)
Citations: 2
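To make the geometry in the SPD entry above a little more concrete, here is a minimal sketch of the affine-invariant Riemannian distance between two 2x2 SPD matrices, computed from the eigenvalues of A^{-1}B. This is a generic textbook construction and an assumption on our part: the abstract does not say which metric or gyro-operations the released library uses, and the function name `spd_distance_2x2` is hypothetical.

```python
import math

Matrix2 = tuple[tuple[float, float], tuple[float, float]]


def spd_distance_2x2(a: Matrix2, b: Matrix2) -> float:
    """Affine-invariant distance sqrt(sum_i log^2 lambda_i(A^{-1} B)) for 2x2 SPD inputs."""
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b
    det_a = a11 * a22 - a12 * a21
    det_b = b11 * b22 - b12 * b21
    if a11 <= 0 or b11 <= 0 or det_a <= 0 or det_b <= 0:
        raise ValueError("inputs must be symmetric positive definite")
    # Inverse of A via the 2x2 adjugate formula.
    inv = ((a22 / det_a, -a12 / det_a), (-a21 / det_a, a11 / det_a))
    # M = A^{-1} B has real positive eigenvalues; in 2x2 they follow from trace and determinant.
    trace_m = (inv[0][0] * b11 + inv[0][1] * b21) + (inv[1][0] * b12 + inv[1][1] * b22)
    det_m = det_b / det_a
    disc = max(trace_m * trace_m - 4.0 * det_m, 0.0)  # clamp tiny negative rounding error
    lam1 = (trace_m + math.sqrt(disc)) / 2.0
    lam2 = det_m / lam1
    return math.sqrt(math.log(lam1) ** 2 + math.log(lam2) ** 2)


identity: Matrix2 = ((1.0, 0.0), (0.0, 1.0))
scaled: Matrix2 = ((math.e, 0.0), (0.0, math.e))
print(spd_distance_2x2(identity, scaled))  # approximately sqrt(2)
```

The distance is unchanged when both matrices are transformed by the same invertible congruence, which is one reason SPD is attractive as an embedding space with non-uniform geometry.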
Binary domain generalization for sparsifying binary neural networks
Riccardo Schiavone, Francesco Galati, Maria A. Zuluaga
Binary neural networks (BNNs) are an attractive solution for developing and deploying deep neural network (DNN)-based applications on resource-constrained devices. Despite their success, BNNs still suffer from a fixed and limited compression factor, which may be explained by the fact that existing pruning methods for full-precision DNNs cannot be directly applied to BNNs. In fact, weight pruning of BNNs leads to performance degradation, which suggests that the standard binarization domain of BNNs is not well adapted to the task. This work proposes a novel, more general binary domain that extends the standard one and is more robust to pruning techniques, thus guaranteeing improved compression and avoiding severe performance losses. We present a closed-form solution for quantizing the weights of a full-precision network into the proposed binary domain. Finally, we show the flexibility of our method, which can be combined with other pruning strategies. Experiments on CIFAR-10 and CIFAR-100 demonstrate that the novel approach is able to generate efficient sparse networks with reduced memory usage and run-time latency, while maintaining performance.
DOI: https://doi.org/10.48550/arXiv.2306.13515 (published 2023-06-23)
Citations: 0
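The abstract above does not spell out the proposed binary domain or its closed-form solution, so the following is only an illustrative sketch under our own assumptions. It contrasts the standard least-squares binarization onto {-alpha, +alpha} (alpha is the mean absolute weight) with a hypothetical two-level domain {low, high} whose levels are free; for a fixed split, the least-squares optimum of each level is simply the mean of its group. The function names and the threshold are hypothetical.

```python
def binarize_standard(weights: list[float]) -> tuple[float, list[int]]:
    """Least-squares fit of the weights to alpha * sign(w), with signs in {-1, +1}."""
    signs = [1 if w >= 0 else -1 for w in weights]
    alpha = sum(abs(w) for w in weights) / len(weights)
    return alpha, signs


def quantize_two_levels(weights: list[float], threshold: float) -> tuple[float, float, list[int]]:
    """Fit two free levels (low, high); weights >= threshold map to high (hypothetical scheme)."""
    mask = [1 if w >= threshold else 0 for w in weights]
    high_group = [w for w, m in zip(weights, mask) if m]
    low_group = [w for w, m in zip(weights, mask) if not m]
    high = sum(high_group) / len(high_group) if high_group else 0.0
    low = sum(low_group) / len(low_group) if low_group else 0.0
    return low, high, mask


def squared_error(weights: list[float], approx: list[float]) -> float:
    return sum((w - a) ** 2 for w, a in zip(weights, approx))


weights = [0.9, 1.1, 0.1, -0.1]  # two large weights, two near zero (prunable)
alpha, signs = binarize_standard(weights)
low, high, mask = quantize_two_levels(weights, threshold=0.5)
err_standard = squared_error(weights, [alpha * s for s in signs])
err_two_level = squared_error(weights, [high if m else low for m in mask])
print(err_standard, err_two_level)
```

In this toy case the free low level lands on zero, so the near-zero weights can be dropped without the error that forcing them to plus or minus alpha would introduce, which is the intuition behind wanting a binary domain that tolerates pruning.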
Learning Conditional Instrumental Variable Representation for Causal Effect Estimation
Debo Cheng, Ziqi Xu, Jiuyong Li, Lin Liu, T. Le, Jixue Liu
One of the fundamental challenges in causal inference is to estimate the causal effect of a treatment on its outcome of interest from observational data. However, causal effect estimation often suffers from confounding bias caused by unmeasured confounders that affect both the treatment and the outcome. The instrumental variable (IV) approach is a powerful way to eliminate the confounding bias from latent confounders. However, existing IV-based estimators require a nominated IV and, for a conditional IV (CIV), the corresponding conditioning set as well. This limits the application of IV-based estimators. In this paper, by leveraging the advantage of disentangled representation learning, we propose a novel method, named DVAE.CIV, for learning and disentangling the representations of a CIV and of its conditioning set for causal effect estimation from data with latent confounders. Extensive experimental results on both synthetic and real-world datasets demonstrate the superiority of the proposed DVAE.CIV method over existing causal effect estimators.
DOI: https://doi.org/10.48550/arXiv.2306.12453 (published 2023-06-21)
Citations: 0
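As background for the IV entry above, the sketch below shows the classical single-instrument estimator, cov(Z, Y) / cov(Z, T), on simulated data with a latent confounder. It is not DVAE.CIV: there is no conditioning set and no representation learning, and the data-generating process, coefficients, and function names are assumptions chosen only for illustration.

```python
import random


def covariance(xs: list[float], ys: list[float]) -> float:
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / len(xs)


def simulate(n: int, true_effect: float, seed: int = 0):
    """Hypothetical linear model with an unobserved confounder u and an instrument z."""
    rng = random.Random(seed)
    z, t, y = [], [], []
    for _ in range(n):
        u = rng.gauss(0, 1)        # latent confounder: affects both t and y
        zi = rng.gauss(0, 1)       # instrument: affects y only through t
        ti = zi + u + rng.gauss(0, 1)
        yi = true_effect * ti + 3.0 * u + rng.gauss(0, 1)
        z.append(zi)
        t.append(ti)
        y.append(yi)
    return z, t, y


def ols_estimate(t: list[float], y: list[float]) -> float:
    """Naive regression slope of y on t; biased when u is unobserved."""
    return covariance(t, y) / covariance(t, t)


def iv_estimate(z: list[float], t: list[float], y: list[float]) -> float:
    """Ratio (Wald) IV estimator for a single instrument."""
    return covariance(z, y) / covariance(z, t)


z, t, y = simulate(n=20000, true_effect=2.0, seed=0)
print("naive OLS:", ols_estimate(t, y))
print("IV:", iv_estimate(z, t, y))
```

With this setup the naive slope is pulled away from the true effect of 2.0 by the confounder, while the IV estimate stays close to it. The estimator only works because `z` was nominated as a valid instrument in advance, which is exactly the requirement the paper aims to relax.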
Pattern Mining for Anomaly Detection in Graphs: Application to Fraud in Public Procurement
Lucas Potin, R. Figueiredo, Vincent Labatut, C. Largeron
In the context of public procurement, several indicators called red flags are used to estimate fraud risk. They are computed from certain contract attributes and therefore depend on the contract and award notices being filled in properly. However, these attributes are very often missing in practice, which prevents red flags from being computed. Traditional fraud detection approaches focus on tabular data only, considering each contract separately, and are therefore very sensitive to this issue. In this work, we adopt a graph-based method that leverages relations between contracts to compensate for the missing attributes. We propose PANG (Pattern-Based Anomaly Detection in Graphs), a general supervised framework relying on pattern extraction to detect anomalous graphs in a collection of attributed graphs. Notably, it is able to identify induced subgraphs, a type of pattern widely overlooked in the literature. When benchmarked on standard datasets, its predictive performance is on par with state-of-the-art methods, with the additional advantage of being explainable. These experiments also reveal that induced patterns are more discriminative on certain datasets. When PANG is applied to public procurement data, its predictions are superior to those of other methods, and it identifies subgraph patterns that are characteristic of fraud-prone situations, thereby making it possible to better understand fraudulent behavior.
DOI: https://doi.org/10.48550/arXiv.2306.10857 (published 2023-06-19)
Citations: 1
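The distinction between general and induced subgraph patterns mentioned in the PANG abstract above can be sketched with a brute-force matcher. This is a toy illustration for tiny undirected graphs, not PANG's mining procedure; it ignores node and edge attributes, and the function name `occurs` is hypothetical.

```python
from itertools import permutations


def edge_set(pairs):
    """Undirected edges as a set of frozensets."""
    return {frozenset(pair) for pair in pairs}


def occurs(pattern_nodes, pattern_edges, graph_nodes, graph_edges, induced: bool) -> bool:
    """Check whether the pattern occurs in the graph.

    General occurrence: every pattern edge must be present in the graph.
    Induced occurrence: additionally, every pattern non-edge must be absent.
    """
    p_edges = edge_set(pattern_edges)
    g_edges = edge_set(graph_edges)
    for image in permutations(graph_nodes, len(pattern_nodes)):
        mapping = dict(zip(pattern_nodes, image))
        ok = True
        for i, u in enumerate(pattern_nodes):
            for v in pattern_nodes[i + 1:]:
                in_pattern = frozenset((u, v)) in p_edges
                in_graph = frozenset((mapping[u], mapping[v])) in g_edges
                if in_pattern and not in_graph:
                    ok = False
                if induced and in_graph and not in_pattern:
                    ok = False
        if ok:
            return True
    return False


path = (["a", "b", "c"], [("a", "b"), ("b", "c")])
triangle = ([1, 2, 3], [(1, 2), (2, 3), (1, 3)])

print(occurs(*path, *triangle, induced=False))  # True: the triangle contains a 3-node path
print(occurs(*path, *triangle, induced=True))   # False: the extra edge breaks the induced match
```

An induced pattern therefore carries information about which relations are absent as well as which are present, which is one plausible reason such patterns can be more discriminative on some datasets.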
FDTI: Fine-grained Deep Traffic Inference with Roadnet-enriched Graph
Zhanyu Liu, Chumeng Liang, Guanjie Zheng, Hua Wei
This paper proposes the fine-grained traffic prediction task (e.g., with a 1-minute interval between data points), which is essential to traffic-related downstream applications. In this setting, traffic flow is highly influenced by traffic signals, and the correlation between traffic nodes is dynamic. As a result, the traffic data is non-smooth between nodes, and previous methods, which focus on smooth traffic data, are hard to apply. To address this problem, we propose Fine-grained Deep Traffic Inference, termed FDTI. Specifically, we construct a fine-grained traffic graph based on traffic signals to model the inter-road relations. Then, a physically interpretable dynamic mobility convolution module is proposed to capture vehicle moving dynamics controlled by the traffic signals. Furthermore, traffic flow conservation is introduced to accurately infer future volume. Extensive experiments demonstrate that our method achieves state-of-the-art performance and learns traffic dynamics with good properties. To the best of our knowledge, we are the first to conduct city-level fine-grained traffic prediction.
DOI: https://doi.org/10.48550/arXiv.2306.10945 (published 2023-06-19)
Citations: 1
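The flow-conservation idea in the FDTI abstract above amounts to a bookkeeping identity: next volume equals current volume plus inflow minus outflow. The sketch below applies that identity for one time step on a toy road graph. In the paper the outflows are presumably produced by a learned, signal-aware module; here they are given as inputs, and the road names, turning ratios, and the function `step_volumes` are all hypothetical.

```python
def step_volumes(
    volumes: dict[str, float],
    outflow: dict[str, float],
    turning: dict[str, dict[str, float]],
) -> dict[str, float]:
    """Advance road volumes by one step using flow conservation.

    outflow[road] is the number of vehicles released from `road` in this step
    (for example, zero while its signal is red). turning[road] maps each
    downstream road to the fraction of released vehicles that enter it.
    """
    new_volumes = dict(volumes)
    for road, leaving in outflow.items():
        leaving = min(leaving, volumes[road])  # cannot release more vehicles than are present
        new_volumes[road] -= leaving
        for downstream, fraction in turning.get(road, {}).items():
            new_volumes[downstream] += leaving * fraction
    return new_volumes


volumes = {"A": 10.0, "B": 5.0, "C": 0.0}
outflow = {"A": 4.0, "B": 2.0, "C": 0.0}
turning = {"A": {"B": 0.5, "C": 0.5}, "B": {"C": 1.0}}

next_volumes = step_volumes(volumes, outflow, turning)
print(next_volumes)  # {'A': 6.0, 'B': 5.0, 'C': 4.0}
```

Because every turning fraction here sums to one, the total number of vehicles is preserved; fractions summing to less than one would model vehicles leaving the network.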
Neural models for Factual Inconsistency Classification with Explanations
Tathagata Raha, Mukund Choudhary, Abhinav Menon, Harshit Gupta, KV Aditya Srivatsa, Manish Gupta, Vasudeva Varma
Factual consistency is one of the most important requirements when editing high-quality documents. It is extremely important for automatic text generation systems like summarization, question answering, dialog modeling, and language modeling. Still, automated factual inconsistency detection is rather under-studied. Existing work has focused on (a) finding fake news given a knowledge base in context, or (b) detecting broad contradiction (as part of the natural language inference literature). However, there has been no work on detecting and explaining types of factual inconsistencies in text without any knowledge base in context. In this paper, we leverage existing work in linguistics to formally define five types of factual inconsistencies. Based on this categorization, we contribute a novel dataset, FICLE (Factual Inconsistency CLassification with Explanation), with ~8K samples, where each sample consists of two sentences (claim and context) annotated with the type and span of inconsistency. When the inconsistency relates to an entity type, that type is also labeled at two levels (coarse and fine-grained). Further, we leverage this dataset to train a pipeline of four neural models to predict inconsistency type with explanations, given a (claim, context) sentence pair. Explanations include the inconsistent claim fact triple, inconsistent context span, inconsistent claim component, and coarse and fine-grained inconsistent entity types. The proposed system first predicts inconsistent spans from claim and context, and then uses them to predict inconsistency types and inconsistent entity types (when the inconsistency is due to entities). We experiment with multiple Transformer-based natural language classification models as well as generative models, and find that DeBERTa performs the best. Our proposed methods provide a weighted F1 of ~87% for inconsistency type classification across the five classes.
DOI: https://doi.org/10.48550/arXiv.2306.08872 (published 2023-06-15)
Citations: 2
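The FICLE abstract above reports a weighted F1 score. For readers unfamiliar with the metric, here is a small worked example: per-class F1 scores are averaged with weights proportional to each class's support in the ground truth. The labels `"a"` and `"b"` are placeholders, not the paper's inconsistency types, and this sketch assumes the usual support-weighted definition.

```python
from collections import Counter


def weighted_f1(y_true: list[str], y_pred: list[str]) -> float:
    """Support-weighted mean of per-class F1 scores."""
    support = Counter(y_true)
    total = 0.0
    for label, count in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = count - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += count * f1
    return total / len(y_true)


y_true = ["a", "a", "a", "b"]
y_pred = ["a", "a", "b", "b"]
# Class "a": tp=2, fp=0, fn=1 -> F1 = 4/5.  Class "b": tp=1, fp=1, fn=0 -> F1 = 2/3.
# Weighted F1 = (3 * 4/5 + 1 * 2/3) / 4 = 23/30.
print(weighted_f1(y_true, y_pred))
```

Weighting by support keeps frequent classes from being drowned out by rare ones, but it also means a model can score well while doing poorly on small classes, so per-class scores are worth checking alongside it.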
Towards Memory-Efficient Training for Extremely Large Output Spaces - Learning with 500k Labels on a Single Commodity GPU
Erik Schultheis, Rohit Babbar
In classification problems with large output spaces (up to millions of labels), the last layer can require an enormous amount of memory. Using sparse connectivity would drastically reduce the memory requirements, but as we show below, it can result in much diminished predictive performance of the model. Fortunately, we found that this can be mitigated by introducing a penultimate layer of intermediate size. We further demonstrate that one can constrain the connectivity of the sparse layer to be uniform, in the sense that each output neuron will have the exact same number of incoming connections. This allows for efficient implementations of sparse matrix multiplication and connection redistribution on GPU hardware. Via a custom CUDA implementation, we show that the proposed approach can scale to datasets with 670,000 labels on a single commodity GPU with only 4GB memory.
DOI: https://doi.org/10.48550/arXiv.2306.03725 (published 2023-06-06)
Citations: 1
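The uniform-connectivity constraint described in the abstract above (every output neuron has the same number of incoming connections) means the sparse layer can be stored as two rectangular arrays, one of input indices and one of weights, each of shape (outputs, fan-in). The sketch below shows that layout in plain Python; it is an assumption about the data structure rather than the authors' CUDA implementation, and the hidden size of 512 and fan-in of 32 in the size comparison are made-up figures.

```python
def uniform_sparse_forward(
    x: list[float],
    indices: list[list[int]],
    weights: list[list[float]],
) -> list[float]:
    """Forward pass of a sparse layer where every output has the same fan-in.

    indices[j][k] is the input feeding the k-th connection of output j,
    and weights[j][k] is that connection's weight.
    """
    fan_in = len(indices[0])
    if any(len(row) != fan_in for row in indices):
        raise ValueError("every output neuron must have the same number of connections")
    return [
        sum(w * x[i] for i, w in zip(idx_row, w_row))
        for idx_row, w_row in zip(indices, weights)
    ]


def stored_values(n_in: int, n_out: int, fan_in: int) -> tuple[int, int]:
    """Values stored by a dense layer vs. the uniform sparse layer (weights plus indices)."""
    return n_in * n_out, 2 * n_out * fan_in


x = [1.0, 2.0, 3.0, 4.0]
indices = [[0, 1], [2, 3], [0, 3]]
weights = [[1.0, 1.0], [0.5, 0.5], [2.0, -1.0]]
print(uniform_sparse_forward(x, indices, weights))  # [3.0, 3.5, -2.0]

dense, sparse = stored_values(n_in=512, n_out=670_000, fan_in=32)
print(dense, sparse)
```

Because both arrays are rectangular, no per-row offsets are needed (unlike a general compressed sparse row layout), and each output can be handled by the same amount of work, which is what makes a regular GPU kernel straightforward.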
Lumos in the Night Sky: AI-enabled Visual Tool for Exploring Night-Time Light Patterns
Jakob Hederich, Shreya Ghosh, Zeyu He, P. Mitra
We introduce NightPulse, an interactive tool for night-time light (NTL) data visualization and analytics, which enables researchers and stakeholders to explore and analyze NTL data on a user-friendly platform. Powered by an efficient system architecture, NightPulse supports image segmentation, clustering, and change pattern detection to identify urban development and sprawl patterns. It captures temporal trends of NTL and semantics of cities, answering questions about demographic factors, city boundaries, and unusual differences.
DOI: https://doi.org/10.48550/arXiv.2306.03195 (published 2023-06-05)
Citations: 0
Symbolic Regression via Control Variable Genetic Programming
Nan Jiang, Yexiang Xue
Learning symbolic expressions directly from experiment data is a vital step in AI-driven scientific discovery. Nevertheless, state-of-the-art approaches are limited to learning simple expressions. Regressing expressions involving many independent variables remains out of reach. Motivated by the control variable experiments widely used in science, we propose Control Variable Genetic Programming (CVGP) for symbolic regression over many independent variables. CVGP expedites symbolic expression discovery via customized experiment design, rather than learning from a fixed dataset collected a priori. CVGP starts by fitting simple expressions involving a small set of independent variables using genetic programming, under controlled experiments where the other variables are held constant. It then extends expressions learned in previous generations by adding new independent variables, using new control variable experiments in which these variables are allowed to vary. Theoretically, we show that CVGP, as an incremental building approach, can yield an exponential reduction in the search space when learning a class of expressions. Experimentally, CVGP outperforms several baselines in learning symbolic expressions involving multiple independent variables.
DOI: https://doi.org/10.48550/arXiv.2306.08057 · Published: 2023-05-25
Citations: 1
Interpretable Regional Descriptors: Hyperbox-Based Local Explanations
Susanne Dandl, Giuseppe Casalicchio, Bernd Bischl, Ludwig Bothmann
This work introduces interpretable regional descriptors, or IRDs, for local, model-agnostic interpretations. IRDs are hyperboxes that describe how an observation's feature values can be changed without affecting its prediction. They justify a prediction by providing a set of "even if" arguments (semi-factual explanations), and they indicate which features affect a prediction and whether pointwise biases or implausibilities exist. A concrete use case shows that this is valuable for both machine learning modelers and persons subject to a decision. We formalize the search for IRDs as an optimization problem and introduce a unifying framework for computing IRDs that covers desiderata, initialization techniques, and a post-processing method. We show how existing hyperbox methods can be adapted to fit into this unified framework. A benchmark study compares the methods based on several quality measures and identifies two strategies to improve IRDs.
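To make the hyperbox idea concrete, here is a minimal sketch of a naive greedy box expansion around one observation. It is not one of the methods compared in the paper: the `predict` classifier, the step size, and the sampling-based purity check are all assumptions chosen for illustration, and the check is only approximate because it tests a finite sample of points.

```python
import numpy as np

def predict(X):
    """Hypothetical classifier: class 1 when x0 + x1 > 1, else class 0."""
    X = np.atleast_2d(X)
    return (X[:, 0] + X[:, 1] > 1.0).astype(int)

def box_is_pure(lower, upper, target, n_samples=500, seed=0):
    """Approximate check: do sampled points inside the box all get the target prediction?"""
    rng = np.random.default_rng(seed)
    samples = rng.uniform(lower, upper, size=(n_samples, lower.size))
    return bool(np.all(predict(samples) == target))

def grow_box(x, bounds, step=0.05, max_iter=200):
    """Greedily widen one side of one feature at a time while the box stays pure."""
    x = np.asarray(x, dtype=float)
    lower, upper = x.copy(), x.copy()
    target = predict(x)[0]
    for _ in range(max_iter):
        grew = False
        for j in range(x.size):
            for sign in (-1.0, 1.0):
                new_lower, new_upper = lower.copy(), upper.copy()
                if sign < 0:
                    new_lower[j] = max(lower[j] - step, bounds[j][0])
                else:
                    new_upper[j] = min(upper[j] + step, bounds[j][1])
                changed = new_lower[j] != lower[j] or new_upper[j] != upper[j]
                if changed and box_is_pure(new_lower, new_upper, target):
                    lower, upper = new_lower, new_upper
                    grew = True
        if not grew:
            break
    return lower, upper

x = np.array([1.5, 1.5])
lower, upper = grow_box(x, bounds=[(0.0, 2.0), (0.0, 2.0)])
print("prediction stays", predict(x)[0], "for features within", lower, "to", upper)
```

The resulting interval for each feature reads as an "even if" statement: the prediction is unchanged even if that feature moves anywhere inside its range, as long as the other features stay inside theirs.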
DOI: https://doi.org/10.48550/arXiv.2305.02780 · Published: 2023-05-04
Citations: 0