Regularized Optimal Transport Layers for Generalized Global Pooling Operations

IF 20.8 · Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE) · Zone 1, Computer Science · IEEE Transactions on Pattern Analysis and Machine Intelligence · Pub Date: 2022-12-13 · DOI: 10.48550/arXiv.2212.06339
Hongteng Xu, Minjie Cheng
{"title":"广义全局池化操作的正则化最优传输层","authors":"Hongteng Xu, Minjie Cheng","doi":"10.48550/arXiv.2212.06339","DOIUrl":null,"url":null,"abstract":"Global pooling is one of the most significant operations in many machine learning models and tasks, which works for information fusion and structured data (like sets and graphs) representation. However, without solid mathematical fundamentals, its practical implementations often depend on empirical mechanisms and thus lead to sub-optimal, even unsatisfactory performance. In this work, we develop a novel and generalized global pooling framework through the lens of optimal transport. The proposed framework is interpretable from the perspective of expectation-maximization. Essentially, it aims at learning an optimal transport across sample indices and feature dimensions, making the corresponding pooling operation maximize the conditional expectation of input data. We demonstrate that most existing pooling methods are equivalent to solving a regularized optimal transport (ROT) problem with different specializations, and more sophisticated pooling operations can be implemented by hierarchically solving multiple ROT problems. Making the parameters of the ROT problem learnable, we develop a family of regularized optimal transport pooling (ROTP) layers. We implement the ROTP layers as a new kind of deep implicit layer. Their model architectures correspond to different optimization algorithms. We test our ROTP layers in several representative set-level machine learning scenarios, including multi-instance learning (MIL), graph classification, graph set representation, and image classification. Experimental results show that applying our ROTP layers can reduce the difficulty of the design and selection of global pooling - our ROTP layers may either imitate some existing global pooling methods or lead to some new pooling layers fitting data better. The code is available at https://github.com/SDS-Lab/ROT-Pooling.","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":" ","pages":""},"PeriodicalIF":20.8000,"publicationDate":"2022-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Regularized Optimal Transport Layers for Generalized Global Pooling Operations\",\"authors\":\"Hongteng Xu, Minjie Cheng\",\"doi\":\"10.48550/arXiv.2212.06339\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Global pooling is one of the most significant operations in many machine learning models and tasks, which works for information fusion and structured data (like sets and graphs) representation. However, without solid mathematical fundamentals, its practical implementations often depend on empirical mechanisms and thus lead to sub-optimal, even unsatisfactory performance. In this work, we develop a novel and generalized global pooling framework through the lens of optimal transport. The proposed framework is interpretable from the perspective of expectation-maximization. Essentially, it aims at learning an optimal transport across sample indices and feature dimensions, making the corresponding pooling operation maximize the conditional expectation of input data. We demonstrate that most existing pooling methods are equivalent to solving a regularized optimal transport (ROT) problem with different specializations, and more sophisticated pooling operations can be implemented by hierarchically solving multiple ROT problems. 
Making the parameters of the ROT problem learnable, we develop a family of regularized optimal transport pooling (ROTP) layers. We implement the ROTP layers as a new kind of deep implicit layer. Their model architectures correspond to different optimization algorithms. We test our ROTP layers in several representative set-level machine learning scenarios, including multi-instance learning (MIL), graph classification, graph set representation, and image classification. Experimental results show that applying our ROTP layers can reduce the difficulty of the design and selection of global pooling - our ROTP layers may either imitate some existing global pooling methods or lead to some new pooling layers fitting data better. The code is available at https://github.com/SDS-Lab/ROT-Pooling.\",\"PeriodicalId\":13426,\"journal\":{\"name\":\"IEEE Transactions on Pattern Analysis and Machine Intelligence\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":20.8000,\"publicationDate\":\"2022-12-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Pattern Analysis and Machine Intelligence\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.48550/arXiv.2212.06339\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Pattern Analysis and Machine Intelligence","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.48550/arXiv.2212.06339","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 2

Abstract

Global pooling is one of the most significant operations in many machine learning models and tasks, serving information fusion and the representation of structured data such as sets and graphs. However, without a solid mathematical foundation, its practical implementations often rely on empirical mechanisms and thus yield sub-optimal, or even unsatisfactory, performance. In this work, we develop a novel and generalized global pooling framework through the lens of optimal transport. The proposed framework is interpretable from the perspective of expectation-maximization. Essentially, it aims at learning an optimal transport across sample indices and feature dimensions, making the corresponding pooling operation maximize the conditional expectation of the input data. We demonstrate that most existing pooling methods are equivalent to solving a regularized optimal transport (ROT) problem with different specializations, and that more sophisticated pooling operations can be implemented by hierarchically solving multiple ROT problems. By making the parameters of the ROT problem learnable, we develop a family of regularized optimal transport pooling (ROTP) layers. We implement the ROTP layers as a new kind of deep implicit layer, whose model architectures correspond to different optimization algorithms. We test our ROTP layers in several representative set-level machine learning scenarios, including multi-instance learning (MIL), graph classification, graph set representation, and image classification. Experimental results show that applying our ROTP layers can reduce the difficulty of designing and selecting global pooling operations: our ROTP layers can either imitate existing global pooling methods or lead to new pooling layers that fit the data better. The code is available at https://github.com/SDS-Lab/ROT-Pooling.
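To make the transport view of pooling concrete, below is a minimal PyTorch sketch of pooling as an entropic-regularized optimal transport problem between feature dimensions and sample indices, solved by a fixed number of unrolled Sinkhorn iterations so that the layer remains differentiable. This is an illustrative sketch under simplifying assumptions (entropy regularization only, fixed uniform marginals; the class name SinkhornROTPooling and its hyperparameters are hypothetical), not the authors' released ROTP implementation, which makes the ROT parameters learnable and realizes several solver architectures; see the linked repository for the actual code.

```python
# Hypothetical sketch: pooling as entropic-regularized optimal transport.
# X has shape (batch, D, N): D feature dimensions, N set elements.
# The transport plan P (D x N) has row marginal mu over feature dimensions and
# column marginal eta over samples; the pooled feature is the conditional
# expectation f_d = sum_n (P_dn / mu_d) * x_dn.
import torch
import torch.nn as nn


class SinkhornROTPooling(nn.Module):
    def __init__(self, epsilon: float = 0.5, n_iters: int = 30):
        super().__init__()
        self.epsilon = epsilon   # entropic regularization weight
        self.n_iters = n_iters   # number of unrolled Sinkhorn updates

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, d, n = x.shape
        mu = torch.full((b, d, 1), 1.0 / d, device=x.device)   # row marginal
        eta = torch.full((b, n, 1), 1.0 / n, device=x.device)  # column marginal
        # Gibbs kernel for maximizing <X, P>; subtracting the per-instance max
        # only rescales the kernel and keeps the exponential numerically stable.
        k = torch.exp((x - x.amax(dim=(-2, -1), keepdim=True)) / self.epsilon)
        u, v = torch.ones_like(mu), torch.ones_like(eta)
        for _ in range(self.n_iters):            # unrolled Sinkhorn-Knopp updates
            u = mu / (k @ v).clamp_min(1e-12)
            v = eta / (k.transpose(1, 2) @ u).clamp_min(1e-12)
        p = u * k * v.transpose(1, 2)            # transport plan, shape (b, d, n)
        return (p * x).sum(dim=-1) / mu.squeeze(-1)   # pooled features, shape (b, d)


# Example: pool a batch of 4 sets, each with 10 elements of dimension 64.
x = torch.randn(4, 64, 10)
print(SinkhornROTPooling()(x).shape)   # torch.Size([4, 64])
```

In this sketch, a large epsilon drives the plan toward the product of the marginals, which reproduces mean pooling, while a small epsilon concentrates each row's mass on large entries of X and sharpens the result toward max-like behavior; making epsilon or the marginals learnable would give attention-like weighting. This is consistent with, but much narrower than, the abstract's claim that different specializations of the ROT problem recover most existing pooling methods.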
Source Journal
CiteScore: 28.40
Self-citation rate: 3.00%
Articles published: 885
Average review time: 8.5 months
Journal Introduction: The IEEE Transactions on Pattern Analysis and Machine Intelligence publishes articles on all traditional areas of computer vision and image understanding, all traditional areas of pattern analysis and recognition, and selected areas of machine intelligence, with a particular emphasis on machine learning for pattern analysis. Areas such as techniques for visual search, document and handwriting analysis, medical image analysis, video and image sequence analysis, content-based retrieval of image and video, face and gesture recognition and relevant specialized hardware and/or software architectures are also covered.