Scalable kernel logistic regression with Nyström approximation: Theoretical analysis and application to discrete choice modelling

Neurocomputing, Vol. 617, Article 128975 (IF 5.5, JCR Q1, CAS Region 2, Computer Science, Artificial Intelligence). Published: 2024-11-23. DOI: 10.1016/j.neucom.2024.128975
José Ángel Martín-Baos, Ricardo García-Ródenas, Luis Rodriguez-Benitez, Michel Bierlaire
Citations: 0

Abstract

The application of kernel-based Machine Learning (ML) techniques to discrete choice modelling using large datasets often faces challenges due to memory requirements and the considerable number of parameters involved in these models. This complexity hampers the efficient training of large-scale models. This paper addresses these problems of scalability by introducing the Nyström approximation for Kernel Logistic Regression (KLR) on large datasets. The study begins by presenting a theoretical analysis in which: (i) the set of KLR solutions is characterised, (ii) an upper bound to the solution of KLR with Nyström approximation is provided, and finally (iii) a specialisation of the optimisation algorithms to Nyström KLR is described. After this, the Nyström KLR is computationally validated. Four landmark selection methods are tested, including basic uniform sampling, a k-means sampling strategy, and two non-uniform methods grounded in leverage scores. The performance of these strategies is evaluated using large-scale transport mode choice datasets and is compared with traditional methods such as Multinomial Logit (MNL) and contemporary ML techniques. The study also assesses the efficiency of various optimisation techniques for the proposed Nyström KLR model. The performance of gradient descent, Momentum, Adam, and L-BFGS-B optimisation methods is examined on these datasets. Among these strategies, the k-means Nyström KLR approach emerges as a successful solution for applying KLR to large datasets, particularly when combined with the L-BFGS-B and Adam optimisation methods. The results highlight the ability of this strategy to handle datasets exceeding 200,000 observations while maintaining robust performance.
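The abstract describes the overall pipeline: select a small set of landmark points, build a Nyström feature map that approximates the full kernel matrix, and then fit a logistic regression on those features with a standard optimiser. The following is a minimal, hedged sketch of that idea in plain NumPy. It is not the authors' implementation: it handles only the binary case (the paper addresses multinomial KLR), uses uniform landmark sampling rather than the k-means or leverage-score strategies studied in the paper, and uses plain gradient descent rather than Adam or L-BFGS-B. All function and variable names are illustrative.

```python
import numpy as np

def rbf(A, B, gamma=0.5):
    # Pairwise RBF kernel between the rows of A and the rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom_features(X, landmarks, gamma=0.5, eps=1e-8):
    # Nystrom feature map Phi = K_nm @ W^{-1/2}, so that
    # Phi @ Phi.T approximates the full n-by-n kernel matrix.
    K_nm = rbf(X, landmarks, gamma)
    W = rbf(landmarks, landmarks, gamma)
    vals, vecs = np.linalg.eigh(W)          # W is symmetric PSD
    vals = np.maximum(vals, eps)            # guard against tiny eigenvalues
    W_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
    return K_nm @ W_inv_sqrt

rng = np.random.default_rng(0)
# Toy binary "choice" data: two Gaussian blobs.
X = np.vstack([rng.normal(-1, 1, (100, 2)), rng.normal(1, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

# Uniform landmark sampling with m << n; the paper also evaluates
# k-means and leverage-score-based landmark selection.
m = 20
landmarks = X[rng.choice(len(X), m, replace=False)]
Phi = nystrom_features(X, landmarks)

# Logistic regression on the m-dimensional Nystrom features,
# trained with plain gradient descent on the log-loss.
w = np.zeros(Phi.shape[1])
for _ in range(500):
    p = 1 / (1 + np.exp(-Phi @ w))
    w -= 0.1 * Phi.T @ (p - y) / len(y)

acc = ((1 / (1 + np.exp(-Phi @ w)) > 0.5) == y).mean()
```

The key scalability point is that the optimisation runs over m landmark-induced features instead of n kernel parameters, so memory and per-iteration cost scale with the number of landmarks rather than the full dataset.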
Source journal: Neurocomputing (Engineering & Technology — Computer Science: Artificial Intelligence). CiteScore: 13.10. Self-citation rate: 10.00%. Articles per year: 1,382. Review time: 70 days.
Journal description: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics covered.