A Penalized Method for the Predictive Limit of Learning

Jie Ding, Enmao Diao, Jiawei Zhou, V. Tarokh
{"title":"学习预测极限的惩罚方法","authors":"Jie Ding, Enmao Diao, Jiawei Zhou, V. Tarokh","doi":"10.1109/ICASSP.2018.8461832","DOIUrl":null,"url":null,"abstract":"Machine learning systems learn from and make predictions by building models from observed data. Because large models tend to overfit while small models tend to underfit for a given fixed dataset, a critical challenge is to select an appropriate model (e.g. set of variables/features). Model selection aims to strike a balance between the goodness of fit and model complexity, and thus to gain reliable predictive power. In this paper, we study a penalized model selection technique that asymptotically achieves the optimal expected prediction loss (referred to as the limit of learning) offered by a set of candidate models. We prove that the proposed procedure is both statistically efficient in the sense that it asymptotically approaches the limit of learning, and computationally efficient in the sense that it can be much faster than cross validation methods. Our theory applies for a wide variety of model classes, loss functions, and high dimensions (in the sense that the models' complexity can grow with data size). We released a python package with our proposed method for general usage like logistic regression and neural networks.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"38 1","pages":"4414-4418"},"PeriodicalIF":0.0000,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"A Penalized Method for the Predictive Limit of Learning\",\"authors\":\"Jie Ding, Enmao Diao, Jiawei Zhou, V. Tarokh\",\"doi\":\"10.1109/ICASSP.2018.8461832\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Machine learning systems learn from and make predictions by building models from observed data. Because large models tend to overfit while small models tend to underfit for a given fixed dataset, a critical challenge is to select an appropriate model (e.g. set of variables/features). Model selection aims to strike a balance between the goodness of fit and model complexity, and thus to gain reliable predictive power. In this paper, we study a penalized model selection technique that asymptotically achieves the optimal expected prediction loss (referred to as the limit of learning) offered by a set of candidate models. We prove that the proposed procedure is both statistically efficient in the sense that it asymptotically approaches the limit of learning, and computationally efficient in the sense that it can be much faster than cross validation methods. Our theory applies for a wide variety of model classes, loss functions, and high dimensions (in the sense that the models' complexity can grow with data size). 
We released a python package with our proposed method for general usage like logistic regression and neural networks.\",\"PeriodicalId\":6638,\"journal\":{\"name\":\"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)\",\"volume\":\"38 1\",\"pages\":\"4414-4418\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-09-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICASSP.2018.8461832\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICASSP.2018.8461832","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Machine learning systems learn from observed data and make predictions by building models of it. Because, for a given fixed dataset, large models tend to overfit while small models tend to underfit, a critical challenge is to select an appropriate model (e.g., a set of variables/features). Model selection aims to strike a balance between goodness of fit and model complexity, and thus to attain reliable predictive power. In this paper, we study a penalized model selection technique that asymptotically achieves the optimal expected prediction loss (referred to as the limit of learning) offered by a set of candidate models. We prove that the proposed procedure is both statistically efficient, in the sense that it asymptotically approaches the limit of learning, and computationally efficient, in the sense that it can be much faster than cross-validation methods. Our theory applies to a wide variety of model classes, loss functions, and high dimensions (in the sense that model complexity can grow with data size). We released a Python package implementing the proposed method for general use, e.g., with logistic regression and neural networks.
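The abstract does not spell out the penalty itself, so the following is only a minimal sketch of the selection scheme it describes: fit each candidate model once, charge its in-sample loss a complexity penalty, and pick the minimizer. The synthetic data, the nested candidate sets, and the AIC-style penalty `len(features) / n` are illustrative assumptions, not the paper's construction; the authors' released Python package implements the actual method.

```python
# Minimal sketch of penalized model selection (illustrative assumptions:
# synthetic data, nested candidate feature sets, AIC-style penalty k/n;
# the paper's actual penalty differs and lives in its released package).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

rng = np.random.default_rng(0)
n, p = 500, 10
X = rng.normal(size=(n, p))
beta = np.array([2.0, -1.5, 1.0] + [0.0] * (p - 3))  # only 3 informative features
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ beta)))

candidates = [list(range(k)) for k in range(1, p + 1)]  # nested feature sets

def penalized_loss(features):
    """Mean in-sample log loss plus a complexity penalty (one fit per model)."""
    model = LogisticRegression(C=1e6, max_iter=1000)  # large C ~ unregularized MLE
    model.fit(X[:, features], y)
    nll = log_loss(y, model.predict_proba(X[:, features]))
    return nll + len(features) / n  # penalty grows with model size

scores = [penalized_loss(f) for f in candidates]
best = candidates[int(np.argmin(scores))]
print("selected features:", best)  # expect roughly the first 3
```

Each candidate is fit exactly once here, whereas K-fold cross-validation refits every candidate K times; that factor-of-K saving is the computational advantage the abstract claims for penalized selection.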