Model Loss and Distribution Analysis of Regression Problems in Machine Learning

Nan Yang, Zeyu Zheng, Tianran Wang
International Conference on Machine Learning and Computing, published 2019-02-22
DOI: 10.1145/3318299.3318367 (https://doi.org/10.1145/3318299.3318367)
Citations: 7

Abstract

Regression models in machine learning are based on the assumption of a normal distribution. In this paper, we study the probability distribution of the machine learning model and how the convergence values of different loss functions affect that probabilistic model. Drawing on the idea of robust regression and assuming homogeneous variance, we derive the statistical solution of a two-dimensional regression problem using the least squares method. We then obtain the parameters of the probabilistic model by maximum likelihood estimation. To compare the parameters produced by the two methods, we use the convergence values of the L1 and L2 loss functions for regression verification. Rigorous mathematical and statistical derivation yields two main conclusions. First, when the data are normally distributed and the homogeneous-variance assumption holds, the probability model conforms to a multivariate Gaussian distribution. Second, for a model that satisfies the multivariate Gaussian distribution, the choice of loss function has little influence on the parameter estimates under the law of large numbers; that is, the multivariate Gaussian model tolerates different loss functions well.
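
The comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' code or experimental setup. The synthetic data, the true parameters (slope 2.0, intercept 1.0, noise standard deviation 0.5), the sample size, and the helper names `fit_l2` and `fit_l1` are all assumptions made for illustration. The L1 fit uses iteratively reweighted least squares, a technique chosen here for brevity and not mentioned in the abstract. Under Gaussian noise with constant variance, the least squares estimate of the line coincides with the maximum likelihood estimate, and with a large sample the L1 fit should land close to it.

```python
import numpy as np


def fit_l2(x, y):
    """Least squares fit of y = w*x + b.

    Under Gaussian noise with constant variance this is also the
    maximum likelihood estimate of (w, b).
    """
    A = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef  # array [w, b]


def fit_l1(x, y, iters=100, eps=1e-6):
    """Approximate L1 (least absolute deviations) fit of y = w*x + b.

    Uses iteratively reweighted least squares (an illustrative choice):
    each step solves a weighted least squares problem with weights 1/|residual|.
    """
    A = np.column_stack([x, np.ones_like(x)])
    coef = fit_l2(x, y)  # start from the L2 solution
    for _ in range(iters):
        resid = np.abs(y - A @ coef)
        w = 1.0 / np.maximum(resid, eps)  # clip to avoid division by zero
        AtW = A.T * w                     # A^T W, with W = diag(w)
        coef = np.linalg.solve(AtW @ A, AtW @ y)
    return coef


# Hypothetical synthetic data: normal noise with homogeneous variance.
rng = np.random.default_rng(0)
n = 20000
x = rng.uniform(-3.0, 3.0, n)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.5, n)

coef_l2 = fit_l2(x, y)
coef_l1 = fit_l1(x, y)

# Maximum likelihood estimate of the noise variance: mean squared residual.
sigma2_mle = np.mean((y - (coef_l2[0] * x + coef_l2[1])) ** 2)

print("L2 / MLE parameters:", coef_l2)
print("L1 parameters:      ", coef_l1)
print("MLE noise variance: ", sigma2_mle)
```

On data like this, the two fits should agree closely, which is consistent with the abstract's second conclusion. With heavy-tailed or contaminated noise the L1 and L2 estimates can differ more, which is the setting robust regression is designed for.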