Adaptive application of machine learning models on separate segments of a data sample in regression and classification problems

I. Lebedev
Journal: Informatsionno-Upravliaiushchie Sistemy
Published: 2022-06-24
DOI: 10.31799/1684-8853-2022-3-20-30 (https://doi.org/10.31799/1684-8853-2022-3-20-30)
Citations: 1

Abstract

Introduction: Achieving the required quality indicators in machine learning solutions depends not only on the efficiency of the algorithms but also on the properties of the data. One direction in the development of classification and regression models is to take the local properties of the data into account.

Purpose: To improve quality indicators in classification and regression problems through the adaptive selection of different machine learning models on separate local segments of a data sample.

Results: We propose a method that applies a combination of different models and machine learning algorithms to subsamples in regression and classification problems. The method computes quality indicators and selects the best model for each local segment of the data sample. Finding suitable transformations of the data and time series makes it possible to form sample sets whose data have different properties (for example, variance, sampling fraction, or data range). We consider data segmentation based on a change point detection algorithm applied to time series trends and on analytical information. Using a real dataset as an example, we report experimental values of the loss function for the proposed method with different classifiers on separate segments and on the whole sample.

Practical relevance: The results can be used in classification and regression problems when developing machine learning models and methods. The proposed method improves classification and regression quality indicators by assigning to each segment the model that performs best on it.
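The selection step described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation. It assumes the segment boundaries are already known (for example, from a change point detector), uses two hypothetical candidate models (a least-squares line and a 1-nearest-neighbour regressor), synthetic data, and mean squared error as the loss. Within each segment, even-indexed points are used for fitting and odd-indexed points for validation.

```python
def fit_linear(train):
    """Least-squares line y = a + b*x; returns a prediction function."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in train) / sxx
    a = my - b * mx
    return lambda x: a + b * x


def fit_nearest(train):
    """1-nearest-neighbour regressor; ties go to the earlier training point."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]


# Hypothetical candidate pool; any set of models could be plugged in here.
CANDIDATES = {"linear": fit_linear, "nearest": fit_nearest}


def mse(predict, points):
    """Mean squared error of a prediction function on (x, y) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in points) / len(points)


def validation_losses(points):
    """Fit each candidate on even-indexed points, score it on odd-indexed ones."""
    train, valid = points[0::2], points[1::2]
    return {name: mse(fit(train), valid) for name, fit in CANDIDATES.items()}


def select_per_segment(segments):
    """For every segment, keep the candidate with the lowest validation loss."""
    chosen = {}
    for label, points in segments.items():
        losses = validation_losses(points)
        best = min(losses, key=losses.get)
        chosen[label] = (best, losses[best])
    return chosen


# Synthetic sample with two regimes (an assumption made for illustration):
# segment A follows a linear trend, segment B is a square wave.
seg_a = [(x, 2 * x + 1) for x in range(0, 20)]
seg_b = [(x, 10 if (x // 4) % 2 == 0 else -10) for x in range(20, 40)]

chosen = select_per_segment({"A": seg_a, "B": seg_b})
global_losses = validation_losses(seg_a + seg_b)

print("best model per segment:", chosen)
print("losses of single models on the whole sample:", global_losses)
```

In this toy setting the line wins on the trending segment and the nearest-neighbour model wins on the square-wave segment, while either model used alone on the whole sample incurs a higher validation loss. Real data would call for a proper validation scheme and a richer pool of models.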
Source journal: Informatsionno-Upravliaiushchie Sistemy (Mathematics: Control and Optimization)
CiteScore: 1.40
Self-citation rate: 0.00%
Number of articles published: 35