Rts: learning robustly from time series data with noisy label

IF 3.4 3区 计算机科学 Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Frontiers of Computer Science Pub Date : 2023-12-28 DOI:10.1007/s11704-023-3200-z
Zhi Zhou, Yi-Xuan Jin, Yu-Feng Li
{"title":"Rts: learning robustly from time series data with noisy label","authors":"Zhi Zhou, Yi-Xuan Jin, Yu-Feng Li","doi":"10.1007/s11704-023-3200-z","DOIUrl":null,"url":null,"abstract":"<p>Significant progress has been made in machine learning with large amounts of clean labels and static data. However, in many real-world applications, the data often changes with time and it is difficult to obtain massive clean annotations, that is, noisy labels and time series are faced simultaneously. For example, in product-buyer evaluation, each sample records the daily time behavior of users, but the long transaction period brings difficulties to analysis, and salespeople often erroneously annotate the user’s purchase behavior. Such a novel setting, to our best knowledge, has not been thoroughly studied yet, and there is still a lack of effective machine learning methods. In this paper, we present a systematic approach RTS both theoretically and empirically, consisting of two components, Noise-Tolerant Time Series Representation and Purified Oversampling Learning. Specifically, we propose reducing label noise’s destructive impact to obtain robust feature representations and potential clean samples. Then, a novel learning method based on the purified data and time series oversampling is adopted to train an unbiased model. Theoretical analysis proves that our proposal can improve the quality of the noisy data set. Empirical experiments on diverse tasks, such as the house-buyer evaluation task from real-world applications and various benchmark tasks, clearly demonstrate that our new algorithm robustly outperforms many competitive methods.</p>","PeriodicalId":12640,"journal":{"name":"Frontiers of Computer Science","volume":null,"pages":null},"PeriodicalIF":3.4000,"publicationDate":"2023-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Frontiers of Computer Science","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s11704-023-3200-z","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

Abstract

Significant progress has been made in machine learning with large amounts of clean labels and static data. However, in many real-world applications, the data often changes with time and it is difficult to obtain massive clean annotations, that is, noisy labels and time series are faced simultaneously. For example, in product-buyer evaluation, each sample records the daily time behavior of users, but the long transaction period brings difficulties to analysis, and salespeople often erroneously annotate the user’s purchase behavior. Such a novel setting, to our best knowledge, has not been thoroughly studied yet, and there is still a lack of effective machine learning methods. In this paper, we present a systematic approach RTS both theoretically and empirically, consisting of two components, Noise-Tolerant Time Series Representation and Purified Oversampling Learning. Specifically, we propose reducing label noise’s destructive impact to obtain robust feature representations and potential clean samples. Then, a novel learning method based on the purified data and time series oversampling is adopted to train an unbiased model. Theoretical analysis proves that our proposal can improve the quality of the noisy data set. Empirical experiments on diverse tasks, such as the house-buyer evaluation task from real-world applications and various benchmark tasks, clearly demonstrate that our new algorithm robustly outperforms many competitive methods.

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Rts:从带有噪声标签的时间序列数据中稳健学习
利用大量干净的标签和静态数据进行机器学习已经取得了重大进展。然而,在现实世界的许多应用中,数据往往会随时间发生变化,很难获得大量干净的注释,即同时面临噪声标签和时间序列的问题。例如,在商品购买评价中,每个样本都记录了用户每天的时间行为,但交易周期较长,给分析带来了困难,而且销售人员经常错误地注释用户的购买行为。据我们所知,这样一种新颖的环境尚未得到深入研究,而且仍然缺乏有效的机器学习方法。在本文中,我们从理论和经验两方面提出了一种系统的 RTS 方法,它由两个部分组成:噪声容忍时间序列表示和纯化过采样学习。具体来说,我们建议减少标签噪声的破坏性影响,以获得稳健的特征表示和潜在的干净样本。然后,采用一种基于净化数据和时间序列超采样的新型学习方法来训练无偏模型。理论分析证明,我们的建议可以提高噪声数据集的质量。在各种任务(如实际应用中的房屋购买评估任务和各种基准任务)上的经验实验清楚地表明,我们的新算法稳健地优于许多竞争方法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Frontiers of Computer Science
Frontiers of Computer Science COMPUTER SCIENCE, INFORMATION SYSTEMS-COMPUTER SCIENCE, SOFTWARE ENGINEERING
CiteScore
8.60
自引率
2.40%
发文量
799
审稿时长
6-12 weeks
期刊介绍: Frontiers of Computer Science aims to provide a forum for the publication of peer-reviewed papers to promote rapid communication and exchange between computer scientists. The journal publishes research papers and review articles in a wide range of topics, including: architecture, software, artificial intelligence, theoretical computer science, networks and communication, information systems, multimedia and graphics, information security, interdisciplinary, etc. The journal especially encourages papers from new emerging and multidisciplinary areas, as well as papers reflecting the international trends of research and development and on special topics reporting progress made by Chinese computer scientists.
期刊最新文献
A comprehensive survey of federated transfer learning: challenges, methods and applications DMFVAE: miRNA-disease associations prediction based on deep matrix factorization method with variational autoencoder Graph foundation model ABLkit: a Python toolkit for abductive learning SEOE: an option graph based semantically embedding method for prenatal depression detection
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1