Not all samples are equal: Boosting action segmentation via selective incremental learning

IF 8.0; Zone 2 (Computer Science); JCR Q1 (Automation & Control Systems); Engineering Applications of Artificial Intelligence; Pub Date: 2025-02-26; DOI: 10.1016/j.engappai.2025.110334
Feng Huang, Xiao-Diao Chen, Wen Wu, Weiyin Ma
Engineering Applications of Artificial Intelligence, Volume 147, Article 110334. Journal article; not open access. https://www.sciencedirect.com/science/article/pii/S0952197625003343
Citation count: 0

Abstract

Temporal action segmentation (TAS) seeks to perform classification for each frame in a video. Existing methods tend to design diverse network architectures, while overlooking the intrinsic characteristics of training samples. Notably, two key issues arise: (1) Frames around action boundaries are more ambiguous and thus pose greater difficulties for training compared to other frames; and (2) beyond the commonly used categorical labels, the total number of action instances within a video may serve as an additional, potentially vital, supervision cue. To address these issues, this paper introduces a novel method that combines a model-agnostic training strategy with an instance number alignment loss, designed to enhance the performance of existing models. Specifically, a selective incremental learning (SIL) strategy is proposed to alleviate the impact of noisy samples by progressively training the model in an easy-to-difficult manner through a dynamic sample selection mechanism. Furthermore, an instance number alignment loss (INAL) is developed to capture both global and local features simultaneously by incorporating a multi-task learning module. Extensive evaluations are conducted on three benchmark datasets, namely 50Salads, Georgia Tech egocentric activities (GTEA), and Breakfast. The experimental results demonstrate that the proposed method achieves substantial performance improvements over state-of-the-art approaches.
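
The two ideas in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how sample difficulty is measured or how the selection schedule is defined. The sketch assumes that per-frame loss is a proxy for difficulty and that the fraction of frames kept grows linearly over epochs. The names `count_instances`, `keep_ratio`, and `select_easy_frames` are hypothetical.

```python
import math


def count_instances(frame_labels):
    """Count action instances as maximal runs of identical frame labels.

    This is the quantity an instance-number supervision cue could target.
    """
    count = 0
    previous = object()  # sentinel that never equals a real label
    for label in frame_labels:
        if label != previous:
            count += 1
            previous = label
    return count


def keep_ratio(epoch, total_epochs, start=0.5):
    """Assumed linear schedule: fraction of frames kept grows from `start` to 1.0."""
    if total_epochs <= 1:
        return 1.0
    progress = min(max(epoch / (total_epochs - 1), 0.0), 1.0)
    return start + (1.0 - start) * progress


def select_easy_frames(frame_losses, ratio):
    """Return the indices (in temporal order) of the lowest-loss frames.

    Assumption: a low loss marks an "easy" frame. Frames near action
    boundaries would tend to have higher loss and enter training later.
    """
    n = len(frame_losses)
    if n == 0:
        return []
    k = max(1, math.ceil(ratio * n))
    order = sorted(range(n), key=lambda i: frame_losses[i])
    return sorted(order[:k])


# Toy data: six frames, three action instances (cut, peel, cut).
labels = ["cut", "cut", "peel", "peel", "peel", "cut"]
losses = [0.1, 0.9, 0.2, 1.5, 0.3, 0.05]

num_instances = count_instances(labels)
early = select_easy_frames(losses, keep_ratio(0, 5))  # first epoch: half the frames
late = select_easy_frames(losses, keep_ratio(4, 5))   # last epoch: all frames
```

In this toy run the first epoch trains on the three lowest-loss frames only, and the last epoch trains on all six. In a real pipeline the losses would come from the segmentation model at each epoch, so the selected set changes as training progresses.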
Source journal
Engineering Applications of Artificial Intelligence (Engineering Technology, Engineering: Electrical & Electronic)
CiteScore: 9.60
Self-citation rate: 10.00%
Articles published: 505
Review time: 68 days
About the journal: Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.