EIKA: Explicit & Implicit Knowledge-Augmented Network for entity-aware sports video captioning

Expert Systems with Applications (IF 7.5; Zone 1, Computer Science; JCR Q1, Computer Science, Artificial Intelligence). Pub Date: 2025-05-15. Epub Date: 2025-02-22. DOI: 10.1016/j.eswa.2025.126906

Zeyu Xi, Ge Shi, Haoying Sun, Bowen Zhang, Shuyi Li, Lifang Wu

Expert Systems with Applications, volume 274, Article 126906. Journal article; not open access. https://www.sciencedirect.com/science/article/pii/S0957417425005287

Citations: 0

Abstract

Sports video captioning in real application scenarios requires both entities and specific scenes. However, it is difficult to extract this fine-grained information from the video content alone. This paper introduces an Explicit & Implicit Knowledge-Augmented Network for Entity-Aware Sports Video Captioning (EIKA), which leverages both explicit game-related knowledge (i.e., the set of involved player entities) and implicit visual scene knowledge extracted from the training set. The Entity-Video Interaction Module (EVIM) and the Video-Knowledge Interaction Module (VKIM) enhance the extraction of entity-related and scene-specific video features, respectively. The Spatial-Temporal Modeling Module (STMM) encodes the spatiotemporal information in the video. Finally, the Scene-To-Entity (STE) decoder uses both kinds of knowledge to generate informative captions with a distributed decoding approach. Extensive evaluations on the VC-NBA-2022, Goal and NSVA datasets demonstrate that the method achieves leading performance compared with existing methods.
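The abstract does not say how EVIM lets entity and video features interact. One common way to make one feature set attend to another is scaled dot-product cross-attention, and the sketch below illustrates that general idea only. It is not the authors' implementation: the function names, the two-dimensional toy vectors, and the choice of cross-attention itself are assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention on plain Python lists.

    queries: d-dim vectors (hypothetically, per-frame video features)
    keys, values: d-dim vectors (hypothetically, player-entity embeddings)
    Returns one attended vector per query.
    """
    d = len(keys[0])
    attended = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return attended


# Toy data (made up): two frame features and two entity embeddings.
frame_feats = [[1.0, 0.0], [0.0, 1.0]]
entity_embs = [[4.0, 0.0], [0.0, 4.0]]
entity_aware = cross_attention(frame_feats, entity_embs, entity_embs)
```

In this toy setup each frame feature is pulled toward the entity embedding it most resembles, which is the kind of entity-conditioned video feature the abstract describes at a high level.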




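Source journal: Expert Systems with Applications (Engineering & Technology; Engineering: Electrical & Electronic)

CiteScore: 13.80

Self-citation rate: 10.60%

Articles published: 2045

Review time: 8.7 months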

About the journal: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.