Lighthouse: A User-Friendly Library for Reproducible Video Moment Retrieval and Highlight Detection
Authors: Taichi Nishimura, Shota Nakada, Hokuto Munakata, Tatsuya Komatsu
arXiv - CS - Multimedia, 2024-08-06
DOI: arxiv-2408.02901
Citations: 0
Abstract
We propose Lighthouse, a user-friendly library for reproducible video moment
retrieval and highlight detection (MR-HD). Although researchers have proposed
various MR-HD approaches, the research community faces two main issues. The
first is a lack of comprehensive and reproducible experiments across various
methods, datasets, and video-text features. This is because no unified training
and evaluation codebase covers multiple settings. The second is user-unfriendly
design. Because previous works depend on different libraries, researchers must
set up a separate environment for each. In addition, most works release only
training code, requiring users to implement the whole MR-HD inference process
themselves.
Lighthouse addresses these issues by implementing a unified reproducible
codebase that includes six models, three features, and five datasets. In
addition, it provides an inference API and web demo to make these methods
easily accessible to researchers and developers. Our experiments demonstrate
that Lighthouse generally reproduces the scores reported in the reference
papers. The code is available at https://github.com/line/lighthouse.
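
To give a sense of the scale of the unified setting described above, the following is a minimal sketch, not Lighthouse's actual code or API. It enumerates a grid of six models, three features, and five datasets. The abstract gives only the counts, so all names here (`model_1`, `ExperimentConfig`, `build_grid`, and so on) are hypothetical placeholders. The sketch also assumes every combination is valid, which the library may not support.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import product

# Placeholder names only: the abstract states the counts (six models, three
# features, five datasets) but not the names, so none are filled in here.
MODELS = [f"model_{i}" for i in range(1, 7)]
FEATURES = [f"feature_{i}" for i in range(1, 4)]
DATASETS = [f"dataset_{i}" for i in range(1, 6)]


@dataclass(frozen=True)
class ExperimentConfig:
    """One training/evaluation setting in the hypothetical grid."""

    model: str
    feature: str
    dataset: str

    @property
    def run_name(self) -> str:
        # A single naming scheme keeps results comparable across settings.
        return f"{self.model}-{self.feature}-{self.dataset}"


def build_grid() -> list[ExperimentConfig]:
    """Enumerate every model x feature x dataset combination."""
    return [
        ExperimentConfig(model, feature, dataset)
        for model, feature, dataset in product(MODELS, FEATURES, DATASETS)
    ]


if __name__ == "__main__":
    grid = build_grid()
    print(len(grid))         # number of settings in the full grid
    print(grid[0].run_name)  # name of the first setting
```

With one shared configuration type, a single training and evaluation entry point can cover every setting. Otherwise each method needs its own environment and scripts.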