
MULTIMEDIA '04: Latest Publications

A robust on-the-fly pitch (OTFP) estimation algorithm
Pub Date: 2004-10-10 DOI: 10.1145/1027527.1027591
S. Sood, A. Krishnamurthy
Pitch detection, or fundamental frequency (f0) estimation, is a classical research topic and has been studied extensively for many years. Pitch estimation by embedding the speech signal into multiple state-space dimensions is a relatively recent technique. The YIN pitch detection algorithm [1] has also been cited recently as an improvement over other standard pitch estimation algorithms. In this paper, an attempt is made to present a unifying view of some of these existing and seemingly disparate techniques. The unified view enables robust formulations of some existing definitions and also helps to interpret the limitations of the classical approaches in use. The application of the idea to robust On-the-Fly pitch (OTFP) detection is demonstrated, and comparison with the robust YIN pitch detector has yielded encouraging results. The On-the-Fly setting imposes the constraint that pitch or aperiodicity estimates from past or future speech frames may not be used in a post-processing stage, and OTFP outperforms the YIN estimator under this constraint.
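The YIN detector cited as the baseline [1] is well documented, so a minimal sketch of its core step, the cumulative mean normalized difference function, is a useful reference point here. This is the baseline, not the authors' OTFP formulation (which the abstract does not detail), and the frame length, search range, and threshold below are illustrative assumptions.

```python
import numpy as np

def yin_f0(frame, sr, fmin=60.0, fmax=500.0, threshold=0.1):
    """One-frame f0 estimate via YIN's cumulative mean normalized
    difference function; frame must be longer than 2 * sr / fmin."""
    tau_max = int(sr / fmin)                  # longest lag searched
    tau_min = int(sr / fmax)                  # shortest lag searched
    # Squared difference function d(tau) for lags 1..tau_max
    d = np.array([np.sum((frame[:len(frame) - tau] - frame[tau:]) ** 2)
                  for tau in range(1, tau_max + 1)])
    # d'(tau) = d(tau) * tau / sum_{j<=tau} d(j)
    d_prime = d * np.arange(1, tau_max + 1) / np.maximum(np.cumsum(d), 1e-12)
    # First lag below the threshold wins; fall back to the global minimum
    below = np.where(d_prime[tau_min:] < threshold)[0]
    idx = int(below[0]) + tau_min if below.size else int(np.argmin(d_prime[tau_min:])) + tau_min
    return sr / (idx + 1)                     # index idx corresponds to lag idx + 1
```

Because each frame is scored independently, this form already satisfies the on-the-fly constraint the abstract describes: no estimates from past or future frames are revisited in a post-processing stage.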
{"title":"A robust on-the-fly pitch (OTFP) estimation algorithm","authors":"S. Sood, A. Krishnamurthy","doi":"10.1145/1027527.1027591","DOIUrl":"https://doi.org/10.1145/1027527.1027591","url":null,"abstract":"Pitch detection or fundamental frequency (f<inf>0</inf>) estimation is a classical research topic and has been extensively studied for many years. Pitch estimation by embedding speech signal into multiple state-space dimensions is a relatively recent technique. Also YIN pitch detection algorithm [1] has been cited recently as an improvement over other standard pitch estimation algorithms. In this paper an attempt is made to present a unifying view on some of these existing and seemingly disparate techniques. The unified view enables the development of robust formulations of some existing definitions and also helps to interpret the limitations of the classical/existing approaches in use. Application of the idea for a robust On-the-Fly pitch (OTFP) detection is demonstrated and comparison with robust YIN pitch detector has yielded encouraging results. The On-The-Fly imposes a constraint that pitch or aperiodicity estimates from past or future speech frames are not to be used at a post processing stage and OTFP outperforms the YIN estimator with this constraint.","PeriodicalId":292207,"journal":{"name":"MULTIMEDIA '04","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122670152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Affinity relation discovery in image database clustering and content-based retrieval
Pub Date: 2004-10-10 DOI: 10.1145/1027527.1027614
M. Shyu, Shu‐Ching Chen, Min Chen, Chengcui Zhang
In this paper, we propose a unified framework, called Markov Model Mediator (MMM), to facilitate image database clustering and to improve query performance. The MMM framework has two hierarchical levels: local MMMs and integrated MMMs, which model the affinity relations among the images within a single image database and within a set of image databases, respectively, via an effective data mining process. The effectiveness and efficiency of the MMM framework for database clustering and image retrieval are demonstrated on a set of image databases containing varying numbers of images with different dimensions and concept categories.
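The abstract names the affinity relations but not their formula. Purely as an illustration of how usage statistics could be turned into a Markov-style model, the sketch below row-normalizes a query-to-image co-access matrix into a stochastic affinity matrix; the `usage` matrix and the whole construction are assumptions for exposition, not the paper's definition.

```python
import numpy as np

def affinity_matrix(usage):
    """usage[q, i] = 1 if query q accessed image i (hypothetical input).
    Co-access counts give pairwise affinity; row normalization makes
    each row a probability distribution, i.e. a Markov-style matrix."""
    co = (usage.T @ usage).astype(float)   # how often two images are used together
    np.fill_diagonal(co, 0.0)              # ignore self-affinity
    row_sums = co.sum(axis=1, keepdims=True)
    return np.divide(co, row_sums, out=np.zeros_like(co), where=row_sums > 0)

usage = np.array([[1, 1, 0, 0],            # query 1 touched images 0 and 1
                  [1, 1, 1, 0],            # query 2 touched images 0, 1, 2
                  [0, 0, 1, 1]])           # query 3 touched images 2 and 3
A = affinity_matrix(usage)                 # each nonzero row sums to 1
```

Clustering could then group images whose rows of `A` concentrate probability on one another, which is one plausible reading of the two-level local/integrated structure.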
{"title":"Affinity relation discovery in image database clustering and content-based retrieval","authors":"M. Shyu, Shu‐Ching Chen, Min Chen, Chengcui Zhang","doi":"10.1145/1027527.1027614","DOIUrl":"https://doi.org/10.1145/1027527.1027614","url":null,"abstract":"In this paper, we propose a unified framework, called <i>Markov Model Mediator</i> (MMM), to facilitate image database clustering and to improve the query performance. The structure of the MMM framework consists of two hierarchical levels: local MMMs and integrated MMMs, which model the affinity relations among the images within a single image database and within a set of image databases, respectively, via an effective data mining process. The effectiveness and efficiency of the MMM framework for database clustering and image retrieval are demonstrated over a set of image databases which contain various numbers of images with different dimensions and concept categories.","PeriodicalId":292207,"journal":{"name":"MULTIMEDIA '04","volume":"129 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123291917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 18
A multiple watermarking algorithm based on CDMA technique
Pub Date: 2004-10-10 DOI: 10.1145/1027527.1027629
F. Zou, Zhengding Lu, H. Ling
This paper proposes a multiple watermarking algorithm based on the code division multiple access (CDMA) technique. Before the watermark is embedded, each user uses his private key as a seed to generate an address code that follows a pseudorandom noise distribution. Each watermark is modulated into a carrier signal with its corresponding address code, and these carrier signals are then added to the host media (e.g. image, video, and audio). During watermark detection, using the same address code, each user extracts his watermark from the detected media by calculating the correlation coefficient between the address code and the watermarked vector of the detected media. Each user can embed and extract his watermark independently of the others. The experimental results show that this scheme can embed and extract watermarks independently for each user and is capable of multiple watermarking.
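The steps above map directly onto classical spread-spectrum practice, so a compact sketch is easy to give. The parameter choices (`alpha`, code length, one contiguous chip block per bit) are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def address_code(key, length):
    rng = np.random.default_rng(key)        # the private key seeds the PN generator
    return rng.choice([-1.0, 1.0], size=length)

def embed(host, bits, key, alpha=0.1):
    code = address_code(key, len(host))
    chip = len(host) // len(bits)           # host samples spent per watermark bit
    carrier = np.concatenate([b * code[i * chip:(i + 1) * chip]
                              for i, b in enumerate(bits)])
    marked = host.copy()
    marked[:len(carrier)] += alpha * carrier   # add the modulated carrier signal
    return marked

def detect(received, n_bits, key):
    code = address_code(key, len(received))
    chip = len(received) // n_bits
    # The sign of the correlation with the user's own code recovers each bit;
    # other users' independently seeded codes correlate to roughly zero.
    return [int(np.sign(np.dot(received[i * chip:(i + 1) * chip],
                               code[i * chip:(i + 1) * chip])))
            for i in range(n_bits)]

host = np.random.default_rng(0).normal(size=4096)   # stand-in for media samples
marked = embed(host, [1, -1, 1, 1], key=42)
print(detect(marked, 4, key=42))                    # recovers [1, -1, 1, 1]
```

The per-user independence falls out of the near-orthogonality of independently seeded codes, which is exactly the CDMA property the scheme leans on.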
{"title":"A multiple watermarking algorithm based on CDMA technique","authors":"F. Zou, Zhengding Lu, H. Ling","doi":"10.1145/1027527.1027629","DOIUrl":"https://doi.org/10.1145/1027527.1027629","url":null,"abstract":"This paper proposes a multiple watermarking algorithm based on code division multiple access (CDMA) technique. Before the watermark embedded, each user uses his private key as a seed to generate an address code which is subjected to pseudorandom noise distribution. Each watermark is modulated into a carrier signal with its corresponding address code. And then these carrier signals are added to host media (e.g. image, video and audio). During watermark detection, using the same address code, each user extracts his watermark from the detected media via calculating the correlation coefficient between address code and watermarked vector of the detected media. Each user can embed and extract his watermark independently, regardless of others. The experimental results show that this scheme can embed and extract watermarks independently for each user and is capable for multiple watermarking.","PeriodicalId":292207,"journal":{"name":"MULTIMEDIA '04","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131329139","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Semantic-aware automatic video editing
Pub Date: 2004-10-10 DOI: 10.1145/1027527.1027753
S. Bocconi
One of the challenges for multimedia applications is to provide user-tailored access to information encoded in different media. In particular, previous research has not yet fully explored how to automatically compose different video segments according to a communicative goal. We propose a rhetoric-based method to support the selection and automatic editing of user-requested content from video footage. The method is applied to the domain of video documentaries to create biased sequences about a user-selected subject.
{"title":"Semantic-aware automatic video editing","authors":"S. Bocconi","doi":"10.1145/1027527.1027753","DOIUrl":"https://doi.org/10.1145/1027527.1027753","url":null,"abstract":"One of the challenges of multimedia applications is to provide user-tailored access to information encoded in different media. Particularly, previous research has not yet fully explored how to automatically compose different video segments according to a communicative goal. We propose a rhetoric-based method to support the selection and automatic editing of user-requested content from video footage. The method is applied to the domain of video documentaries to create biased sequences about a user selected subject.","PeriodicalId":292207,"journal":{"name":"MULTIMEDIA '04","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125681697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
Between context-aware media capture and multimedia content analysis: where do we find the promised land?
Pub Date: 2004-10-10 DOI: 10.1145/1027527.1027727
Susanne CJ Boll, D. Bulterman, R. Jain, Tat-Seng Chua, R. Lienhart, L. Wilcox, Marc Davis, S. Venkatesh
Various issues related to multimedia information retrieval and media access are discussed. Feasible solutions for automatic signal-based analysis of media content are analyzed. The extent of user involvement in the content creation process is emphasized. The applications driving the creation and usage of context and metadata are also elaborated.
{"title":"Between context-aware media capture and multimedia content analysis: where do we find the promised land?","authors":"Susanne CJ Boll, D. Bulterman, R. Jain, Tat-Seng Chua, R. Lienhart, L. Wilcox, Marc Davis, S. Venkatesh","doi":"10.1145/1027527.1027727","DOIUrl":"https://doi.org/10.1145/1027527.1027727","url":null,"abstract":"Various issues related to the multimedia information retrieval and media access are discussed. The feasible solutions for automatic signal-based analysis of media content are analyzed. The extent of user involvement in the content creation process is emphasized. The applications driving the creation and usage of context and metadata are also elaborated.","PeriodicalId":292207,"journal":{"name":"MULTIMEDIA '04","volume":"113 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124694108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
BiReality: mutually-immersive telepresence
Pub Date: 2004-10-10 DOI: 10.1145/1027527.1027725
N. Jouppi, Subu Iyer, Stan Thomas, A. Mitchell
BiReality (a.k.a. Mutually-Immersive Telepresence) uses a teleoperated robotic surrogate to provide an immersive telepresence system for face-to-face interactions. Our goal is to recreate, to the greatest extent practical, for both the user and the people at the remote location, the sensory experience of the user actually being present at the remote location during face-to-face interactions. Our system provides a 360-degree surround immersive audio and visual experience for both the user and remote participants, and streams eight 704x480 MPEG-2 coded videos totaling 20 Mb/s. The system preserves gaze and eye contact, presents local and remote participants to each other at life size, and preserves the head height of the user at the remote location. Initial user experiences are presented.
{"title":"BiReality: mutually-immersive telepresence","authors":"N. Jouppi, Subu Iyer, Stan Thomas, A. Mitchell","doi":"10.1145/1027527.1027725","DOIUrl":"https://doi.org/10.1145/1027527.1027725","url":null,"abstract":"BiReality (a.k.a. Mutually-Immersive Telepresence) uses a teleoperated robotic surrogate to provide an immersive telepresence system for face-to-face interactions. Our goal is to recreate to the greatest extent practical, both for the user and the people at the remote location, the sensory experience relevant for face-to-face interactions of the user actually being in the remote location. Our system provides a 360-degree surround immersive audio and visual experience for both the user and remote participants, and streams eight 704x480 MPEG-2 coded videos totaling 20Mb/s. The system preserves gaze and eye contact, presents local and remote participants to each other at life size, and preserves the head height of the user at the remote location. Initial user experiences are presented.","PeriodicalId":292207,"journal":{"name":"MULTIMEDIA '04","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116446331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 26
Automatic replay generation for soccer video broadcasting
Pub Date: 2004-10-10 DOI: 10.1145/1027527.1027535
Jinjun Wang, Changsheng Xu, Chng Eng Siong, K. Wan, Q. Tian
While most current approaches to sports video analysis are based on broadcast video, in this paper we present a novel approach to highlight detection and automatic replay generation for soccer videos taken by the main camera. This research is important because current soccer highlight detection and replay generation from a live game is a labor-intensive process. A robust multi-level, multi-model event detection framework is proposed to detect events and event boundaries in the video taken by the main camera; a toy sketch of this layering follows the abstract. The framework explores the available analysis cues, using a mid-level representation to bridge the gap between low-level features and high-level events. The event detection results and the mid-level representation are used to generate replays, which are automatically inserted into the video. Experimental results are promising, and the generated replays are found to be comparable with those produced by broadcast professionals.
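The abstract fixes the architecture (low-level features, a mid-level representation, high-level events) but not its contents. The toy sketch below is only meant to make that layering concrete; the features, labels, and the event rule are hypothetical, not the paper's.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    grass_ratio: float   # hypothetical low-level feature: grass-colored pixel fraction
    motion: float        # hypothetical low-level feature: global motion magnitude

def mid_level_label(f: Frame) -> str:
    """Map low-level features to a coarse, semantically meaningful label."""
    if f.grass_ratio > 0.6 and f.motion > 0.5:
        return "fast_play"
    if f.grass_ratio < 0.2:
        return "close_up"
    return "normal_play"

def detect_event(labels: list[str]) -> bool:
    """Toy high-level rule: a burst of fast play followed by a close-up
    often brackets a highlight worth replaying."""
    seq = "".join("F" if l == "fast_play" else "C" if l == "close_up" else "."
                  for l in labels)
    i = seq.find("FFF")
    return i >= 0 and "C" in seq[i:]

frames = [Frame(0.8, 0.9), Frame(0.7, 0.8), Frame(0.9, 0.7), Frame(0.1, 0.2)]
print(detect_event([mid_level_label(f) for f in frames]))   # True
```

The value of the mid-level layer is visible even in the toy: the event rule is written over a handful of readable labels rather than raw pixel statistics.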
{"title":"Automatic replay generation for soccer video broadcasting","authors":"Jinjun Wang, Changsheng Xu, Chng Eng Siong, K. Wan, Q. Tian","doi":"10.1145/1027527.1027535","DOIUrl":"https://doi.org/10.1145/1027527.1027535","url":null,"abstract":"While most current approaches for sports video analysis are based on broadcast video, in this paper, we present a novel approach for highlight detection and automatic replay generation for soccer videos taken by the main camera. This research is important as current soccer highlight detection and replay generation from a live game is a labor-intensive process. A robust multi-level, multi-model event detection framework is proposed to detect the event and event boundaries from the video taken by the main camera. This framework explores the possible analysis cues, using a mid-level representation to bridge the gap between low-level features and high-level events. The event detection results and mid-level representation are used to generate replays which are automatically inserted into the video. Experimental results are promising and found to be comparable with those generated by broadcast professionals.","PeriodicalId":292207,"journal":{"name":"MULTIMEDIA '04","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130211149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 70
3D reconstruction and enrichment of broadcast soccer video
Pub Date: 2004-10-10 DOI: 10.1145/1027527.1027586
Xinguo Yu, Xin Yan, Tze Sen Hay, H. Leong
Reconstructing sports video for various purposes has recently become a new trend. This paper presents a 3D reconstruction and enrichment system that not only reconstructs broadcast soccer video but also enriches the reconstructed video with music and illustrations of the video contents. The system can reconstruct not only the goalmouth scene but also the midfield scene, which existing systems cannot reconstruct. To quickly find the feature points for calibrating the camera, we propose a fast algorithm to detect the lines in the goalmouth scene and use the algorithm proposed in our previous papers to detect the partial ellipses in the midfield scene. The reconstruction is conducted on several video sequences of the two scenes. The reconstructed videos eliminate ball deformation and unnecessary camera changes by smoothing the camera parameters. This system also serves as an experimental testbed for our project to reconstruct an ongoing soccer game in real time.
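The fast goalmouth line detector is the paper's contribution and is not reproduced in the abstract; as a stand-in, the sketch below shows the standard Canny-plus-probabilistic-Hough pipeline that field-line detection for camera calibration commonly builds on. All thresholds are illustrative.

```python
import cv2
import numpy as np

def field_lines(bgr_frame):
    """Detect straight line segments (e.g. goalmouth lines) in one frame."""
    gray = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    # Probabilistic Hough transform returns segments as endpoint pairs
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                            minLineLength=60, maxLineGap=10)
    return [] if lines is None else [tuple(l[0]) for l in lines]  # (x1, y1, x2, y2)
```

Intersections of the detected segments then supply the 2D-3D point correspondences needed to calibrate the main camera against the known field geometry.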
{"title":"3D reconstruction and enrichment of broadcast soccer video","authors":"Xinguo Yu, Xin Yan, Tze Sen Hay, H. Leong","doi":"10.1145/1027527.1027586","DOIUrl":"https://doi.org/10.1145/1027527.1027586","url":null,"abstract":"Recently, it has become a new trend to reconstruct sports video for various purposes. This paper presents a 3D reconstruction and enrichment system that not only reconstructs broadcast soccer video but also enriches reconstructed video with music and illustrations of the video contents. The system can reconstruct not only the goalmouth scene but also the midfield scene, which cannot be reconstructed by the existing systems. To quickly find the feature points for calibrating the camera, we propose a fast algorithm to detect the lines in the goalmouth scene and use the algorithm proposed in our previous papers to detect the partial ellipses in the midfield scene. The reconstruction is conducted on several video sequences of two scenes. The reconstructed videos eliminate the ball deformation and unnecessary camera changes through smoothing the camera parameters. This system also serves as an experimental system for our project that reconstructs the on-going soccer game in real time.","PeriodicalId":292207,"journal":{"name":"MULTIMEDIA '04","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125705995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 25
N.A.G.: network auralization for Gnutella
Pub Date: 2004-10-10 DOI: 10.1145/1027527.1027567
Jason Freeman
N.A.G. (Network Auralization for Gnutella) is interactive software art designed to actively involve a lay public without musical training in a creative musical experience. Users enter search keywords, and the software looks for matching music files on the Gnutella peer-to-peer file-sharing network. As it downloads music, it plays an audio collage whose structure is based on the relative download rates of the files.
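The abstract says the collage's structure follows the files' relative download rates without saying how. One simple reading, sketched below purely as an assumption, is to weight each in-progress file's loudness by its share of total bandwidth.

```python
def mix_gains(download_rates_kbps):
    """Hypothetical mixing rule: each file's gain is its share of bandwidth."""
    total = sum(download_rates_kbps)
    if total == 0:
        return [0.0] * len(download_rates_kbps)
    return [rate / total for rate in download_rates_kbps]

print(mix_gains([120, 40, 40]))   # [0.6, 0.2, 0.2] -- the fastest file dominates
```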
{"title":"N.A.G.: network auralization for Gnutella","authors":"Jason Freeman","doi":"10.1145/1027527.1027567","DOIUrl":"https://doi.org/10.1145/1027527.1027567","url":null,"abstract":"N.A.G. (Network Auralization for Gnutella) is interactive software art designed to actively involve a lay public without musical training in a creative musical experience. Users enter search keywords, and the software looks for matching music files on the Gnutella peer-to-peer file-sharing network. As it downloads music, it plays an audio collage whose structure is based on the relative download rates of the files.","PeriodicalId":292207,"journal":{"name":"MULTIMEDIA '04","volume":"128 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131639353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Towards auto-documentary: tracking the evolution of news stories
Pub Date: 2004-10-10 DOI: 10.1145/1027527.1027719
P. D. Sahin, Jia-Yu Pan, D. Forsyth
News videos constitute an important source of information for tracking and documenting important events. In these videos, news stories are often accompanied by short video shots that tend to be repeated during the course of the event. Automatic detection of such repetitions is essential for creating auto-documentaries and for alleviating the limitations of traditional textual topic detection methods. In this paper, we propose novel methods for detecting and tracking the evolution of news over time. The proposed method exploits both visual cues and textual information to summarize evolving news stories. Experiments are carried out on the TREC-VID data set, consisting of 120 hours of news videos from two different channels.
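The paper's actual visual-cue method is not described in the abstract, so the sketch below uses a common stand-in for detecting repeated shots: color-histogram intersection between shot keyframes, with illustrative bin counts and threshold.

```python
import numpy as np

def color_histogram(keyframe_rgb, bins=8):
    """Normalized 3D color histogram of an H x W x 3 uint8 keyframe."""
    h, _ = np.histogramdd(keyframe_rgb.reshape(-1, 3),
                          bins=(bins,) * 3, range=[(0, 256)] * 3)
    return h.ravel() / h.sum()

def is_repeated_shot(kf_a, kf_b, threshold=0.9):
    """Histogram intersection close to 1 suggests the same footage reused."""
    ha, hb = color_histogram(kf_a), color_histogram(kf_b)
    return float(np.minimum(ha, hb).sum()) >= threshold
```

Linking repeated shots across broadcasts, together with the textual stream, is what lets a story's evolution be tracked over the 120-hour corpus.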
{"title":"Towards auto-documentary: tracking the evolution of news stories","authors":"P. D. Sahin, Jia-Yu Pan, D. Forsyth","doi":"10.1145/1027527.1027719","DOIUrl":"https://doi.org/10.1145/1027527.1027719","url":null,"abstract":"News videos constitute an important source of information for tracking and documenting important events. In these videos, news stories are often accompanied by short video shots that tend to be repeated during the course of the event. Automatic detection of such repetitions is essential for creating auto-documentaries, for alleviating the limitation of traditional textual topic detection methods. In this paper, we propose novel methods for detecting and tracking the evolution of news over time. The proposed method exploits both visual cues and textual information to summarize evolving news stories. Experiments are carried on the TREC-VID data set consisting of 120 hours of news videos from two different channels.","PeriodicalId":292207,"journal":{"name":"MULTIMEDIA '04","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132852728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 74