
2006 IEEE International Conference on Multimedia and Expo: Latest Papers

A System for Automatic Judgment of Offsides in Soccer Games
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262924
Sadatsugu Hashimoto, S. Ozawa
In this paper, we propose a system for the automatic judgment of offsides in soccer games. We detect and track players in images from multiple fixed cameras and calculate their world coordinates. We then analyze team formations by classifying uniforms and calculate the position of the offside line. In parallel, we calculate the ball's 3D coordinates and trajectory in world coordinates from its image-plane coordinates in the multiple cameras, and recognize the moment of a play from that 3D trajectory. In addition, we judge whether a player is interfering with play by analyzing the spatial relationship between the ball and the players. Finally, we make an offside judgment by integrating these results. We apply the system to a real soccer match and demonstrate its effectiveness with experimental results.
Citations: 22
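The offside-line step in the abstract above can be illustrated with a small sketch. It is not the authors' implementation. It assumes player positions are already available as world coordinates along the pitch length, and it applies only the textbook geometric condition: the offside line passes through the second-last defender, and an attacker is in an offside position when nearer to the goal line than both that line and the ball. The function names and the one-dimensional simplification are hypothetical, and the halfway-line and interfering-with-play checks are left out.

```python
from typing import List


def offside_line_x(defender_xs: List[float], goal_line_x: float) -> float:
    """Return the x coordinate of the second-last defender (goalkeeper included)."""
    if len(defender_xs) < 2:
        raise ValueError("need at least two defending players")
    # Sort defenders by distance to the goal line they defend, nearest first.
    by_distance = sorted(defender_xs, key=lambda x: abs(x - goal_line_x))
    return by_distance[1]


def is_in_offside_position(attacker_x: float, ball_x: float,
                           defender_xs: List[float], goal_line_x: float) -> bool:
    """Geometric check only: nearer to the goal line than both the
    second-last defender and the ball (hypothetical simplification)."""
    line_x = offside_line_x(defender_xs, goal_line_x)
    attacker_dist = abs(attacker_x - goal_line_x)
    return (attacker_dist < abs(line_x - goal_line_x)
            and attacker_dist < abs(ball_x - goal_line_x))


# Example: goalkeeper at 0.5 m, three outfield defenders, goal line at x = 0.
defenders = [0.5, 12.0, 20.0, 25.0]
print(offside_line_x(defenders, 0.0))
print(is_in_offside_position(10.0, 30.0, defenders, 0.0))
```

A full system would evaluate this check only at the moment the ball is played, which the abstract says is recognized from the ball's 3D trajectory.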
PING: a Group-to-Individual Distributed Meeting System
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262737
Y. Rui, Eric Rudolph, Li-wei He, Rico Malvar, Michael F. Cohen, I. Tashev
Group-to-individual (G2I) distributed meetings are an important but understudied area. Because of the asymmetry between the parties in G2I meetings, they pose two unique challenges: 1) the remote participant tends to be ignored by the local participants; and 2) the remote participant has an inferior audio, video, and data experience compared with the local participants. To address these issues, we present PING, a system explicitly designed for G2I distributed meetings that combines recent advances in both hardware (e.g., microphone arrays and remote-person stand-in devices) and software (e.g., audio-video processing) to improve users' G2I meeting experience. We report how PING addresses these two challenges and describe its system design and implementation.
Citations: 1
Using Implicit Relevance Feedback to Advance Web Image Search
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262895
En Cheng, Feng Jing, Mingjing Li, Wei-Ying Ma, Hai Jin
Although relevance feedback has been studied extensively in academic work on content-based image retrieval, no commercial Web image search engine has employed the idea. Web image search engines face several obstacles in applying relevance feedback. To overcome these obstacles, we propose an efficient implicit relevance feedback mechanism. The proposed mechanism has advantages over traditional relevance feedback methods in three respects. First, instead of forcing users to make explicit judgments on the results, our method treats the user's click-through data as implicit relevance feedback, which relieves users of that burden. Second, a hierarchical clustering algorithm for image search results is proposed to organize the results semantically. Using the clustering results as features, our relevance feedback scheme can capture and reflect users' search intentions precisely. Last, unlike traditional relevance feedback user interfaces, which forcibly substitute subsequent results for previous ones, our method uses friendly recommendation rather than substitution to let the user narrow down to the refined images. Comprehensive user studies were performed to evaluate the implicit relevance feedback mechanism.
Citations: 17
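The idea of treating clicks as implicit feedback over clustered results, described in the abstract above, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm. It assumes each result image already carries a cluster label (the paper's hierarchical clustering is not reproduced), counts each click as one positive vote for its cluster, and returns unclicked images from clicked clusters as recommendations alongside the original results. The function name `recommend_from_clicks` and the voting rule are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def recommend_from_clicks(cluster_of: Dict[str, str],
                          clicked: List[str],
                          top_n: int = 3) -> List[str]:
    """Recommend unclicked images from the clusters the user clicked in."""
    # Each click counts as one implicit positive vote for the image's cluster.
    votes = Counter(cluster_of[image] for image in clicked)
    seen = set(clicked)
    # Keep unclicked images whose cluster received at least one vote,
    # in the original result order (dicts preserve insertion order).
    candidates = [image for image in cluster_of
                  if image not in seen and votes[cluster_of[image]] > 0]
    # Stable sort: clusters with more votes come first.
    candidates.sort(key=lambda image: -votes[cluster_of[image]])
    return candidates[:top_n]


results = {"a1": "animal", "c1": "car", "a2": "animal",
           "c2": "car", "a3": "animal", "f1": "fruit"}
print(recommend_from_clicks(results, clicked=["a1", "c1", "a2"]))
```

Because the recommendations are returned as a separate list, the original result page stays in place, which mirrors the abstract's preference for recommendation over substitution.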
DMB (Digital Multimedia Broadcasting) Voice EPG Application
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262781
Bong-Ho Lee, S. Park, Heejeong Kim, C. Ahn, S. Lee
Mobile TV is becoming a mainstream service in mobile broadcasting, which demands robust transmission, high performance, and an easy-to-use interface, among other mobile factors. As mobile broadcasting services gain prominence, easy interfaces and appropriate applications are increasingly in demand. In most mobile scenarios, where users may share their attention with other concurrent tasks and where highly integrated devices may have very limited physical characteristics, intuitive man-machine interfaces are key to successful applications. The electronic program guide (EPG), an essential application in most digital broadcasting systems, is subject to the same requirements for an easy interface and mobility. Conventional GUI-driven EPG solutions are sometimes not appropriate for mobile systems targeting mobile TV and rich interactive data services. In this paper we present a voice-enabled EPG application that features voice user interaction and dialog technology, allowing the user to interact by speech with the terminal when navigating and searching for any program or service. We illustrate an overall service framework that addresses a content delivery and consumption architecture suited to the DMB environment. Moreover, we propose and implement an agent platform by profiling the elements of VoiceXML and extending them with EPG-related elements to enable the EPG functionalities.
Citations: 0
Path-Diversity Overlay Retransmission Architecture for Reliable Multicast
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262872
W. Zeng, Yingnan Zhu, Haibin Lu, Hongbing Jiang
IP multicast is a bandwidth-efficient transmission mechanism for group communications. Reliability in IP multicast, however, poses significant challenges. To address the reliability and scalability issues in IP multicast, this paper proposes a novel overlay retransmission architecture that exploits path diversity by taking advantage of both IP multicast and an overlay network. We show that the proposed path-diversity overlay retransmission architecture has the potential to significantly improve the reliability, delay, playback quality, and scalability of IP-multicast-based multimedia applications. The general concept illustrated in this paper, using P2P overlay networks to help improve the QoS performance of multimedia applications, is expected to have a significant impact on the deployment of next-generation multimedia services.
Citations: 4
Prefilter Control Scheme for Low bitrate TV Distribution
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262952
Ryoichi Kawada, A. Koike, Y. Nakajima
In IP-based TV distribution, coding degradation is sometimes evident in critical scenes because the compression bit rate is rather low. Prefiltering is an effective countermeasure because it replaces coding noise with a degradation that is harder to detect visually, although it has the drawback that excessive smoothing may occur. This paper proposes a scene-adaptive method for controlling a prefilter that is separate from the encoder. By calculating block-wise motion-compensated prediction error variances and correlation coefficients, the method estimates both the coding noise and the potential improvement from prefiltering for each frame, yielding a control scheme that applies prefiltering only when it is effective.
Citations: 7
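The block-wise statistic named in the prefilter abstract above can be made concrete with a short sketch. It computes the variance of the motion-compensated prediction error per block, assuming the motion-compensated prediction frame is already given. The decision rule in `should_prefilter`, which compares the mean block variance with a threshold, is a hypothetical stand-in. The paper also uses correlation coefficients and estimates the gain from prefiltering, and neither is modeled here.

```python
from statistics import pvariance
from typing import List

Frame = List[List[float]]  # luminance samples, indexed as frame[row][col]


def block_error_variances(current: Frame, predicted: Frame,
                          block: int = 2) -> List[float]:
    """Variance of (current - predicted) inside each non-overlapping block."""
    rows, cols = len(current), len(current[0])
    variances = []
    for r0 in range(0, rows - block + 1, block):
        for c0 in range(0, cols - block + 1, block):
            errors = [current[r][c] - predicted[r][c]
                      for r in range(r0, r0 + block)
                      for c in range(c0, c0 + block)]
            variances.append(pvariance(errors))
    return variances


def should_prefilter(current: Frame, predicted: Frame,
                     threshold: float, block: int = 2) -> bool:
    """Hypothetical rule: prefilter when the mean block variance is high."""
    variances = block_error_variances(current, predicted, block)
    if not variances:
        raise ValueError("frame is smaller than one block")
    return sum(variances) / len(variances) > threshold


current = [[2.0, 0.0, 5.0, 5.0],
           [0.0, 2.0, 5.0, 5.0]]
predicted = [[0.0] * 4 for _ in range(2)]
print(block_error_variances(current, predicted))
print(should_prefilter(current, predicted, threshold=0.25))
```

A high prediction-error variance suggests a scene that is hard to code at a low bit rate, which is where the abstract says prefiltering pays off.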
A Rank based Metric of Anchor Models for Speaker Verification
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262726
Yingchun Yang, Min Yang, Zhaohui Wu
In this paper, we present an improved anchor-model method for speaker verification. The anchor model represents a speaker by his or her relation to a set of other speakers, called anchor speakers. It was first introduced for speaker indexing in large audio databases. We suggest a rank-based metric for comparing speaker character vectors in the anchor model. Unlike conventional metrics, which weight each anchor speaker equally and compare the log-likelihood scores directly, our method exploits the relative order of the anchor speakers to characterize the target speaker. We conducted experiments on the YOHO database. The results show that the equal error rate (EER) of our method is 13.29% lower than that of the conventional metric. Our method is also more robust to mismatch between the test set and the anchor set.
Citations: 10
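The difference between comparing log-likelihood scores directly and comparing the order of anchor speakers, as described in the abstract above, can be shown with a short sketch. It converts each speaker character vector (one log-likelihood score per anchor model) into a rank vector and measures the distance between rank vectors. The distance used, the sum of absolute rank differences, is an assumption chosen for illustration. The abstract does not specify the paper's exact metric, and the function names are hypothetical.

```python
from typing import List


def rank_vector(scores: List[float]) -> List[int]:
    """Rank of each anchor speaker; rank 0 is the highest-scoring anchor."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for rank, anchor in enumerate(order):
        ranks[anchor] = rank
    return ranks


def rank_distance(scores_a: List[float], scores_b: List[float]) -> int:
    """Sum of absolute rank differences (assumed metric, for illustration)."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both vectors must use the same anchor set")
    ranks_a, ranks_b = rank_vector(scores_a), rank_vector(scores_b)
    return sum(abs(a - b) for a, b in zip(ranks_a, ranks_b))


enrolled = [-10.0, -30.0, -20.0, -40.0]   # log-likelihoods vs. 4 anchors
same_spk = [-55.0, -80.0, -70.0, -95.0]   # shifted scores, same ordering
impostor = [-40.0, -10.0, -30.0, -20.0]   # different ordering
print(rank_distance(enrolled, same_spk))
print(rank_distance(enrolled, impostor))
```

In this toy data the second utterance has uniformly lower scores, as might happen under a channel mismatch, yet its rank vector is unchanged. A direct score comparison would report a large difference where the rank comparison reports none.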
Recognizing Commercials in Real-Time using Three Visual Descriptors and a Decision-Tree
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262822
R. Glasberg, Cengiz Tas, T. Sikora
We present a new approach for classifying MPEG-2 video sequences as 'commercial' or 'non-commercial' by analyzing specific color, texture, and motion features of consecutive frames in real time. This is part of the well-known video genre classification problem, which studies popular TV broadcast genres such as cartoons, commercials, music, news, and sports. Such applications have also been discussed in the context of MPEG-7. In our method, the features extracted from three visual descriptors are logically combined using a decision tree to produce a reliable recognition. The results demonstrate a high identification rate on a large collection of 200 representative video sequences (40 'commercials' and 4 × 40 'non-commercials') gathered from free digital TV broadcasts in Germany.
Citations: 12
Methods for Non-Intrusive Delay Measurement for Audio Communication over Packet Networks
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262597
M. Zad-issa, Norbert Rossello, L. Pilati
Measuring delay is an important and common problem in communication over packet networks. The end-to-end and round-trip delays are among the factors that directly affect quality of service and user satisfaction. Multimedia gateways or base stations that perform echo cancellation or suppression often rely on the round-trip delay to enhance their performance or to reduce the computational complexity of the echo-processing logic. In this work, we present two non-intrusive methods for delay estimation and tracking. Both methods find the delay using the actual audio signal that is sent through the network. The first approach uses the signal's MDCT transform-domain coefficients, while the second operates in a perceptual domain. Experiments show that both schemes can track the end-to-end and round-trip delays under various network and signal conditions.
Citations: 1
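The abstract above describes estimating delay from the transmitted audio itself rather than from probe packets. The sketch below shows that general idea with plain time-domain cross-correlation. This is a swapped-in, simpler technique, not either of the paper's methods, which work on MDCT coefficients and in a perceptual domain. The function name and the `max_lag` search bound are assumptions.

```python
from typing import Sequence


def estimate_delay(sent: Sequence[float], received: Sequence[float],
                   max_lag: int) -> int:
    """Return the lag (in samples) that maximizes the cross-correlation
    between the sent signal and the received signal."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Compare sent[n] with received[n + lag] over the overlapping part.
        overlap = min(len(sent), len(received) - lag)
        if overlap <= 0:
            break
        score = sum(sent[n] * received[n + lag] for n in range(overlap))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


sent = [0.0, 1.0, 0.0, -1.0, 2.0, 0.0, 1.0, 0.0]
# The far end hears an attenuated copy that arrives three samples late.
received = [0.0] * 3 + [0.5 * s for s in sent]
print(estimate_delay(sent, received, max_lag=5))
```

Dividing the returned lag by the sampling rate gives the delay in seconds. Tracking a changing delay would mean repeating the search over successive windows of the signal.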
Semantic Labeling of Multimedia Content Clusters
Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262825
Jelena Tešić, John R. Smith
In this paper we present a novel approach for labeling clusters of multimedia content that leverages supervised classification techniques in conjunction with unsupervised clustering. Recent research has produced significant results for the automatic tagging of video content such as broadcast news. For example, powerful techniques have been demonstrated in the context of the NIST TRECVID video retrieval benchmark. However, the information needs of users typically span a range of semantic concepts. One of the challenges for these multimedia retrieval systems is to organize the video data in a way that allows the user to navigate the semantic space of the video data set most efficiently. One important tool for video data organization is clustering. However, clustering results cannot be leveraged effectively when they are not labeled. We propose to build on clustering by aggregating the automatically tagged semantics. We propose and compare four techniques for labeling the clusters and evaluate their performance against human-labeled ground truth. We present examples of the cluster labeling results obtained on the BBC stock shots from the TRECVID-2005 video data set.
Citations: 8
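The abstract above proposes labeling clusters by aggregating automatically tagged semantics. One simple aggregation is sketched here: average each concept detector's confidence over the shots in a cluster and keep the top-scoring concepts as the cluster label. The abstract mentions four labeling techniques without detailing them, so this mean-score rule, the function name, and the data layout are assumptions for illustration only.

```python
from collections import defaultdict
from typing import Dict, List


def label_clusters(cluster_of: Dict[str, int],
                   concept_scores: Dict[str, Dict[str, float]],
                   top_k: int = 2) -> Dict[int, List[str]]:
    """Label each cluster with its top_k concepts by mean detector score."""
    totals: Dict[int, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    sizes: Dict[int, int] = defaultdict(int)
    for shot, cluster in cluster_of.items():
        sizes[cluster] += 1
        for concept, score in concept_scores[shot].items():
            totals[cluster][concept] += score
    labels = {}
    for cluster, sums in totals.items():
        # Rank by mean score, highest first; break ties alphabetically.
        ranked = sorted(sums, key=lambda c: (-sums[c] / sizes[cluster], c))
        labels[cluster] = ranked[:top_k]
    return labels


clusters = {"s1": 0, "s2": 0, "s3": 1}
scores = {
    "s1": {"outdoor": 0.9, "building": 0.6, "face": 0.1},
    "s2": {"outdoor": 0.7, "building": 0.8, "face": 0.2},
    "s3": {"face": 0.9, "outdoor": 0.2, "building": 0.1},
}
print(label_clusters(clusters, scores))
```

The cluster assignments would come from an unsupervised clustering step and the scores from supervised concept classifiers, which is the combination the abstract describes.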