
Latest publications from MULTIMEDIA '04

Context for semantic metadata
Pub Date : 2004-10-10 DOI: 10.1145/1027527.1027574
K. Haase
This article argues for the growing importance of quality metadata and the equation of that quality with precision and semantic grounding. Such semantic grounding requires metadata that derives from intentional human intervention as well as mechanistic measurement of content media. In both cases, one chief problem in the automatic generation of semantic metadata is ambiguity, which leads to the overgeneration of inaccurate annotations. We look at a particular richly annotated image collection to show how context dramatically reduces the problem of ambiguity over this corpus. In particular, we consider both the abstract measurement of "contextual ambiguity" over the collection and the application of a particular disambiguation algorithm to synthesized keyword searches across the collection.
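The effect of context can be made concrete with a toy version of this kind of disambiguation: each ambiguous keyword has several candidate concepts, and the concept that shares the most related terms with the image's other annotations wins. The sense inventory and scoring rule below are hypothetical illustrations, not the paper's algorithm.

```python
# Hypothetical sense inventory: keyword -> {candidate concept: related terms}.
SENSES = {
    "bank": {"bank/river": {"water", "shore"}, "bank/finance": {"money", "vault"}},
    "water": {"water/liquid": {"river", "shore", "sea"}},
}

def disambiguate(keywords):
    """Pick one concept per keyword by overlap with the co-occurring keywords."""
    context = set(keywords)
    choices = {}
    for kw in keywords:
        candidates = SENSES.get(kw)
        if candidates:
            # Score each candidate by how many of its related terms also
            # annotate the same image; with no overlap, dict order decides.
            choices[kw] = max(candidates,
                              key=lambda c: len(candidates[c] & (context - {kw})))
    return choices

print(disambiguate(["bank", "water"]))  # {'bank': 'bank/river', 'water': 'water/liquid'}
```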
Citations: 51
Video transport over wireless networks
Pub Date : 2004-10-10 DOI: 10.1145/1027527.1027626
H. Garudadri, P. Sagetong, S. Nanda
In this paper, we propose an efficient scheme to transport video over wireless networks, specifically cdma2000® 1x. Speech transmission over cdma2000® uses a variable rate voice coder (vocoder) over a channel with multiple fixed rates. We apply these ideas to compressed video transmission over wireless IP networks. Explicit Bit Rate (EBR) video compression is designed to match the video encoder output to a set of fixed channel rates. We show that in comparison with VBR video transmission over a fixed rate wireless channel, EBR video transmission provides improved error resilience, reduced latency and improved efficiency.
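The EBR idea can be illustrated with a small sketch: instead of emitting an arbitrary bit rate, the encoder is constrained to one of a few fixed channel rates, much as a vocoder is. The rate values and selection rule below are illustrative assumptions, not cdma2000 parameters from the paper.

```python
CHANNEL_RATES = [9600, 4800, 2400, 1200]  # bits/s, illustrative values only

def ebr_rate(target_bps):
    """Pick the largest fixed channel rate not exceeding the target,
    falling back to the lowest rate when the target is below all of them."""
    feasible = [r for r in CHANNEL_RATES if r <= target_bps]
    return max(feasible) if feasible else min(CHANNEL_RATES)

# The encoder would then tune its quantizer so each segment fits the chosen rate.
for demand in (11000, 5000, 800):
    print(demand, "->", ebr_rate(demand))   # 9600, 4800, 1200
```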
Citations: 6
Parsing and browsing tools for colonoscopy videos
Pub Date : 2004-10-10 DOI: 10.1145/1027527.1027723
Yu Cao, Dalei Li, Wallapak Tavanapong, Jung-Hwan Oh, J. Wong, P. C. Groen
Colonoscopy is an important screening tool for colorectal cancer. During a colonoscopic procedure, a tiny video camera at the tip of the endoscope generates a video signal of the internal mucosa of the colon. The video data are displayed on a monitor for real-time analysis by the endoscopist. We call videos captured from colonoscopic procedures colonoscopy videos. Because these videos possess unique characteristics, new types of semantic units and parsing techniques are required. In this paper, we define new semantic units called operation shots, each of which is a segment of visual and audio data corresponding to a therapeutic or biopsy operation. We introduce a new spatio-temporal analysis technique to detect operation shots. Our experiments on colonoscopy videos demonstrate that the technique does not miss any meaningful operation shots and incurs only a small number of false operation shots. Our prototype parsing software implements the operation shot detection technique along with our other techniques previously developed for colonoscopy videos. Our browsing tool enables users to quickly locate operation shots of interest. The proposed technique and software are useful (1) for post-procedure reviews and analyses of causes of complications due to biopsy or therapeutic operations, (2) for developing an effective content-based retrieval system for colonoscopy videos to facilitate endoscopic research and education, and (3) for development of a systematic approach to assess endoscopists' procedural skills.
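The paper's spatio-temporal analysis is richer than anything that fits here, but the shot-level bookkeeping it feeds can be sketched as follows: assume a per-frame detector flags frames where an instrument is visible, then merge nearby positives into shots and drop very short runs. The detector, gap, and length parameters are hypothetical.

```python
def operation_shots(frame_flags, max_gap=15, min_len=30):
    """Group flagged frames into (start, end) operation shots.

    max_gap: tolerated run of negative frames inside one shot;
    min_len: minimum shot length in frames (shorter runs are discarded).
    """
    shots, start, last = [], None, None
    for i, flagged in enumerate(frame_flags):
        if not flagged:
            continue
        if start is None:
            start = i
        elif i - last > max_gap:              # gap too long: close the shot
            if last - start + 1 >= min_len:
                shots.append((start, last))
            start = i
        last = i
    if start is not None and last - start + 1 >= min_len:
        shots.append((start, last))
    return shots
```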
Citations: 37
Facial expression representation and recognition based on texture augmentation and topographic masking
Pub Date : 2004-10-10 DOI: 10.1145/1027527.1027580
L. Yin, J. Loi, Wei Xiong
The variation of facial texture and surface due to changes of expression is an important cue for analyzing and modeling facial expressions. In this paper, we propose a new approach to representing facial expressions using so-called topographic features. In order to capture the variation of facial surface structure, facial textures are processed at increased resolution. The topographic structure of the human face is analyzed based on the resolution-enhanced textures. We investigate the relationship between a facial expression and its topographic features, and propose to represent the expression by topographic labels. The detected topographic facial surface and the expressive regions reflect the state of facial skin movement. Based on the observation that facial texture and its topographic features change along with facial expressions, we compare the disparity of these features between the neutral face and the expressive face to distinguish a number of universal expressions. The experiments demonstrate the feasibility of the proposed approach for facial expression representation and recognition.
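Topographic labels of this kind are typically derived from the local curvature of the image intensity surface. The sketch below, which classifies each pixel by the eigenvalues of its Hessian, is one plausible reading of such labeling rather than the authors' exact scheme, and the threshold is arbitrary.

```python
import numpy as np

def topographic_labels(img, eps=1e-3):
    """Label pixels as ridge/valley/saddle/flat from surface curvature."""
    gy, gx = np.gradient(img.astype(float))
    gyy, gyx = np.gradient(gy)
    gxy, gxx = np.gradient(gx)
    labels = np.empty(img.shape, dtype=object)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            H = np.array([[gxx[i, j], gxy[i, j]],
                          [gyx[i, j], gyy[i, j]]])
            l1, l2 = np.linalg.eigvalsh(H)      # eigenvalues in ascending order
            if l2 < -eps:
                labels[i, j] = "ridge"          # curves down in both directions
            elif l1 > eps:
                labels[i, j] = "valley"         # curves up in both directions
            elif l1 < -eps and l2 > eps:
                labels[i, j] = "saddle"
            else:
                labels[i, j] = "flat"
    return labels
```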
Citations: 10
Location-aware projection with robust 3-D viewing point detection and fast image deformation
Pub Date : 2004-10-10 DOI: 10.1145/1027527.1027595
J. Shimamura, K. Arakawa
This paper describes a novel approach to the construction of a projector-based augmented reality environment. The approach is based on capturing the dynamic changes of surfaces and projecting the images within a large real environment using a system that includes a laser range finder and a projector, whose optical axes are integrated by mirrors. The proposed method offers two distinct advances: (1) robust 3-D viewing point detection from consecutive range images, and (2) fast view-driven image generation and presentation with view frustum clipping to measured surfaces. A prototype system confirms the feasibility of the method; it generates view-driven images suited to the user's viewing position, which are then projected within a dynamic real environment in real time.
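Frustum clipping itself is standard: a point survives only if it lies on the inner side of all six frustum planes. The sketch below is a generic half-space test, not the authors' implementation, and the example planes bound a unit box rather than a true frustum.

```python
import numpy as np

def inside_frustum(point, planes):
    """planes: list of (normal, d), with the inside defined by n·p + d >= 0."""
    p = np.asarray(point, dtype=float)
    return all(np.dot(n, p) + d >= 0 for n, d in planes)

# Illustrative planes bounding the unit cube from (0,0,0) to (1,1,1).
planes = [
    (np.array([1, 0, 0]), 0.0), (np.array([-1, 0, 0]), 1.0),
    (np.array([0, 1, 0]), 0.0), (np.array([0, -1, 0]), 1.0),
    (np.array([0, 0, 1]), 0.0), (np.array([0, 0, -1]), 1.0),
]
print(inside_frustum((0.5, 0.5, 0.5), planes))  # True
print(inside_frustum((1.5, 0.5, 0.5), planes))  # False
```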
Citations: 5
Speech, ink, and slides: the interaction of content channels
Pub Date : 2004-10-10 DOI: 10.1145/1027527.1027713
Richard J. Anderson, C. Hoyer, Craig Prince, Jonathan Su, F. Videon, S. Wolfman
In this paper, we report on an empirical exploration of digital ink and speech usage in lecture presentation. We studied the video archives of five Master's level Computer Science courses to understand how instructors use ink and speech together while lecturing, and to evaluate techniques for analyzing digital ink. Our interest in understanding how ink and speech are used together is to inform the development of future tools for supporting classroom presentation, distance education, and viewing of archived lectures. We want to make it easier to interact with electronic materials and to extract information from them. We want to provide an empirical basis for addressing challenging problems such as automatically generating full text transcripts of lectures, matching speaker audio with slide content, and recognizing the meaning of the instructor's ink. Our results include an evaluation of handwritten word recognition in the lecture domain, an approach for associating attentional marks with content, an analysis of linkage between speech and ink, and an application of recognition techniques to infer speaker actions.
Citations: 38
Implementation and evaluation of EXT3NS multimedia file system
Pub Date : 2004-10-10 DOI: 10.1145/1027527.1027668
B. Ahn, Sung-Hoon Sohn, Chei-Yol Kim, Gyuil Cha, Y. Baek, Sung-In Jung, Myungjoon Kim
EXT3NS is a scalable file system designed to handle video streaming workloads in large-scale on-demand streaming services. It is based on a special hardware device, called the Network-Storage card (NS card), which accelerates streaming by shortening the data path from the storage device to the network interface. The design objective of EXT3NS is to minimize the delay and the delay variance of I/O requests under sequential workloads on the NS card. The metadata structure, file organization, and unit of storage are tailored to this objective. Further, EXT3NS provides standard APIs to read and write files in the storage unit of the NS card. The streaming server uses them to obtain high disk I/O bandwidth, to avoid unnecessary memory copies on the data path from disk to network, and to alleviate the CPU's burden by offloading parts of network protocol processing. EXT3NS is a fully functional file system based on the popular EXT3. Performance measurements on our prototype video server show clear improvements: we obtain better results on a file system benchmark, and gains in disk reads and network transmission lead to an overall increase in streaming performance. In particular, the streaming server shows much lower CPU utilization and less fluctuation in client bit rate, making more reliable streaming service possible.
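The NS card shortens the disk-to-network path in hardware; the closest software analogue is zero-copy transmission. The sketch below, which is not the EXT3NS API, streams a file to a connected socket without copying data through user space, via Linux sendfile(2).

```python
import os
import socket

def stream_file(conn: socket.socket, path: str, chunk: int = 1 << 20) -> None:
    """Send the file at `path` over `conn` kernel-to-kernel, 1 MiB at a time."""
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        offset = 0
        while offset < size:
            sent = os.sendfile(conn.fileno(), f.fileno(), offset, chunk)
            if sent == 0:        # peer closed the connection early
                break
            offset += sent
```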
Citations: 10
Collaboration-aware peer-to-peer media streaming
Pub Date : 2004-10-10 DOI: 10.1145/1027527.1027625
S. Ye, F. Makedon
Peer-to-peer (P2P) media streaming has emerged as a promising solution for media streaming in large distributed systems such as the Internet. Several P2P media streaming solutions have been proposed by researchers; however, they all implicitly assume peers are collaborative, so they suffer from selfish peers that are not willing to collaborate. In this paper we introduce an incentive mechanism that urges selfish peers to behave collaboratively. It combines the traditional reputation-based approach with an online streaming-behavior monitoring scheme. Our preliminary results show that the overall performance achieved by collaborative peers does not suffer from the existence of non-collaborative peers. The incentive mechanism is orthogonal to existing media streaming solutions and can be integrated into them.
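The flavor of such an incentive mechanism can be sketched as an exponential blend of long-term reputation with recently monitored behavior; the weights, threshold, and class below are hypothetical, not taken from the paper.

```python
ALPHA = 0.7             # weight of historical reputation (assumed)
SERVE_THRESHOLD = 0.5   # peers scoring below this get degraded service (assumed)

class Peer:
    def __init__(self, peer_id, reputation=0.5):
        self.peer_id = peer_id
        self.reputation = reputation

    def observe(self, delivered, promised):
        """Fold one monitoring window into the reputation score."""
        behavior = delivered / promised if promised else 0.0
        self.reputation = ALPHA * self.reputation + (1 - ALPHA) * behavior

    def may_receive_stream(self):
        return self.reputation >= SERVE_THRESHOLD

p = Peer("n1")
p.observe(delivered=2, promised=10)    # selfish behavior drags the score down
print(round(p.reputation, 3), p.may_receive_stream())   # 0.41 False
```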
Citations: 22
A bootstrapping framework for annotating and retrieving WWW images
Pub Date : 2004-10-10 DOI: 10.1145/1027527.1027748
Huamin Feng, Rui Shi, Tat-Seng Chua
Most current image retrieval systems and commercial search engines use mainly text annotations to index and retrieve WWW images. This research explores the use of machine learning approaches to automatically annotate WWW images based on a predefined list of concepts by fusing evidence from image contents and their associated HTML text. One major practical limitation of employing supervised machine learning approaches is that effective learning needs a large set of labeled training samples. This is tedious and severely impedes the practical development of effective search techniques for WWW images, which are dynamic and fast-changing. As web-based images possess both intrinsic visual contents and text annotations, they provide a strong basis to bootstrap the learning process by adopting a co-training approach involving classifiers based on two orthogonal sets of features -- visual and text. The idea of co-training is to start from a small set of labeled training samples, and successively annotate a larger set of unlabeled samples using the two orthogonal classifiers. We carry out experiments using a set of over 5,000 images acquired from the Web. We explore the use of different combinations of HTML text and visual representations. We find that our bootstrapping approach can achieve a performance comparable to that of the supervised learning approach with an F1 measure of over 54%. At the same time, it offers the added advantage of requiring only a small initial set of training samples.
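The co-training loop itself is compact enough to sketch: two classifiers, one per feature view, alternately label the unlabeled samples they are most confident about. The model choice, confidence rule, and round counts below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def co_train(Xv, Xt, y, rounds=5, per_round=10):
    """Xv/Xt: visual and text feature views; y: labels, -1 where unlabeled.

    Assumes the initial labeled set contains at least two classes.
    """
    y = y.copy()
    for _ in range(rounds):
        for X in (Xv, Xt):                    # alternate between the two views
            known = y != -1
            unknown = np.where(~known)[0]
            if unknown.size == 0:
                return y
            clf = LogisticRegression(max_iter=1000).fit(X[known], y[known])
            proba = clf.predict_proba(X[unknown])
            # promote the most confidently predicted unlabeled samples
            best = np.argsort(-proba.max(axis=1))[:per_round]
            y[unknown[best]] = clf.classes_[proba[best].argmax(axis=1)]
    return y
```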
Citations: 113
Application of packet assembly technology to digital video and VoIP
Pub Date : 2004-10-10 DOI: 10.1145/1027527.1027620
T. Kanda, K. Shimamura
The Internet is composed of many kinds of networks, and those networks are composed of nodes such as routers. Routers spend processor power forwarding each packet, whatever its size, so the node processor becomes a throughput bottleneck when there are too many packets to forward. The authors therefore propose a packet assembly method. It aims to decrease the number of packets, and thus the processor load, based on the fact that many packets in backbone networks are much smaller than the maximum transmission unit (MTU).
 To examine packet assembly, the authors conducted two experiments. The first applies the packet assembly method to digital video traffic; it compares digital video forwarded via routers with and without packet assembly, and tracks the transition of edge-router and core-router load. The second applies the packet assembly method to VoIP traffic and investigates the influence on PSQM score, latency, and jitter.
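The mechanism is easy to sketch: small packets headed the same way are concatenated up to the MTU, with a short deadline bounding the extra latency. The parameters below are illustrative, and a real scheme would add per-packet length headers so the far end can disassemble the bundle.

```python
import time

MTU = 1500           # bytes
MAX_WAIT = 0.005     # seconds to hold the first packet of a bundle (assumed)

class Assembler:
    def __init__(self, send):
        self.send = send             # callback receiving each assembled frame
        self.buf = bytearray()
        self.deadline = None

    def push(self, pkt: bytes, now=None):
        now = time.monotonic() if now is None else now
        if len(self.buf) + len(pkt) > MTU:     # would overflow: emit first
            self.flush()
        if not self.buf:
            self.deadline = now + MAX_WAIT
        self.buf += pkt
        if now >= self.deadline:               # held long enough: emit
            self.flush()

    def flush(self):
        if self.buf:
            self.send(bytes(self.buf))
            self.buf.clear()
            self.deadline = None
```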
Citations: 1