
Latest Publications from the 2011 IEEE International Symposium on Multimedia

Rule of Thirds Detection from Photograph
Pub Date: 2011-12-05 DOI: 10.1109/ISM.2011.23
Long Mai, Hoang Le, Yuzhen Niu, Feng Liu
The rule of thirds is one of the most important composition rules used by photographers to create high-quality photos. The rule of thirds states that placing important objects along the imaginary thirds lines or around their intersections often produces highly aesthetic photos. In this paper, we present a method to automatically determine whether a photo respects the rule of thirds. Detecting the rule of thirds from a photo requires semantic content understanding to locate important objects, which is beyond the state of the art. This paper uses recent saliency and generic objectness analysis as an alternative and accordingly designs a range of features. Our experiments with a variety of saliency and generic objectness methods show that encouraging performance can be achieved in detecting the rule of thirds from photos.
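Where the abstract mentions measuring how salient content aligns with the thirds lines, a toy version of that measurement can be sketched directly. The Python below is only an illustration of the underlying geometry, not the authors' feature design: it substitutes a gradient-magnitude map for a real saliency or objectness detector, and the 5% proximity threshold is an arbitrary assumption.

```python
# Minimal sketch: score a photo by how close its saliency-weighted centroid
# lies to an imaginary thirds line. Not the paper's method; the saliency
# proxy and threshold are assumptions.
import numpy as np

def saliency_proxy(gray: np.ndarray) -> np.ndarray:
    """Crude saliency stand-in: gradient magnitude of a grayscale image."""
    gy, gx = np.gradient(gray.astype(np.float64))
    return np.hypot(gx, gy)

def thirds_score(gray: np.ndarray) -> float:
    """Normalized distance from the saliency-weighted centroid to the
    nearest thirds line (0 = centroid exactly on a thirds line)."""
    sal = saliency_proxy(gray)
    h, w = sal.shape
    ys, xs = np.mgrid[0:h, 0:w]
    total = sal.sum() + 1e-9
    cy, cx = (sal * ys).sum() / total, (sal * xs).sum() / total
    # Distance to the nearer of the two horizontal / two vertical lines,
    # normalized by image height and width respectively.
    dy = min(abs(cy / h - 1 / 3), abs(cy / h - 2 / 3))
    dx = min(abs(cx / w - 1 / 3), abs(cx / w - 2 / 3))
    return min(dx, dy)

# Hypothetical usage: flag a photo as respecting the rule of thirds if the
# salient centroid lies within 5% of a thirds line (threshold assumed).
if __name__ == "__main__":
    img = np.random.rand(300, 400)  # placeholder for a real grayscale photo
    print(thirds_score(img) < 0.05)
```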
Citations: 52
Affective Video Summarization and Story Board Generation Using Pupillary Dilation and Eye Gaze
Pub Date: 2011-12-01 DOI: 10.1109/ISM.2011.57
Harish Katti, Karthik Yadati, M. Kankanhalli, Tat-Seng Chua
We propose a semi-automated, eye-gaze based method for affective analysis of videos. Pupillary Dilation (PD) is introduced as a valuable behavioural signal for assessing subject arousal and engagement. We use PD information for computationally inexpensive, arousal-based composition of video summaries and descriptive story-boards. Video summarization and story-board generation are performed offline, after a subject has viewed the video. The method also includes novel eye-gaze analysis and fusion with content-based features to discover affective segments of videos and the Regions of Interest (ROIs) they contain. The effectiveness of the framework is evaluated through experiments over a diverse set of clips and a significant pool of subjects, and through comparison with a fully automated state-of-the-art affective video summarization algorithm. Acquisition and analysis of PD information is demonstrated and used as a proxy for human visual attention in arousal-based video summarization and story-board generation. An important contribution is to demonstrate the usefulness of PD information in identifying affective video segments with abstract semantics or affective elements of discourse and story-telling that are likely to be missed by automated methods. Another contribution is the use of eye fixations in close temporal proximity to PD-based events for key-frame extraction and subsequent story-board generation. We also show how PD-based video summarization can be used either to generate a personalized video summary or to represent a consensus over the affective preferences of a larger group or community.
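As a rough illustration of arousal-based segment selection from a pupil-diameter trace (not the authors' pipeline), the sketch below smooths the PD signal and keeps spans above a percentile threshold. The uniform sampling, one-second window, and 80th-percentile cut-off are all invented for the example.

```python
# Minimal sketch: treat spans of elevated (smoothed) pupil diameter as
# high-arousal segments. Parameters are assumptions, not from the paper.
import numpy as np

def arousal_segments(pd_signal: np.ndarray, fps: float,
                     win_s: float = 1.0, pct: float = 80.0):
    """Return (start_s, end_s) spans where the smoothed pupil diameter
    exceeds the given percentile, taken as a proxy for high arousal."""
    win = max(1, int(win_s * fps))
    kernel = np.ones(win) / win
    smooth = np.convolve(pd_signal, kernel, mode="same")  # moving average
    high = smooth > np.percentile(smooth, pct)
    # Collect contiguous runs of above-threshold samples as segments.
    segments, start = [], None
    for i, flag in enumerate(high):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            segments.append((start / fps, i / fps))
            start = None
    if start is not None:
        segments.append((start / fps, len(high) / fps))
    return segments

# Hypothetical usage with a synthetic 60-second trace sampled at 30 Hz.
sig = np.random.rand(1800) + np.sin(np.linspace(0, 6, 1800))
print(arousal_segments(sig, fps=30.0)[:3])
```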
Citations: 36
Cooking Ingredient Recognition Based on the Load on a Chopping Board during Cutting
Pub Date: 2011-12-01 DOI: 10.1109/ISM.2011.69
Yoko Yamakata, Yoshiki Tsuchimoto, Atsushi Hashimoto, Takuya Funatomi, Mayumi Ueda, M. Minoh
This paper presents a method for recognizing recipe ingredients based on the load on a chopping board as ingredients are cut. The load is measured by four sensors attached to the board. Each chop is detected by identifying a sharp falling edge in the load data. Load features, including the maximum value, duration, impulse, peak position, and kurtosis, are extracted and used for ingredient recognition. Experimental results showed a precision of 98.1% in chop detection and 67.4% in ingredient recognition with a support vector machine (SVM) classifier for 16 common ingredients.
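The pipeline in this abstract is concrete enough to sketch end to end: detect sharp falling edges, extract the five named load features around each event, and classify with an SVM. The Python below is a hedged illustration, not the paper's implementation; the falling-edge threshold, window size, synthetic data, and use of scikit-learn's SVC are assumptions.

```python
# Minimal sketch of chop detection + feature extraction + SVM classification
# on a 1-D load signal. All thresholds and data are invented.
import numpy as np
from scipy.stats import kurtosis
from sklearn.svm import SVC

def detect_chops(load: np.ndarray, drop_thresh: float = 5.0) -> np.ndarray:
    """Indices where the load falls sharply from one sample to the next."""
    return np.where(np.diff(load) < -drop_thresh)[0]

def chop_features(load: np.ndarray, idx: int, win: int = 50) -> np.ndarray:
    """Features named in the abstract, computed on a window around a chop:
    maximum value, duration above half-max, impulse (area under the load),
    peak position within the window, and kurtosis."""
    seg = load[max(0, idx - win): idx + win]
    peak = seg.max()
    return np.array([peak,
                     np.count_nonzero(seg > peak / 2),  # duration
                     seg.sum(),                          # impulse
                     int(seg.argmax()),                  # peak position
                     kurtosis(seg)])

# Hypothetical end-to-end usage: detect chops in a synthetic trace, then
# classify feature vectors with an SVM trained on made-up labelled data.
load = np.zeros(400)
load[195:201] = [5, 12, 30, 28, 25, 20]   # load rises as the knife presses
load[201] = 2                              # sharp falling edge: the chop
feats = np.array([chop_features(load, i) for i in detect_chops(load)])

X_train = np.random.rand(64, 5)            # stand-in training features
y_train = np.random.randint(0, 16, 64)     # 16 ingredient classes, as in the paper
clf = SVC(kernel="rbf").fit(X_train, y_train)
print(clf.predict(feats))
```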
Citations: 10
Music Genre Classification Based on Entropy and Fractal Lacunarity
Pub Date: 2011-12-01 DOI: 10.1109/ISM.2011.94
A. Goulart, Carlos Dias Maciel, R. Guido, Katia Cristina Silva Paulo, Ivan Nunes da Silva
In this letter, we present an automatic music genre classification scheme based on a Gaussian Mixture Model (GMM) classifier. The proposed technique adopts entropies and lacunarities as features for classification. Tests were carried out with four styles of Brazilian music, namely Axé, Bossa Nova, Forró, and Samba.
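A per-genre GMM scored by log-likelihood is a common way to realize such a classifier. The sketch below illustrates that pattern under invented assumptions (diagonal covariances, four mixture components, and random stand-ins for the entropy and lacunarity features); it is not the authors' configuration.

```python
# Minimal sketch: one GMM per genre, pick the genre whose model gives the
# highest average log-likelihood. Feature layout and GMM settings assumed.
import numpy as np
from sklearn.mixture import GaussianMixture

def train_genre_gmms(features_by_genre: dict, n_components: int = 4) -> dict:
    """Fit one GMM per genre on (n_frames, n_features) arrays of, e.g.,
    per-frame entropy and lacunarity descriptors."""
    return {
        genre: GaussianMixture(n_components=n_components,
                               covariance_type="diag").fit(feats)
        for genre, feats in features_by_genre.items()
    }

def classify(models: dict, feats: np.ndarray) -> str:
    """Assign the genre whose GMM gives the highest mean log-likelihood."""
    return max(models, key=lambda g: models[g].score(feats))

# Hypothetical usage with random stand-ins for the real features.
rng = np.random.default_rng(0)
data = {g: rng.normal(loc=i, size=(200, 2))
        for i, g in enumerate(["Axe", "Bossa Nova", "Forro", "Samba"])}
models = train_genre_gmms(data)
print(classify(models, rng.normal(loc=2, size=(50, 2))))  # likely "Forro"
```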
Citations: 6
Adaptive Video Compression for Video Surveillance Applications
Pub Date: 2011-12-01 DOI: 10.1109/ISM.2011.38
Andrew D. Bagdanov, M. Bertini, A. Bimbo, Lorenzo Seidenari
This article describes an approach to adaptive video coding for video surveillance applications. Using a combination of low-level features with low computational cost, we show how the quality of video compression can be controlled so that semantically meaningful elements of the scene are encoded with higher fidelity, while background elements are allocated fewer bits in the transmitted representation. Our approach is based on adaptive smoothing of individual video frames so that image features highly correlated with semantically interesting objects are preserved. Using only low-level image features on individual frames, this adaptive smoothing can be seamlessly inserted into a video coding pipeline as a pre-processing stage. Experiments show that our technique is efficient, outperforms standard H.264 encoding at comparable bit rates, and preserves features critical for downstream detection and recognition.
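The saliency-guided smoothing step can be illustrated with a short sketch. The Python below is an assumption-laden stand-in for the authors' method: it simply blends a Gaussian-blurred copy of the frame into low-saliency regions before the frame reaches the encoder, with the threshold and kernel size chosen arbitrarily.

```python
# Minimal sketch of adaptive pre-encoding smoothing: blur background pixels
# so the encoder spends fewer bits on them. Threshold/kernel are assumptions.
import cv2
import numpy as np

def adaptive_smooth(frame: np.ndarray, saliency: np.ndarray,
                    thresh: float = 0.5, ksize: int = 15) -> np.ndarray:
    """Blend a blurred copy of the frame back in wherever saliency is low,
    preserving detail in semantically interesting regions."""
    blurred = cv2.GaussianBlur(frame, (ksize, ksize), 0)
    mask = (saliency >= thresh).astype(np.float32)[..., None]  # 1 = keep sharp
    return (mask * frame + (1.0 - mask) * blurred).astype(frame.dtype)

# Hypothetical usage with a random frame and a fake saliency map; in a real
# pipeline the result would be handed to a standard H.264 encoder unchanged.
frame = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
sal = np.zeros((480, 640), dtype=np.float32)
sal[100:300, 200:400] = 1.0  # pretend a detector marked this region salient
out = adaptive_smooth(frame, sal)
print(out.shape)
```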
Citations: 18
An Adaptive Splitting and Transmission Control Method for Rendering Point Model on Mobile Devices
Pub Date: 2010-02-19 DOI: 10.1145/1730804.1730977
Yajie Yan, Xiaohui Liang, Ke Xie, Qinping Zhao
The physical characteristics of current mobile devices impose significant constraints on the processing of 3D graphics. The remote rendering framework is considered a better choice in this regard. However, limited battery life is a critical constraint when using this approach, and earlier methods based on this framework suffered from high transmission frequency. We present a software solution to this problem whose key element, an Adaptive Splitting and Error Handling Mechanism, indirectly conserves battery power on mobile devices by reducing the transmission frequency. To achieve this goal, a geometric relation is maintained that tightly couples several consecutive Levels of Detail (LOD). Adaptive Splitting can then approximate the LODs from a much coarser split base under the guidance of this relation. Data transmission between the server and the mobile device occurs only when an out-of-range LOD is about to be displayed. Our remote rendering architecture, based on the above approach, trades additional splitting computation for reduced transmission, thereby alleviating the problem of frequent data transmission.
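The transmission-control idea, requesting data only when an out-of-range LOD is about to be displayed, can be sketched as client-side logic. The class layout and fetch callback below are hypothetical, invented purely to illustrate the decision rule rather than the paper's architecture.

```python
# Minimal sketch of the client-side decision rule: hold a contiguous range
# of LODs locally and contact the server only when the requested level
# falls outside it. All names here are illustrative assumptions.
class LODCache:
    def __init__(self, lo: int, hi: int, fetch):
        self.lo, self.hi = lo, hi      # LOD levels currently held locally
        self.fetch = fetch             # callback that pulls data from server

    def render_level(self, level: int):
        if not (self.lo <= level <= self.hi):
            # Out-of-range LOD requested: one transmission extends the range;
            # intermediate levels are approximated locally by adaptive
            # splitting rather than fetched one by one.
            self.lo, self.hi = self.fetch(level)
        return level  # stand-in for actually splatting points at this level

def fake_fetch(level: int):
    print(f"transmitting LOD range around level {level}")
    return level - 2, level + 2  # server sends a coarse base spanning 5 levels

cache = LODCache(0, 4, fake_fetch)
for lvl in [1, 3, 4, 5, 6, 7]:   # zooming in: only one fetch is triggered
    cache.render_level(lvl)
```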
Citations: 2