2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR): Latest Papers

A Novel Correntropy Analysis Method with Application to Multi-view Feature Representation
Lei Gao, L. Guan
This paper proposes a novel correntropy analysis (CORA) method for multi-view feature representation. By jointly using correntropy and nonlinear kernel transformation, CORA measures the localized similarity between two random variables and reveals the intrinsic relation between them, leading to a high-quality feature representation. Unlike many existing feature representation techniques such as canonical correlation analysis (CCA) and kernel CCA (KCCA), CORA characterizes and explores the mutual relation of two random variables according to their probability density. In addition, unlike kernel entropy component analysis (KECA), which reveals structural information from a single data space only, CORA explores the mutual structural information between two data spaces jointly. The effectiveness of the proposed method is evaluated through experiments on audio emotion recognition and face recognition. Comparisons are conducted with statistical machine learning (SML) and deep neural network (DNN) based algorithms. The results show that the proposed CORA method outperforms the methods it is compared with.
DOI: https://doi.org/10.1109/MIPR51284.2021.00034
Published: 2021-09-01
Citations: 1
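For readers unfamiliar with the quantity behind CORA, the following is a minimal sketch of the standard sample estimator of correntropy with a Gaussian kernel, V(X, Y) ≈ (1/N) Σ κ_σ(x_i − y_i). It is not the authors' CORA method; the function names and the bandwidth `sigma` are illustrative assumptions. It only shows why correntropy behaves as a localized similarity measure: pairs that differ by much more than `sigma` contribute almost nothing.

```python
import math


def gaussian_kernel(u, sigma):
    """Gaussian kernel value for a difference u and bandwidth sigma."""
    norm = math.sqrt(2.0 * math.pi) * sigma
    return math.exp(-(u * u) / (2.0 * sigma * sigma)) / norm


def sample_correntropy(x, y, sigma=1.0):
    """Mean Gaussian-kernel similarity over paired samples of x and y."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    total = sum(gaussian_kernel(a - b, sigma) for a, b in zip(x, y))
    return total / len(x)


if __name__ == "__main__":
    # Identical samples reach the kernel's peak value.
    print(sample_correntropy([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
    # One badly mismatched pair contributes almost nothing to the average.
    print(sample_correntropy([0.0, 0.0], [0.0, 100.0]))
```

A smaller `sigma` makes the measure more local, because fewer sample pairs fall inside the kernel's effective width.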
Detecting and Preventing Faked Mixed Reality
Fabian Kilger, Alexandre Kabil, Volker Tippmann, G. Klinker, Marc-Oliver Pahl
Virtualized collaboration can significantly enhance remote management of critical infrastructures. Crises such as the current COVID-19 pandemic push the technology forward: they require remote management to keep our infrastructures running. Mixed Reality (MR) prototypes enable remote management in diverse fields such as medicine, Industry 4.0, energy systems, education, and cyber awareness. However, virtualized collaboration is still in its early stages. By design, MR is fake: its reality is generated from models. This makes detecting attacks very difficult. Many MR attacks result from well-known cybersecurity threats. This paper identifies classic attack surfaces, vectors, and concrete threats that are relevant for MR. It presents mitigation methods that can help secure the underlying data exchanges. However, distributed systems are often heterogeneous and under different management authorities, which makes securing the entire virtualized remote management stack difficult. The paper therefore also introduces considerations towards MR-client-based attack detection, i.e., MR forensics, including relevant features and the use of machine learning.
DOI: https://doi.org/10.1109/MIPR51284.2021.00074
Published: 2021-09-01
Citations: 0
The JinYue Database for Huqin Music Emotion, Scene and Imagery Recognition
Kejun Zhang, Xinda Wu, Ruiyuan Tang, Qiaoqiao Huang, Chang-yuan Yang, Hui Zhang
Traditional Chinese music is a great treasure for China and the rest of the world. It is performed on a variety of traditional instruments and features distinct melodies across different dynasties. Developing an efficient retrieval system for traditional Chinese music requires a large amount of such music data with rich and accurate annotations. However, existing databases usually cover popular and contemporary music, use a basic taxonomy, and target a single task. In this work, we introduce the JinYue database of more than 1,000 pieces of music played on variants of the huqin (huqin music), spanning the 20th century to the present. The database includes over 10,000 annotations of huqin music in terms of discrete emotion, scene, and imagery labels. Along with the database, we provide extensive benchmarks of multi-class classification results for emotion, scene, and imagery. Furthermore, because of copyright restrictions, we develop the JinYue Music Exploring System to provide information on over 1,000 pieces of huqin music, including metadata, audio features, and annotations. We will continue to collect music across categories of Chinese musical instruments to enrich the JinYue database. This database aims to push forward research in affective computing, music information retrieval, and beyond.
DOI: https://doi.org/10.1109/MIPR51284.2021.00059
Published: 2021-09-01
Citations: 1
Preference Analysis of Shopping Malls’ Followers and Keyword Recommendation on Twitter
Mantaro Yamada, Xueting Wang, T. Yamasaki
In this work, we analyze the preference features of shopping malls’ followers by examining their "following" and "like" behavior on Twitter. The analysis reveals their preferred topics and the differences among shopping malls that can be used for beneficial commercial applications such as effective promotion, marketing, or branding strategy. In addition, we propose a follower-oriented keyword recommendation method that leverages the followers’ preference. The method recommends keywords to use in a tweet to enhance popularity with the followers. It more directly helps shopping malls to use Twitter effectively for commercial applications.
DOI: https://doi.org/10.1109/MIPR51284.2021.00055
Published: 2021-09-01
Citations: 0
AIBO – A Sicko AI Brainwave Opera
E. Pearlman
OpenAI created the algorithm GPT-2 (Generative Pretrained Transformer 2) (now GPT-3) in February 2019. The algorithm creates imitations of human dialogue, producing fake but surprisingly realistic interactions. Using GPT-2, a ‘sicko’ AI was created as a live-time entity running in the Google cloud. AIBO (Artificial Intelligent Brainwave Opera) was one of two characters, the other being a human wearing a brain-computer interface, both part of an emotionally intelligent artificial intelligence brainwave opera. The opera asked two questions: "Can an AI be fascist?" and "Can an AI have epigenetic, or inherited traumatic memory?" This paper discusses aspects involved in building the GPT-2 cloud-based character AIBO and its synthetic emotions in a performative spoken word opera.
DOI: https://doi.org/10.1109/MIPR51284.2021.00060
Published: 2021-09-01
Citations: 0
Recalibration of Structured-Light RGB-D Cameras with Parametric Depth Error Correction
Peng-Yuan Kao, S. Shih, Y. Hung, Aye Mon Tun
Structured-light RGB-D cameras have been widely used in various applications. However, due to the deformation of internal camera parts, their depth estimation accuracy degrades with time. While it is easy to calibrate the camera parameters, writing the calibrated parameters back to the camera firmware is difficult. Therefore, existing methods compensate the depth measurements with different error correction functions. At present, as there are no simple and accurate parametric error correction methods, nonparametric calibration methods must be used when accurate depth measurements are required. The main drawback of such nonparametric approaches is that they require a large number of calibration images to calibrate a large error-correction lookup table. In this paper, we propose a simple parametric depth error correction model based on a Taylor-series approximation of the depth measurement equations. Experimental results show that the proposed method outperforms other parametric approaches and achieves results comparable to the state-of-the-art nonparametric method, although it uses only nine parameters.
DOI: https://doi.org/10.1109/MIPR51284.2021.00024
Published: 2021-09-01
Citations: 0
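To give a sense of what a Taylor-series view of depth error looks like, the sketch below uses the generic triangulation relation Z = f·b/d and its first-order expansion, ΔZ ≈ −(Z²/(f·b))·Δd. This is not the paper's nine-parameter model, which the abstract does not spell out; the focal length, baseline, and disparity error are hypothetical values chosen only for illustration.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Triangulated depth Z = f * b / d for a generic stereo or structured-light setup."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def first_order_depth_error(focal_px, baseline_m, depth_m, disparity_error_px):
    """First-order Taylor term of the depth error: dZ ~ -(Z**2 / (f * b)) * dd."""
    return -(depth_m ** 2) / (focal_px * baseline_m) * disparity_error_px


if __name__ == "__main__":
    f, b = 580.0, 0.075  # hypothetical focal length (pixels) and baseline (meters)
    dd = 0.1             # hypothetical disparity error (pixels)
    for z in (1.0, 2.0, 4.0):
        d = f * b / z
        exact = depth_from_disparity(f, b, d + dd) - z
        approx = first_order_depth_error(f, b, z, dd)
        print(f"Z={z:.1f} m  exact={exact:+.5f} m  first-order={approx:+.5f} m")
```

Under this generic model the error grows with the square of the depth, which is one reason a low-order parametric correction can be attractive compared with a dense lookup table.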
Weakly-Supervised Damaged Building Localization and Assessment with Noise Regularization
Maria Presa-Reyes, Shu‐Ching Chen
Not only does the destruction caused by natural disasters impair human lives, but it can also result in devastating damage to community infrastructure and possibly cause the loss of historic structures as well as vital documents. Technological advances in remote sensing survey tools such as satellite images and aerial photographs have allowed emergency responders to rapidly and remotely conduct a comprehensive assessment of the damage caused by a disaster event. Most previously proposed research on the automatic identification and prediction of building damage from optical remote sensing data depends on the availability of accurate geometric footprints of the structures in the affected area. However, the available building footprints may rapidly become outdated as new infrastructure is built while old infrastructure is demolished or renovated. We propose an end-to-end weakly-supervised damage assessment model that assumes the building footprint is unknown during training. Instead, only a rough estimate of the building’s location and the level of damage it sustained is available. Ablation tests are conducted on both a large-scale satellite imagery set and a smaller set of aerial photographs prepared and curated by our team to demonstrate our proposed model’s performance.
DOI: https://doi.org/10.1109/MIPR51284.2021.00009
Published: 2021-09-01
Citations: 0
One-Shot Example Videos Localization Network for Weakly-Supervised Temporal Action Localization
Yushu Liu, Weigang Zhang, Guorong Li, Li Su, Qingming Huang
This paper tackles the problem of example-driven weakly-supervised temporal action localization. We propose the One-shot Example Videos Localization Network (OSEVLNet) for precisely localizing the action instances in untrimmed videos with only one trimmed example video. Since frame-level ground truth is unavailable under weakly-supervised settings, our approach automatically trains a self-attention module with a reconstruction restriction and a feature discrepancy restriction. Specifically, the reconstruction restriction minimizes the discrepancy between the original input features and the reconstructed features of a Variational AutoEncoder (VAE) module. The feature discrepancy restriction maximizes the distance of weighted features between highly-responsive regions and slightly-responsive regions. Our approach achieves comparable or better results on the THUMOS’14 dataset than other weakly-supervised methods while being trained with far fewer videos. Moreover, our approach is especially suitable for extension to newly emerging action categories to meet the requirements of different occasions.
DOI: https://doi.org/10.1109/MIPR51284.2021.00026
Published: 2021-09-01
Citations: 0
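The feature discrepancy idea in the abstract above can be pictured with a small sketch. It is only one possible reading, not the OSEVLNet implementation: it assumes per-snippet feature vectors and attention scores in [0, 1], pools the features once with the attention weights (highly-responsive regions) and once with their complements (slightly-responsive regions), and returns the Euclidean distance between the two pooled vectors. All names are hypothetical.

```python
import math


def weighted_mean(features, weights):
    """Weighted average of equal-length feature vectors."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(features[0])
    return [
        sum(w * f[i] for f, w in zip(features, weights)) / total
        for i in range(dim)
    ]


def feature_discrepancy(features, attention):
    """Distance between attention-weighted and complement-weighted pooled features."""
    high = weighted_mean(features, attention)
    low = weighted_mean(features, [1.0 - a for a in attention])
    return math.sqrt(sum((h - l) ** 2 for h, l in zip(high, low)))


if __name__ == "__main__":
    snippets = [[1.0, 0.0], [0.0, 1.0]]
    # Sharp attention separates the two pooled features.
    print(feature_discrepancy(snippets, [1.0, 0.0]))
    # Uniform attention makes them identical, so the distance is zero.
    print(feature_discrepancy(snippets, [0.5, 0.5]))
```

A training objective in this spirit would reward larger values of this distance, which pushes the attention scores away from the uninformative uniform case.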
Music Emotion Recognition through Sparse Canonical Correlation Analysis
Hongwei Li, Hongjian Bo, Lin Ma, Lexiang Wang, Haifeng Li
For centuries, music has been an important part of various cultures and a special language for humans to express their thoughts and emotions. Music emotion plays an important role in music retrieval, mood detection, and other music-related applications, and music emotion recognition (MER) has become an active research topic worldwide. Traditional music emotion recognition overlooks the fact that the subject of emotion is the human listener: music acts on the brain to finally produce emotions. Therefore, this paper studies the mapping between music features and EEG features. Using sparse canonical correlation analysis, the music features are projected onto the EEG features to obtain new music feature vectors that contain EEG information. A support vector machine was trained and tested on the new music feature vectors, and good recognition results were obtained on both a self-built database and a public database. The proposed method takes advantage of EEG signals, which can reflect emotional expression intuitively and accurately. At the same time, the method has good transferability: when the EEG samples are representative, the projection vector is universal and can be used directly on other music databases.
几个世纪以来,音乐一直是各种文化的重要组成部分,也是人类表达思想和情感的特殊语言。音乐情感在音乐检索、情绪检测等音乐相关应用中发挥着重要作用。音乐情感识别(MER)已成为国际上的研究热点。传统的音乐情感识别忽略了情感的主体是人。音乐作用于大脑,最终产生情感。因此,本文研究了音乐特征与脑电特征之间的映射关系。通过稀疏典型相关方法,将音乐特征投影到脑电特征上,得到新的包含脑电信息的音乐特征向量。利用支持向量机对新的音乐特征向量进行训练和测试,在自建库和公共库中均取得了较好的识别效果。本文提出的方法结合了脑电图信号最直观、最准确地反映情绪表达的优点。同时,该方法具有良好的可移植性。当脑电样本具有代表性时,该投影向量具有通用性,可直接用于其他音乐数据库。
{"title":"Music Emotion Recognition through Sparse Canonical Correlation Analysis","authors":"Hongwei Li, Hongjian Bo, Lin Ma, Lexiang Wang, Haifeng Li","doi":"10.1109/MIPR51284.2021.00066","DOIUrl":"https://doi.org/10.1109/MIPR51284.2021.00066","url":null,"abstract":"For centuries, music has been an important part of various cultures and a special language for humans to express their thoughts and emotions. Music emotion plays an important role in music retrieval, mood detection and other music-related applications. Music emotion recognition (MER) has become a research hotspot in the world. The traditional music emotion recognition ignores that the subject of emotions is human. Music acts on the brain to finally produce emotions. Therefore, this paper studies the mapping relationship between music features and EEG features. Through the sparse canonical correlation method, the music features are projected onto the EEG features to obtain the new music feature vectors containing EEG information. The support vector machine was used to train and test the new music feature vectors, and good recognition results were obtained in both the self-built database and the public database. The method proposed in this paper combines the advantages of EEG signals that can reflect the most intuitive and accurate emotional expression. At the same time, our method has good transferability. 
When the EEG samples are representative, the projection vector is universal and can be directly used in other music database.","PeriodicalId":139543,"journal":{"name":"2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133902050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
MemoMusic: A Personalized Music Recommendation Framework Based on Emotion and Memory
Luntian Mou, Jueying Li, Juehui Li, Feng Gao, Ramesh C. Jain, Baocai Yin
Music is universally recognized as an effective way for humans to express emotion and regulate emotional states. However, perceived music emotion is subjective and depends heavily on culture, environment, and life experience. Personalized music recommendation is therefore necessary both to satisfy users and to guide a listener toward a more positive emotional state. Existing work on emotion-based and personalized music recommendation often fails to consider the impact of past life experiences on the perception of music emotion. We argue that memories associated with music could play a vital role in determining a listener's new emotional state after listening. To verify this hypothesis, we propose a personalized music recommendation framework called MemoMusic, which estimates the new emotional state of a listener from the individual's current emotional state and the memories possibly associated with the music being listened to. For the preliminary experiment, a dataset of 60 piano pieces in three categories (Classical, Popular, and Yanni music) was collected and labelled using the Valence-Arousal model. Experimental results demonstrate that memory is indeed an important factor in determining perceived music emotion, and that MemoMusic, by drawing on both emotion and memory, performs well at improving a listener's emotional state.
DOI: https://doi.org/10.1109/MIPR51284.2021.00064 (published 2021-09-01)
Citations: 5