
Latest Publications: 2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)

Emotional Acceptance Measure (EAM): An Objective Evaluation Method Towards Information Communication Effect
Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859363
Xiao Yang, Haonan Cheng, Hanyang Song, Li Yang, Long Ye
In this paper, we propose the Emotional Acceptance Measure (EAM), an objective information communication effect evaluation (OICEE) method based on the theory of information entropy. Existing evaluation methods for information communication mostly rely on questionnaires and expert scoring, which consume considerable human effort. To address this issue, we tackle the previously unexplored OICEE task and take the first step toward objectively assessing the effect that communication behavior produces in the emotional dimension. Specifically, we construct a dataset for evaluating the information communication effect, design a CNN-BiGRU model based on the self-attention mechanism to calculate emotional information entropy, and propose a formula for calculating the EAM score. For the first time, we introduce a way to objectively evaluate the information communication effect from the emotional dimension. A comparison against manually annotated evaluations shows that the EAM score achieves a 94.41% correlation with subjective user evaluations, which demonstrates the reasonableness and validity of our proposed objective evaluation method.
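The paper derives the EAM score from an emotional information entropy computed over the model's emotion predictions. The exact formula is not reproduced here, so the following is a minimal Python sketch under stated assumptions: the model outputs a probability distribution over emotion classes, and the illustrative score simply inverts the normalized Shannon entropy (the names `emotional_entropy` and `eam_score` are placeholders, not the paper's definitions).

```python
import numpy as np

def emotional_entropy(emotion_probs: np.ndarray) -> float:
    """Shannon entropy (in bits) of a distribution over emotion classes."""
    p = emotion_probs[emotion_probs > 0]  # drop zero-probability classes
    return float(-np.sum(p * np.log2(p)))

def eam_score(emotion_probs: np.ndarray) -> float:
    """Illustrative acceptance score: normalized entropy, inverted so that a
    more concentrated emotional response scores higher. This is an assumed
    form, not the paper's actual EAM formula."""
    max_entropy = np.log2(len(emotion_probs))  # entropy of the uniform case
    return 1.0 - emotional_entropy(emotion_probs) / max_entropy

# A response concentrated on one emotion yields a score close to 1.
print(eam_score(np.array([0.7, 0.1, 0.1, 0.05, 0.05])))
```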
Citations: 0
Deep Point Cloud Normal Estimation via Triplet Learning (Demonstration)
Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859360
Weijia Wang, Xuequan Lu, Dasith de Silva Edirimuni, Xiao Liu, A. Robles-Kelly
In this demonstration paper, we show the technical details of our proposed triplet learning-based point cloud normal estimation method. Our network architecture consists of two phases: (a) feature encoding, which learns representations of local patches, and (b) normal estimation, which takes the learned representations as input to regress normals. We are motivated by the observation that local patches on isotropic surfaces have similar normals while those on anisotropic surfaces have distinct normals, and that such separable representations can be learned to facilitate normal estimation. Experiments show that our method preserves sharp features and achieves good normal estimation results, especially on computer-aided design (CAD) shapes.
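To make the triplet idea concrete, here is a minimal PyTorch sketch: patches whose normals should agree (anchor and positive) are pulled together in embedding space, while patches across a sharp feature (negative) are pushed apart. The `PatchEncoder` is a placeholder point-wise MLP, not the authors' architecture, and the random tensors stand in for sampled patches.

```python
import torch
import torch.nn as nn

class PatchEncoder(nn.Module):
    """Placeholder encoder: per-point MLP followed by max-pooling over the
    patch, mapping a (B, N, 3) patch batch to (B, embed_dim) embeddings."""
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                                 nn.Linear(64, embed_dim))

    def forward(self, patch: torch.Tensor) -> torch.Tensor:
        return self.mlp(patch).max(dim=-2).values

encoder = PatchEncoder()
triplet_loss = nn.TripletMarginLoss(margin=1.0)

anchor   = torch.randn(8, 256, 3)  # patches from a smooth (isotropic) region
positive = torch.randn(8, 256, 3)  # nearby patches with similar normals
negative = torch.randn(8, 256, 3)  # patches straddling a sharp feature

loss = triplet_loss(encoder(anchor), encoder(positive), encoder(negative))
loss.backward()
```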
Citations: 0
Low-Power Semantic Segmentation on Embedded Systems for Traffic in Asian Countries
Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859408
Jun-Long Wang
Real-world road scenes demand a fast and powerful system that can detect any traffic situation and compute predictions within a limited hardware environment. In this paper, we present a semantic segmentation model that performs well in complex and dynamic Asian road scenes, and we evaluate its accuracy, power consumption, and speed on the MediaTek Dimensity 9000 platform. We implement a Deep Dual-resolution Network (DDRNet) in PyTorch and deploy it in TensorFlow Lite format on the MediaTek chip to assess the model. We adopt a two-stage training strategy and a decreasing-training-resolution technique to further improve the results. Our team won first place in the Low-power Deep Learning Semantic Segmentation Model Compression Competition for Traffic Scenes in Asian Countries at the IEEE International Conference on Multimedia & Expo (ICME) 2022.
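The decreasing-training-resolution technique can be pictured as a schedule that downscales training inputs as epochs progress; the stage boundary and the resolutions below are illustrative assumptions, not the competition settings.

```python
import torch.nn.functional as F

# Illustrative two-stage schedule: full resolution first, then a lower
# resolution for later epochs (epoch boundary and sizes are assumptions).
RESOLUTION_SCHEDULE = [(0, (1024, 2048)), (60, (512, 1024))]

def resolution_for_epoch(epoch: int):
    size = RESOLUTION_SCHEDULE[0][1]
    for start_epoch, s in RESOLUTION_SCHEDULE:
        if epoch >= start_epoch:
            size = s
    return size

def resize_batch(images, epoch):
    """Bilinearly resize a (B, C, H, W) batch to the scheduled resolution."""
    return F.interpolate(images, size=resolution_for_epoch(epoch),
                         mode="bilinear", align_corners=False)
```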
Citations: 0
PressyCube: An Embeddable Pressure Sensor with Softy Prop for Limb Rehabilitation in Immersive Virtual Reality
Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859433
Chain Yi Chu, Ho Yin Ng, Chia-Hui Lin, Ping-Hsuan Han
In this demo, we present PressyCube, an embeddable pressure sensor that collects physical force data from different types of exercises and body parts via Bluetooth for home-based rehabilitation. Currently, most home-based rehabilitation devices focus on one specific body part only. We therefore designed the pressure sensor to be embeddable in other props, extending its range of applications. We also demonstrate how our work can fulfill varying rehabilitation needs in a virtual environment with a head-mounted display, through a swimming-simulation exergame that lets players work out using different exercises.
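Reading force samples from such a sensor over Bluetooth Low Energy might look like the sketch below, using the `bleak` Python library. The device address, characteristic UUID, and the little-endian float byte layout are all placeholders, since the demo does not document its wire protocol.

```python
import asyncio
import struct
from bleak import BleakClient

DEVICE_ADDRESS = "AA:BB:CC:DD:EE:FF"  # placeholder BLE address
PRESSURE_CHAR_UUID = "0000ffe1-0000-1000-8000-00805f9b34fb"  # hypothetical

async def read_pressure() -> float:
    """Connect, read one pressure sample, and decode it (layout assumed)."""
    async with BleakClient(DEVICE_ADDRESS) as client:
        raw = await client.read_gatt_char(PRESSURE_CHAR_UUID)
        return struct.unpack("<f", raw[:4])[0]  # assume float32, newtons

print(asyncio.run(read_pressure()))
```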
Citations: 0
A Transformer-Based Approach for Metal 3d Printing Quality Recognition
Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859324
Weihao Zhang, Jiapeng Wang, Honglin Ma, Qi Zhang, Shuqian Fan
The mass of unlabeled production data hinders the large-scale application of advanced supervised learning techniques in modern industry. Metal 3D printing generates huge amounts of in-situ data that are closely related to the forming quality of parts. To avoid the labor cost of re-labeling the dataset whenever printing materials or process parameters change, we design a forming-quality recognition model based on deep clustering, which makes the forming-quality recognition task for metal 3D printing more flexible. Inspired by the success of the Vision Transformer, we introduce convolutional neural networks into the Vision Transformer structure to model the inductive bias of images while learning global representations. Our approach achieves state-of-the-art accuracy compared with other Vision Transformer-based models. In addition, our proposed framework is a good candidate for specific industrial vision tasks where annotations are scarce.
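A common way to inject convolutional inductive bias into a Vision Transformer is to replace the linear patch projection with a convolutional stem. The sketch below shows that general idea in plain PyTorch; the layer sizes and depths are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class ConvStemViT(nn.Module):
    """Toy hybrid: a conv stem supplies locality bias, and a Transformer
    encoder models global context. All dimensions are illustrative."""
    def __init__(self, num_classes: int = 2, dim: int = 256):
        super().__init__()
        self.stem = nn.Sequential(            # 224x224 image -> 14x14 tokens
            nn.Conv2d(3, 64, kernel_size=7, stride=4, padding=3), nn.ReLU(),
            nn.Conv2d(64, dim, kernel_size=4, stride=4))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        tokens = self.stem(x).flatten(2).transpose(1, 2)  # (B, 196, dim)
        return self.head(self.encoder(tokens).mean(dim=1))

logits = ConvStemViT()(torch.randn(2, 3, 224, 224))  # (2, num_classes)
```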
Citations: 1
Melodic Skeleton: A Musical Feature for Automatic Melody Harmonization
Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859421
Weiyue Sun, Jianguo Wu, Shengcheng Yuan
Recently, deep learning models have achieved good performance on automatic melody harmonization. However, these models often take the melody note sequence directly as input without any feature extraction or analysis, so a large dataset is required to maintain generalization. Inspired by the music theory of counterpoint writing, we introduce a novel musical feature called the melodic skeleton, which summarizes melody movement with strong harmony-related information. Based on this feature, we propose a pipeline involving a skeleton analysis model for the melody harmonization task. We collected a dataset by inviting musicians to annotate the skeleton tones in melodies, and trained the skeleton analysis model on it. Experiments show a substantial improvement on six metrics commonly used to evaluate melody harmonization, proving the effectiveness of the feature.
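The paper's skeleton tones are annotated by musicians and predicted by a learned analysis model; still, a naive rule-based proxy helps fix the intuition. The sketch below keeps only the notes that start on strong beats, and its note-tuple format is an assumption.

```python
def skeleton_notes(melody, beats_per_bar=4, ticks_per_beat=480):
    """Naive skeleton proxy: keep notes starting on beats 1 and 3 of a 4/4
    bar. melody is a list of (onset_tick, pitch, duration) tuples (assumed
    format); the real skeleton in the paper is learned, not rule-based."""
    strong_offsets = {0, 2 * ticks_per_beat}
    bar_length = beats_per_bar * ticks_per_beat
    return [note for note in melody if note[0] % bar_length in strong_offsets]

melody = [(0, 60, 480), (480, 62, 480), (960, 64, 480), (1440, 65, 480)]
print(skeleton_notes(melody))  # keeps the notes on beats 1 and 3
```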
Citations: 1
Foldingnet-Based Geometry Compression of Point Cloud with Multi Descriptions
Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859339
Xiaoqi Ma, Qian Yin, Xinfeng Zhang, Lv Tang
Traditional point cloud compression (PCC) methods are not effective in extremely low bit-rate scenarios because of uniform quantization. Although learning-based PCC approaches can achieve superior compression performance, they need to train multiple models for different bit rates, which greatly increases training complexity and memory usage. To tackle these challenges, this paper proposes a novel FoldingNet-based Point Cloud Geometry Compression (FN-PCGC) framework. First, the point cloud is divided into several descriptions by a Multiple-Description Generation (MDG) module. Then a point-based auto-encoder with Multi-scale Feature Extraction (MFE) is introduced to compress all the descriptions. Experimental results show that the proposed method outperforms MPEG G-PCC and Draco, with a gain of about 30% ~ 80% on average.
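One simple way to realize multiple-description generation is to partition the cloud into complementary subsets that are compressed independently, so that losing one description still leaves a decodable, lower-density cloud. The paper does not specify its splitting rule, so the random interleaving below is an assumption.

```python
import numpy as np

def split_into_descriptions(points: np.ndarray, num_descriptions: int = 2):
    """Partition an (N, 3) point cloud into complementary random subsets,
    one per description. The random interleaving rule is an assumption."""
    idx = np.random.permutation(len(points))
    return [points[idx[d::num_descriptions]] for d in range(num_descriptions)]

cloud = np.random.rand(10000, 3).astype(np.float32)
desc_a, desc_b = split_into_descriptions(cloud)
assert len(desc_a) + len(desc_b) == len(cloud)
```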
Citations: 2
Accelerating Brain Research using Explainable Artificial Intelligence
Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859322
Jing-Lun Chou, Ya-Lin Huang, Chia-Ying Hsieh, Jian-Xue Huang, Chunshan Wei
In this demo, we present ExBrainable, an open-source application dedicated to modeling, evaluating, and visualizing explainable CNN-based models on EEG data for brain/neuroscience research. We have implemented functions including EEG data loading, model training, evaluation, and parameter visualization. The application also ships with a model base of representative convolutional neural network architectures that users can apply without any programming. With its easy-to-use graphical user interface (GUI), it is fully accessible to investigators from different disciplines with limited resources and limited programming skills. Starting from preprocessed EEG data, users can quickly train the desired model, evaluate its performance, and finally visualize the features learned by the model without difficulty.
Citations: 1
MICW: A Multi-Instrument Music Generation Model Based on the Improved Compound Word
Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859531
Yi-Jr Liao, Wang Yue, Yuqing Jian, Zijun Wang, Yuchong Gao, Chenhao Lu
In this work, we address the task of multi-instrument music generation. Notably, along with the development of artificial neural networks, deep learning has become a leading technique for accelerating automatic music generation, featured in many previous works such as MuseGan [1], MusicBert [2], and PopMAG [3]. However, few of them implement a well-designed representation of multi-instrumental music, and no model fully incorporates prior knowledge of music theory. In this paper, we leverage the Compound Word [4] and R-drop [5] methods for multi-instrument music generation tasks. Objective and subjective evaluations show that our model requires less training time while the generated music achieves prominent quality.
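R-drop itself is well documented: the same batch is passed through the model twice with independent dropout masks, and a symmetric KL divergence between the two output distributions is added to the usual cross-entropy loss. A minimal PyTorch sketch (the weight `alpha` is a tunable assumption):

```python
import torch.nn.functional as F

def rdrop_loss(model, tokens, targets, alpha: float = 0.7):
    """R-drop: two stochastic forward passes plus a symmetric KL
    consistency term. Requires model.train() so dropout is active."""
    logits1 = model(tokens)   # first pass, one dropout mask
    logits2 = model(tokens)   # second pass, an independent dropout mask
    ce = F.cross_entropy(logits1, targets) + F.cross_entropy(logits2, targets)
    p1 = F.log_softmax(logits1, dim=-1)
    p2 = F.log_softmax(logits2, dim=-1)
    kl = (F.kl_div(p1, p2, log_target=True, reduction="batchmean")
          + F.kl_div(p2, p1, log_target=True, reduction="batchmean"))
    return 0.5 * ce + alpha * 0.5 * kl
```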
Citations: 0
A Low-Cost Virtual 2D Spokes-Character Advertising Framework
Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859278
Jiarun Zhang, Zhao Li, Jialun Zhang, Zhiqiang Zhang
Live-streaming advertising has achieved huge success on modern retail platforms. However, small-scale merchants are neither economically nor technically capable of having their own spokesperson. To address the need for massive online interactive advertising, this paper proposes an economical and efficient approach, Virtual Spokes-Character Advertising (VSCA). VSCA generates 2D virtual spokes-character advertising videos and provides them to merchants as a supplementary marketing method. VSCA first generates a simplified natural-language description of the merchandise from its original long title using text-generation methods, then passes it to a text-to-speech model for the audio description. Second, VSCA feeds the audio to our remodeled two-phase lip-syncing network to generate virtual advertising videos for the given merchandise. With our newly designed two-phase lip-syncing network, it is the first in the industry able to generate lip-synced video from given audio using a human face image as input instead of a video. As the industry's first application of 2D spokes-character advertising, VSCA has large potential in real-world applications.
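The described pipeline chains three stages: title simplification, text-to-speech, and lip-syncing. A skeletal orchestration follows; every function is a hypothetical placeholder standing in for a model from the paper, not a real API.

```python
def simplify_title(long_title: str) -> str:
    """Text-generation stage: long merchandise title -> short description."""
    raise NotImplementedError  # placeholder for the text-generation model

def text_to_speech(description: str) -> bytes:
    """TTS stage: description text -> audio waveform bytes."""
    raise NotImplementedError  # placeholder for the TTS model

def lip_sync(face_image_path: str, audio: bytes) -> str:
    """Lip-syncing stage: face image + audio -> rendered video path."""
    raise NotImplementedError  # placeholder for the two-phase network

def make_advert(long_title: str, face_image_path: str) -> str:
    description = simplify_title(long_title)
    audio = text_to_speech(description)
    return lip_sync(face_image_path, audio)
```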
Citations: 0