
2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW): Latest Publications

3D Winograd Layer with Regular Mask Pruning
Pub Date: 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859539
Ziran Qin, Huanyu He, Weiyao Lin
The high computational cost and large number of parameters of 3D Convolutional Neural Networks (CNNs) limit the deployment of 3D CNN-based models on lightweight devices with weak computing ability. Applying the Winograd layer, which combines the ideas of pruning and the Winograd algorithm, to 3D convolution is a promising solution. However, the irregular, unstructured sparsity of the 3D Winograd layer makes efficient compression difficult. In this paper, we propose Regular Mask Pruning to obtain a regularly sparse 3D Winograd layer that can be easily compressed. Furthermore, we present the Masked 3D Winograd Layer, which stores the compressed parameters of the regularly sparse Winograd layer and reduces the number of multiplications at inference. For a C3D-based Winograd model, we obtain a regular sparsity of 38.7% without accuracy loss. With our method, the parameters of Winograd layers and the multiplication operations of 3D convolution can be reduced by up to 5.32× and 18×, respectively.
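To make the pruning idea concrete: in a Winograd layer, multiplications occur only in the element-wise product in the transform domain, so zeroing transformed-filter entries with a mask directly removes multiplications. Below is a minimal NumPy sketch of a masked 3D Winograd convolution for a single F(2×2×2, 3×3×3) tile; the transform matrices are the standard F(2,3) ones, while the random binary mask is only a stand-in for the paper's learned regular mask.

```python
import numpy as np

# Standard 1D F(2,3) Winograd transform matrices, applied separably
# along each of the three axes for the 3D case.
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=np.float64)
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]], dtype=np.float64)
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=np.float64)

def transform3d(x, m):
    """Apply matrix m along each of the three axes of x."""
    for axis in range(3):
        x = np.moveaxis(np.tensordot(m, x, axes=([1], [axis])), 0, axis)
    return x

def masked_winograd3d(tile, kernel, mask):
    """One 4x4x4 input tile -> 2x2x2 output, with a binary mask pruning
    Winograd-domain filter entries (each zero saves one multiplication)."""
    U = transform3d(kernel, G) * mask   # transformed, pruned 4x4x4 filter
    V = transform3d(tile, BT)           # transformed 4x4x4 input
    M = U * V                           # the only multiplications in the layer
    return transform3d(M, AT)           # inverse transform -> 2x2x2 output

rng = np.random.default_rng(0)
tile = rng.standard_normal((4, 4, 4))
kernel = rng.standard_normal((3, 3, 3))
mask = (rng.random((4, 4, 4)) > 0.387).astype(np.float64)  # ~38.7% pruned
print(masked_winograd3d(tile, kernel, mask).shape)  # (2, 2, 2)
```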
Citations: 1
Superpixel-Based Optimization for Point Cloud Reconstruction from Light Field
Pub Date: 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859491
Heming Zhao, Yu Liu, Linhui Wei, Yumei Wang
As one of the most popular immersive data formats, the point cloud has attracted attention for its flexibility and simplicity. The multi-view nature of the light field captures rich scene information and can be effectively applied to 3D point cloud reconstruction. However, existing point cloud reconstruction methods based on light field depth estimation often produce outlier points at the edges of objects, which greatly disturbs the visual effect. In this paper, we propose a superpixel-based optimization scheme for point clouds reconstructed from light fields. We use a superpixel-based edge detection algorithm and a purpose-designed joint bilateral filter to optimize the blurred edges of the depth map obtained from EPI-based depth estimation. After point cloud generation, the remaining residual outlier points are removed by a statistical outlier filter. Experimental results show that the proposed method improves the level of detail (LoD) by at least 25.8% compared with several state-of-the-art methods on real-world and synthetic light field datasets. In addition, the proposed method can reliably restore outlier points while retaining the sharp features of the point cloud.
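As a small illustration of the final clean-up stage only (not the superpixel edge optimization), the sketch below removes residual outliers from a toy point cloud with Open3D's statistical outlier filter; the neighbor count and standard-deviation ratio are assumed values, not the paper's settings.

```python
import numpy as np
import open3d as o3d

# Toy cloud: a flat patch plus a few synthetic stray points standing in
# for the edge artifacts that depth-estimation errors produce.
rng = np.random.default_rng(1)
plane = np.c_[rng.uniform(0, 1, (500, 2)), np.zeros(500)]
strays = rng.uniform(-1, 2, (10, 3))
pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(np.vstack([plane, strays]))

# Drop points whose mean distance to their nb_neighbors nearest neighbors
# exceeds std_ratio standard deviations of the global mean distance.
filtered, inlier_idx = pcd.remove_statistical_outlier(nb_neighbors=20,
                                                      std_ratio=2.0)
print(len(pcd.points), "->", len(filtered.points))
```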
Citations: 0
High Performant AV1 for VOD Applications
Pub Date: 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859523
Yunqing Wang, Chi-Yo Tsai, Jingning Han, Yaowu Xu
The AV1 video compression format was developed by the Alliance for Open Media (AOMedia) industry consortium and achieves more than a 30% reduction in bit rate compared to its predecessor VP9 at the same decoded video quality. However, the encoder complexity of AV1 is much higher than that of VP9. In this paper, we discuss the optimization technologies used to reduce AV1 encoder complexity to the level of the VP9 encoder while still achieving 22% bit rate savings. The optimized libaom AV1 encoder offers a superb solution for video-on-demand (VOD) applications, reducing encoding cost and generating huge bandwidth and storage savings.
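For readers who want to explore the speed/efficiency trade-off themselves, the snippet below drives the libaom encoder through ffmpeg from Python; the cpu-used preset, CRF value, and file names are placeholder assumptions, and this is of course not the internal tuning described in the paper.

```python
import subprocess

def encode_av1(src, dst, crf=30, cpu_used=4):
    """Constant-quality AV1 encode via ffmpeg's libaom-av1 wrapper.
    cpu-used trades encoder complexity for compression efficiency
    (0 = slowest/best, larger = faster); -b:v 0 enables CRF mode."""
    cmd = [
        "ffmpeg", "-y", "-i", src,
        "-c:v", "libaom-av1",
        "-crf", str(crf), "-b:v", "0",
        "-cpu-used", str(cpu_used),
        "-row-mt", "1",   # row-based multithreading
        dst,
    ]
    subprocess.run(cmd, check=True)

encode_av1("input.mp4", "output.webm")  # placeholder file names
```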
Citations: 0
Interaction Guided Hand-Held Object Detection
Pub Date: 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859337
Kaiyuan Dong, Yuang Zhang, Aixin Zhang
General object detection methods detect objects independently, without attending to the relationships between them. Detecting hands and hand-held objects independently ignores the hand-object relation, so the detector does not achieve optimal performance in scenes with such associated objects. Hand-held object detection is an important topic with wide applications, but little work has been done to model this specific scenario. In this paper, we use information about the interaction between the hand and the hand-held object, as well as the causal cues a human relies on when observing a hand-held object, to build a novel hand-held object detection model. Experimental results show that our method greatly improves hand-held object detection performance.
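The paper's learned model is not reproduced here, but the underlying intuition (letting hand-object interaction inform object detection) can be sketched with a simple heuristic: boost the confidence of object detections that overlap a detected hand. The boost weight and box coordinates below are illustrative assumptions.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda t: (t[2] - t[0]) * (t[3] - t[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def rescore_held_objects(hand_boxes, detections, boost=0.3):
    """Raise the score of object detections that overlap a hand, so
    likely hand-held objects survive confidence thresholding."""
    rescored = []
    for box, score in detections:
        contact = max((iou(box, h) for h in hand_boxes), default=0.0)
        rescored.append((box, min(1.0, score + boost * contact)))
    return rescored

hands = [(100, 100, 160, 170)]
dets = [((130, 120, 220, 200), 0.42),   # touches the hand -> boosted
        ((400, 50, 480, 120), 0.42)]    # far from any hand -> unchanged
print(rescore_held_objects(hands, dets))
```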
Citations: 1
Multi-Attribute Joint Point Cloud Super-Resolution with Adversarial Feature Graph Networks
Pub Date: 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859457
Yichen Zhou, Xinfeng Zhang, Shanshe Wang, Lin Li
3D point cloud super-resolution (PCSR), which infers a dense geometric shape from a sparse one, plays an important role in many applications. However, existing PCSR methods leverage only geometric properties to predict dense geometric coordinates, without considering the importance of correlated attributes for predicting complex geometric structures. In this paper, we propose a novel PCSR network that leverages color attributes to improve the reconstruction quality of the dense geometric shape. In the proposed network, we use graph convolutions to obtain a cross-domain structure representation of the point cloud from both geometric coordinates and color attributes, constructed by aggregating local points based on the similarity of cross-domain features. Furthermore, we propose a shape-aware loss function to guide network training, which constrains point cloud generation in both overall and detailed aspects. Extensive experimental results show that our proposed method outperforms state-of-the-art methods in both objective and subjective quality.
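A minimal sketch of the cross-domain neighborhood construction described above: build a kNN graph over features that concatenate normalized coordinates with colors, so that graph neighbors are close in both geometry and appearance. The blending weight alpha and the normalization are assumptions, not the paper's exact formulation.

```python
import numpy as np
from scipy.spatial import cKDTree

def cross_domain_knn(xyz, rgb, k=8, alpha=0.5):
    """kNN graph in a joint geometry+color feature space.
    Returns an (n, k) array of neighbor indices per point."""
    g = (xyz - xyz.mean(0)) / (xyz.std(0) + 1e-9)  # normalized coordinates
    c = rgb.astype(np.float64) / 255.0             # colors scaled to [0, 1]
    feats = np.hstack([alpha * g, (1.0 - alpha) * c])
    _, idx = cKDTree(feats).query(feats, k=k + 1)
    return idx[:, 1:]  # column 0 is each point's self-match; drop it

rng = np.random.default_rng(2)
xyz = rng.standard_normal((1000, 3))
rgb = rng.integers(0, 256, (1000, 3))
print(cross_domain_knn(xyz, rgb).shape)  # (1000, 8)
```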
Citations: 1
LFC-SASR: Light Field Coding Using Spatial and Angular Super-Resolution
Pub Date: 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859373
Ekrem Çetinkaya, Hadi Amirpour, C. Timmerer
Light field imaging enables post-capture actions such as refocusing and changing the viewing perspective by capturing both spatial and angular information. However, capturing this richer description of the 3D scene produces a huge amount of data. To improve the compression efficiency of existing light field compression methods, we investigate the impact of light field super-resolution approaches (both spatial and angular) on compression efficiency. To this end, we first downscale light field images in (i) spatial resolution, (ii) angular resolution, and (iii) spatial-angular resolution, and encode them using Versatile Video Coding (VVC). We then apply a set of light field super-resolution deep neural networks to reconstruct the light field images at their full spatial-angular resolution and compare their compression efficiency. Experimental results show that, compared to encoding the light field image at high resolution, encoding the low-angular-resolution light field image and applying angular super-resolution yields bitrate savings of 51.16% and 53.41% while maintaining the same PSNR and SSIM, respectively.
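The three downscaling configurations are easy to picture on a light field stored as a 5D array of (angular rows, angular columns, height, width, channels). The strided subsampling below is only illustrative; the actual pipeline encodes the downscaled views with VVC and restores them with super-resolution networks.

```python
import numpy as np

# A 9x9-view light field with 512x512 RGB sub-aperture images.
lf = np.zeros((9, 9, 512, 512, 3), dtype=np.uint8)

# (i)   spatial downscaling: halve each view (nearest-neighbor for brevity).
lf_spatial = lf[:, :, ::2, ::2, :]   # -> (9, 9, 256, 256, 3)

# (ii)  angular downscaling: keep every other view of the grid.
lf_angular = lf[::2, ::2]            # -> (5, 5, 512, 512, 3)

# (iii) joint spatial-angular downscaling.
lf_both = lf[::2, ::2, ::2, ::2, :]  # -> (5, 5, 256, 256, 3)

print(lf_spatial.shape, lf_angular.shape, lf_both.shape)
```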
Citations: 1
Detic-Track: Robust Detection and Tracking of Objects in Video
Pub Date: 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859437
Hannes Fassold
The automatic detection and tracking of objects in a video is crucial for many video understanding tasks. We propose a novel deep-learning-based algorithm for object detection and tracking that detects more than 1,000 object classes and tracks them robustly, even on challenging content. The robustness of the tracking comes from the use of optical flow information. Additionally, we use only the part of the bounding box corresponding to the object shape for tracking.
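A minimal sketch of flow-guided box propagation: shift a box into the next frame by the median dense optical flow inside it. Farneback flow merely stands in for whatever flow method the authors use, and unlike the paper this sketch uses the whole box rather than only the object-shape region.

```python
import numpy as np
import cv2

def propagate_box(prev_gray, next_gray, box):
    """Move an (x1, y1, x2, y2) box to the next frame using the median
    dense optical flow inside the box region."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    x1, y1, x2, y2 = map(int, box)
    patch = flow[y1:y2, x1:x2]          # (h, w, 2) flow vectors in the box
    dx = float(np.median(patch[..., 0]))
    dy = float(np.median(patch[..., 1]))
    return (x1 + dx, y1 + dy, x2 + dx, y2 + dy)

prev_gray = np.zeros((240, 320), dtype=np.uint8)  # placeholder frames
next_gray = np.zeros((240, 320), dtype=np.uint8)
print(propagate_box(prev_gray, next_gray, (50, 60, 120, 140)))
```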
Citations: 0
Decide the Next Pitch: A Pitch Prediction Model Using Attention-Based LSTM
Pub Date: 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859411
Chih-Chang Yu, Chih-Ching Chang, Hsu-Yung Cheng
Information collection and analysis play a very important role in high-level baseball competition. Knowing an opponent's likely strategies or weaknesses helps a team plan adequate countermeasures. The purpose of this study is to explore how artificial intelligence technology can be applied to this domain. This study focuses on pitching events in baseball: the goal is to predict the pitch type a pitcher may throw next, given the situation on the field. To achieve this, we mine discriminative features from baseball statistics and propose a stacked long short-term memory (LSTM) model with an attention mechanism. The experimental data comprise the pitching records of 201 Major League Baseball pitchers from 2016 to 2021. Using pitchers' pitching statistics and on-field situations, the model reaches an average accuracy of 76.7%, outperforming conventional machine learning prediction models.
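A stacked LSTM with additive attention of the kind described can be sketched in a few lines of PyTorch; the feature dimension, window length, and number of pitch types below are placeholder assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class AttnLSTM(nn.Module):
    """Stacked LSTM over a window of past pitches; additive attention
    pools the hidden states before classifying the next pitch type."""
    def __init__(self, n_feats, n_pitch_types, hidden=64, layers=2):
        super().__init__()
        self.lstm = nn.LSTM(n_feats, hidden, num_layers=layers,
                            batch_first=True)
        self.attn = nn.Linear(hidden, 1)        # one score per time step
        self.head = nn.Linear(hidden, n_pitch_types)

    def forward(self, x):                       # x: (batch, seq, n_feats)
        h, _ = self.lstm(x)                     # (batch, seq, hidden)
        w = torch.softmax(self.attn(h), dim=1)  # attention weights over time
        ctx = (w * h).sum(dim=1)                # attention-weighted summary
        return self.head(ctx)                   # logits over pitch types

model = AttnLSTM(n_feats=12, n_pitch_types=6)   # placeholder dimensions
x = torch.randn(8, 10, 12)  # 8 sequences of the last 10 pitches
print(model(x).shape)       # torch.Size([8, 6])
```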
Citations: 1
Kinship Face Synthesis Evaluation Website with Gamified Mechanism
Pub Date: 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859503
Hsiu-Chieh Lee, Che-Hsien Lin, Li-Chen Cheng, S. Hsu, Jun-Cheng Chen, Chih-Yu Wang
Kinship face synthesis has drawn significant attention recently. Nevertheless, evaluating the generated images is challenging due to the lack of ground truths and objective metrics. In this paper, we propose an interactive website for evaluating kinship synthesis models. The website not only showcases the results of the model but also collects feedback data from users for analysis and model improvement. Users can 1) build their own kinship dataset and generate kinship images with our models, and 2) play games distinguishing real child face images from synthesized ones. Through the incentives created by the gamified rating mechanism, we expect the collected feedback data to be more promising.
Citations: 0
A Visually-Aware Food Analysis System for Diet Management
Pub Date: 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859471
Hang Wu, Xi Chen, Xuelong Li, Haokai Ma, Yuze Zheng, Xiangxian Li, Xiangxu Meng, Lei Meng
This demo presents a visually-aware food analysis (VAFA) system for socially-engaged diet management. VAFA accepts multimedia inputs, such as images of food with or without a description, to record a user's daily diet. This information is passed to AI algorithms for food classification, ingredient recognition, and nutrition analysis, producing a nutrition report for the user. Moreover, VAFA profiles users' eating habits to make personalized recipe recommendations and to identify social communities with similar eating preferences. VAFA is powered by state-of-the-art AI algorithms and a large-scale dataset with 300K users, 400K recipes, and over 10M user-recipe interactions.
Citations: 3