The 2022 low-power deep learning semantic segmentation model compression competition for traffic scene in Asian countries, held as an IEEE ICME2022 Grand Challenge, focuses on semantic segmentation technologies for autonomous driving scenarios. The competition aims to semantically segment traffic objects with low power consumption and high mean intersection over union (mIOU) in Asian countries (e.g., Taiwan), which present several harsh driving environments. The target segmented objects include dashed white line, dashed yellow line, single white line, single yellow line, double dashed white line, double white line, double yellow line, main lane, and alter lane. A total of 35,500 annotated images, adapted from Berkeley Deep Drive 100K, are provided for model training, together with 130 annotated example images of Asian road conditions. An additional 2,012 testing images are used in the evaluation process, of which 1,200 are used in the qualification stage and the remaining 812 in the final stage. In total, 203 teams registered for the competition; the 15 teams with the highest mIOU entered the final stage, and 8 of them submitted final results. The overall best model belongs to team “okt2077”, followed by team “asdggg” and team “AVCLab.” The special award for the best INT8 model development was not awarded.
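Both competition stages rank submissions by mean intersection over union. As a rough illustration of that metric only (not competition code), the sketch below averages per-class IoU over the nine lane and line classes; the prediction and label maps are random toy placeholders.

```python
# Toy illustration of mIOU: average per-class IoU, skipping classes absent
# from both the prediction and the label. The maps below are random placeholders.
import numpy as np

def mean_iou(pred, label, num_classes):
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, label == c).sum()
        union = np.logical_or(pred == c, label == c).sum()
        if union > 0:                     # class appears in prediction or label
            ious.append(inter / union)
    return float(np.mean(ious))

pred = np.random.randint(0, 9, size=(512, 1024))     # 9 classes, toy resolution
label = np.random.randint(0, 9, size=(512, 1024))
print(mean_iou(pred, label, num_classes=9))
```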
{"title":"Summary of the 2022 Low-Power Deep Learning Semantic Segmentation Model Compression Competition for Traffic Scene In Asian Countries","authors":"Yu-Shu Ni, Chia-Chi Tsai, Chih-Cheng Chen, Po-Yu Chen, Hsien-Kai Kuo, Man-Yu Lee, Kuo Chin-Chuan, Zhe-Ln Hu, Po-Chi Hu, Ted T. Kuo, Jenq-Neng Hwang, Jiun-In Guo","doi":"10.1109/ICMEW56448.2022.9859367","DOIUrl":"https://doi.org/10.1109/ICMEW56448.2022.9859367","url":null,"abstract":"The 2022 low-power deep learning semantic segmentation model compression competition for traffic scene in Asian countries held in IEEE ICME2022 Grand Challenges focuses on the semantic segmentation technologies in autonomous driving scenarios. The competition aims to semantically segment objects in traffic with low power and high mean intersection over union (mIOU) in the Asia countries (e.g., Taiwan), which contain several harsh driving environments. The target segmented objects include dashed white line, dashed yellow line, single white line, single yellow line, double dashed white line, double white line, double yellow line, main lane, and alter lane. There are 35,500 annotated images provided for model training revised from Berkeley Deep Drive 100K and 130 annotated images provided for example from Asian road conditions. Additional 2,012 testing images are used in the contest evaluation process, in which 1,200 of them are used in the qualification stage competition, and the rest are used in the final stage competition. There are in total 203 registered teams joining this competition, and the top 15 teams with the highest mIOU entered the final stage competition, from which 8 teams submitted the final results. The overall best model belongs to team “okt2077”, followed by team “asdggg” and team “AVCLab.” A special award for the best INT8 model development award is absent.","PeriodicalId":106759,"journal":{"name":"2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121533987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Local to Global Transformer for Video Based 3d Human Pose Estimation
Pub Date: 2022-07-18 | DOI: 10.1109/ICMEW56448.2022.9859482
Haifeng Ma, Ke Lu, Jian Xue, Zehai Niu, Pengcheng Gao
Transformer-based architectures have achieved great results in sequence-to-sequence tasks and in vision tasks including 3D human pose estimation. However, transformer-based 3D human pose estimation methods are not as strong as RNNs and CNNs at capturing local information, and local information plays a major role in recovering 3D positional relationships. In this paper, we propose a method that combines local human body parts and global skeleton joints using a temporal transformer to finely track the temporal motion of human body parts. First, we encode positional and temporal information; then we use a local-to-global temporal transformer to obtain local and global information; finally, we obtain the target 3D human pose. To evaluate the effectiveness of our method, we evaluate it quantitatively and qualitatively on two popular standard benchmark datasets: Human3.6M and HumanEva-I. Extensive experiments demonstrate that we achieve state-of-the-art performance on Human3.6M with 2D ground truth as input.
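As a rough illustration of the global temporal-attention part of such a pipeline (the paper's local body-part branch is omitted), the following minimal PyTorch sketch lifts a window of 2D keypoints to a centre-frame 3D pose. The joint count, window length, and layer sizes are illustrative assumptions, not the authors' settings.

```python
# Minimal temporal-transformer sketch (not the authors' architecture): lift a
# window of 2D keypoints to the 3D pose of the centre frame.
import torch
import torch.nn as nn

class TemporalPoseTransformer(nn.Module):
    def __init__(self, num_joints=17, seq_len=27, d_model=128, nhead=8, num_layers=4):
        super().__init__()
        self.embed = nn.Linear(num_joints * 2, d_model)                 # per-frame 2D pose -> token
        self.pos_emb = nn.Parameter(torch.zeros(1, seq_len, d_model))   # learned temporal positions
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(d_model, num_joints * 3)                  # regress the 3D pose

    def forward(self, kpts_2d):                       # kpts_2d: (B, T, J, 2)
        b, t, j, _ = kpts_2d.shape
        x = self.embed(kpts_2d.reshape(b, t, j * 2)) + self.pos_emb[:, :t]
        x = self.encoder(x)                           # global attention over the window
        return self.head(x[:, t // 2]).reshape(b, j, 3)   # centre-frame prediction

model = TemporalPoseTransformer()
pose_3d = model(torch.randn(2, 27, 17, 2))            # -> (2, 17, 3)
```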
{"title":"Local to Global Transformer for Video Based 3d Human Pose Estimation","authors":"Haifeng Ma, Ke Lu, Jian Xue, Zehai Niu, Pengcheng Gao","doi":"10.1109/ICMEW56448.2022.9859482","DOIUrl":"https://doi.org/10.1109/ICMEW56448.2022.9859482","url":null,"abstract":"Transformer-based architecture has achieved great results in sequence to sequence tasks and vision tasks including 3D human pose estimation. However, transformer based 3D human pose estimation method is not as strong as RNN and CNN in terms of local information acquisition. Additionally, local information plays a major role in obtaining 3D positional relationships. In this paper, we propose a method that combines local human body parts and global skeleton joints using a temporal transformer to finely track the temporal motion of human body parts. First, we encode positional and temporal information, then we use a local to global temporal transformer to obtain local and global information, and finally we obtain the target 3D human pose. To evaluate the effectiveness of our method, we quantitatively and qualitatively evaluated our method on two popular and standard benchmark datasets: Human3.6M and HumanEva-I. Extensive experiments demonstrated that we achieved state-of-the-art performance on Human3.6M with 2D ground truth as input.","PeriodicalId":106759,"journal":{"name":"2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127352885","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Watermarking Protocol for Deep Neural Network Ownership Regulation in Federated Learning
Pub Date: 2022-07-18 | DOI: 10.1109/ICMEW56448.2022.9859395
Fangqi Li, Shilin Wang, Alan Wee-Chung Liew
With the wide application of deep learning models, it is important to verify an author’s ownership of a deep neural network model through watermarks and to protect the model. The development of distributed learning paradigms such as federated learning raises new challenges for model protection: each author should be able to conduct independent verification and trace traitors. To meet these requirements, we propose a watermarking protocol, Merkle-Sign, that meets the prerequisites for ownership verification in federated learning. Our work paves the way for generalizing watermarking as a practical security mechanism for protecting deep learning models on distributed learning platforms.
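The abstract names the protocol Merkle-Sign but gives no construction details. The sketch below only illustrates the generic Merkle-tree commitment such a protocol could build on: hypothetical per-client watermark keys are hashed into a single root the aggregator can publish, so each client can later prove inclusion. It is not the paper's protocol.

```python
# Generic Merkle-root commitment over hypothetical per-client watermark keys
# (an illustration of the building block, not the paper's Merkle-Sign protocol).
import hashlib

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    level = [_h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:                            # duplicate the last node on odd levels
            level.append(level[-1])
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

client_keys = [b"client-0-key", b"client-1-key", b"client-2-key"]   # placeholder keys
print(merkle_root(client_keys).hex())                 # root the aggregator would publish
```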
{"title":"Watermarking Protocol for Deep Neural Network Ownership Regulation in Federated Learning","authors":"Fangqi Li, Shilin Wang, Alan Wee-Chung Liew","doi":"10.1109/ICMEW56448.2022.9859395","DOIUrl":"https://doi.org/10.1109/ICMEW56448.2022.9859395","url":null,"abstract":"With the wide application of deep learning models, it is important to verify an author’s possession over a deep neural network model by watermarks and protect the model. The development of distributed learning paradigms such as federated learning raises new challenges for model protection. Each author should be able to conduct independent verification and trace traitors. To meet those requirements, we propose a watermarking protocol, Merkle-Sign to meet the prerequisites for ownership verification in federated learning. Our work paves the way for generalizing watermark as a practical security mechanism for protecting deep learning models in distributed learning platforms.","PeriodicalId":106759,"journal":{"name":"2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123474391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Exploring Multisensory Feedback for Virtual Reality Relaxation
Pub Date: 2022-07-18 | DOI: 10.1109/ICMEW56448.2022.9859362
Jing-Yuan Huang, Grace Theodore, You-Shin Tsai, Jerry Chin-Han Goh, Mu-Hang Lin, Kuan-Wei Tseng, Y. Hung
Multisensory experience gives Virtual Reality (VR) great potential for reducing stress. We explore four senses, namely sight, hearing, smell, and touch, that can promote relaxation in VR. In particular, we construct an immersive virtual scene that is combined with self-familiar vocal guidance, precisely delivered scent, and a haptic breathing stuffed animal to provide visual, auditory, olfactory, and tactile feedback in VR. Each component in our system achieves high fidelity so that, when integrated, the user can enjoy an effective relaxation experience.
{"title":"Exploring Multisensory Feedback for Virtual Reality Relaxation","authors":"Jing-Yuan Huang, Grace Theodore, You-Shin Tsai, Jerry Chin-Han Goh, Mu-Hang Lin, Kuan-Wei Tseng, Y. Hung","doi":"10.1109/ICMEW56448.2022.9859362","DOIUrl":"https://doi.org/10.1109/ICMEW56448.2022.9859362","url":null,"abstract":"Multisensory experience enables Virtual Reality (VR) to have a great potential to reduce stress. We explore four different senses, including sight, hearing, smell, and touch, that can promote relaxation in VR. In particular, we construct an immersive virtual scene, which is combined with selffamiliar vocal guidance, precisely-delivered scent, and a haptic breathing stuffed animal, to provide visual, auditory, olfactory, and tactile feedback in VR. Each component in our system achieves high fidelity so that, when integrated, the user can enjoy an effective relaxation experience.","PeriodicalId":106759,"journal":{"name":"2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115641914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Emotional Quality Evaluation for Generated Music Based on Emotion Recognition Model
Pub Date: 2022-07-18 | DOI: 10.1109/ICMEW56448.2022.9859459
Hongfei Wang, Wei Zhong, Lin Ma, Long Ye, Qin Zhang
In the field of musical emotion evaluation, existing methods usually rely on subjective experiments, which place high demands on the experimental environment and lack a unified evaluation standard. This paper proposes an emotional quality evaluation method for generated music from the perspective of music emotion recognition. In the proposed method, we analyze the correlation between audio features and the emotion category of music, and choose MFCC and the Mel spectrum as the most significant audio features. An emotion recognition model is then constructed based on a residual convolutional network to predict the emotion category of generated music. In the experiments, we apply the proposed model to evaluate the emotional quality of generated music. The experimental results show that our model achieves higher recognition accuracy and thus exhibits strong reliability for the objective emotional quality evaluation of generated music.
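As a concrete illustration of the chosen input features, the sketch below extracts a Mel spectrum and MFCCs with librosa from a synthetic tone; the signal, sampling rate, and feature sizes are placeholders rather than the paper's settings.

```python
# Feature-extraction sketch with librosa on a synthetic 440 Hz tone; the clip,
# sample rate and feature dimensions are illustrative placeholders.
import numpy as np
import librosa

sr = 22050
t = np.linspace(0, 2.0, int(sr * 2.0), endpoint=False)
y = 0.5 * np.sin(2 * np.pi * 440.0 * t)                           # stand-in audio clip

mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)       # (128, frames)
log_mel = librosa.power_to_db(mel)                                 # log-scaled Mel spectrum
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)                 # (20, frames)
features = np.concatenate([log_mel, mfcc], axis=0)                 # stacked model input
print(features.shape)
```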
{"title":"Emotional Quality Evaluation for Generated Music Based on Emotion Recognition Model","authors":"Hongfei Wang, Wei Zhong, Lin Ma, Long Ye, Qin Zhang","doi":"10.1109/ICMEW56448.2022.9859459","DOIUrl":"https://doi.org/10.1109/ICMEW56448.2022.9859459","url":null,"abstract":"In the field of musical emotion evaluation, the existing methods usually use subjective experiments, which are demanding on the experimental environment and lack of unified evaluation standard. This paper proposes an emotional quality evaluation method for generated music from the perspective of music emotion recognition. In the proposed method, we analyze the correlation between audio features and emotion category of music, and choose MFCC and Mel spectrum as the most significant audio features. And then the emotion recognition model is constructed based on residual convolutional network to predict the emotion category of generated music. In the experiments, we apply the proposed model to evaluate the emotional quality of generated music. The experimental results show that our model can achieve higher recognition accuracy and thus exhibits strong reliability for the objective emotional quality evaluation of generated music.","PeriodicalId":106759,"journal":{"name":"2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"177 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114422260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Conditional Sentence Rephrasing without Parallel Training Corpus
Pub Date: 2022-07-18 | DOI: 10.1109/ICMEW56448.2022.9859385
Yen-Ting Lee, Cheng-te Li, Shou-De Lin
This paper aims to rephrase a sentence under a given condition: the generated sentence should be similar to the original sentence and satisfy the given condition, without a parallel training corpus. We propose a conditional sentence VAE (CS-VAE) model to solve the task. CS-VAE is trained as an autoencoder, with the condition controlling the generated sentence while the semantics are preserved. Experiments demonstrate that CS-VAE effectively solves the task and produces high-quality sentences.
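The abstract gives no architectural detail, so the sketch below only shows the generic conditional-VAE objective such a model optimizes, applied to fixed-size sentence vectors for brevity (the actual CS-VAE works on token sequences); the dimensions and condition set are assumptions.

```python
# Generic conditional-VAE objective on fixed-size vectors (an illustration of the
# idea only, not the paper's CS-VAE): reconstruct the input while the decoder is
# also fed a controllable condition.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CondVAE(nn.Module):
    def __init__(self, dim=256, z_dim=32, n_cond=4):
        super().__init__()
        self.enc = nn.Linear(dim, 2 * z_dim)          # predicts mean and log-variance
        self.dec = nn.Linear(z_dim + n_cond, dim)     # latent + condition -> reconstruction

    def forward(self, x, cond_onehot):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterisation trick
        recon = self.dec(torch.cat([z, cond_onehot], dim=-1))
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return F.mse_loss(recon, x) + kl              # reconstruction + KL terms

model = CondVAE()
x = torch.randn(8, 256)                               # placeholder sentence embeddings
cond = F.one_hot(torch.randint(0, 4, (8,)), 4).float()   # placeholder target conditions
model(x, cond).backward()
```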
{"title":"Conditional Sentence Rephrasing without Parallel Training Corpus","authors":"Yen-Ting Lee, Cheng-te Li, Shou-De Lin","doi":"10.1109/ICMEW56448.2022.9859385","DOIUrl":"https://doi.org/10.1109/ICMEW56448.2022.9859385","url":null,"abstract":"This paper aims to rephrase a sentence with a given condition, and the generated sentence should be similar to the origin sentence and satisfy the given condition without parallel training corpus. We propose a conditional sentence VAE (CS-VAE) model to solve the task. CS-VAE is trained as an autoencoder, along with the condition control on the generated sentence with the same semantics. With the experimental demonstration supported, CS-VAE is proven to effectively solve the task with high-quality sentences.","PeriodicalId":106759,"journal":{"name":"2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123587384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Music Question Answering: Cognize and Perceive Music
Pub Date: 2022-07-18 | DOI: 10.1109/ICMEW56448.2022.9859499
Wenhao Gao, Xiaobing Li, Cong Jin, Tie Yun
Music analysis and understanding have always been the work of professionals. In order to help ordinary people cognize and perceive music, we put forward the Music Question Answering task in this paper. The goal of this task is to provide accurate answers given music and related questions. To this end, we built the MQA dataset based on MagnaTagATune, which contains seven basic categories. According to the main source of the questions, all questions are divided into basic questions and depth questions. We tested several models and analyzed the experimental results. The best model, Musicnn-MALiMo (Spectrogram, i=4), obtained 71.13% accuracy.
{"title":"Music Question Answering:Cognize and Perceive Music","authors":"Wenhao Gao, Xiaobing Li, Cong Jin, Tie Yun","doi":"10.1109/ICMEW56448.2022.9859499","DOIUrl":"https://doi.org/10.1109/ICMEW56448.2022.9859499","url":null,"abstract":"Music analysis and understanding has always been the work of professionals. In order to help ordinary people congnize and perceive music, we put forward the Music Question Answering task in this paper. The goal of this task is to provide accurate answers given music and related questions. To this end, we made MQAdataset based on MagnaTagATune, which contains seven basic categories. According to the main source of the questions, all questions are divided into basic questions and depth questions. We tested on several models and analyzed the experimental results. The best model, Musicnn-MALiMo (Spectrogram,i=4), obtained 71.13% accuracy.","PeriodicalId":106759,"journal":{"name":"2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125573174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Augmented-Training-Aware Bisenet for Real-Time Semantic Segmentation
Pub Date: 2022-07-18 | DOI: 10.1109/ICMEW56448.2022.9859497
Chih-Chung Hsu, Cheih Lee, Shen-Chieh Tai, Yun Jiang
Semantic segmentation techniques have become an attractive research field for autonomous driving. However, it is well known that the computational complexity of conventional semantic segmentation is relatively high compared to other computer vision applications, so fast inference for autonomous driving is highly desired. A lightweight convolutional neural network, the Bilateral Segmentation Network (BiSeNet), is adopted in this paper. However, the performance of the conventional BiSeNet is not reliable enough, and model quantization can degrade it further. Therefore, we propose an augmented training strategy to significantly improve performance on the semantic segmentation task. First, heavy data augmentation, including CutOut, deformable distortion, and step-wise hard example mining, is used in the training phase to boost feature representation learning. Second, L1 and L2 norm regularization are used during training to prevent possible overfitting. Then, post-training quantization is performed on the TensorFlow Lite model to significantly reduce the model size and computational complexity. Comprehensive experiments verify that the proposed method is effective and efficient for autonomous driving applications compared with other state-of-the-art methods.
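As an illustration of the post-training quantization step, the sketch below runs a Keras model through the standard TensorFlow Lite converter with default optimizations; the placeholder model, input resolution, and file name are assumptions, not the authors' BiSeNet or settings, and full INT8 conversion would additionally require a representative dataset.

```python
# Post-training quantization sketch with the TensorFlow Lite converter; `model`
# is a tiny placeholder standing in for the trained segmentation network.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(8, 3, padding="same", input_shape=(512, 1024, 3)),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]   # enable post-training quantization
tflite_model = converter.convert()                     # smaller, faster flatbuffer model

with open("segmenter_quant.tflite", "wb") as f:
    f.write(tflite_model)
```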
{"title":"Augmented-Training-Aware Bisenet for Real-Time Semantic Segmentation","authors":"Chih-Chung Hsu, Cheih Lee, Shen-Chieh Tai, Yun Jiang","doi":"10.1109/ICMEW56448.2022.9859497","DOIUrl":"https://doi.org/10.1109/ICMEW56448.2022.9859497","url":null,"abstract":"Semantic segmentation techniques have become an attractive research field for autonomous driving. However, it is well-known that the computational complexity of the conventional semantic segmentation is relatively high compared to other computer vision applications. Fast inference of the semantic segmentation for autonomous driving is highly desired. A lightweight convolutional neural network, the Bilateral segmentation network (BiSeNet), is adopted in this paper. However, the performance of the conventional BiSeNet is not so reliable that the model quantization could lead to an even worse result. Therefore, we proposed an augmented training strategy to significantly improve the semantic segmentation task’s performance. First, heavy data augmentation, including CutOut, deformable distortion, and step-wise hard example mining, is used in the training phase to boost the performance of the feature representation learning. Second, the L1 and L2 norm regularization are also used in the model training to prevent the possible overfitting issue. Then, the post-quantization is performed on the TensorFlow-Lite model to significantly reduce the model size and computational complexity. The comprehensive experiments verified that the proposed method is effective and efficient for autonomous driving applications over other state-of-the-art methods.","PeriodicalId":106759,"journal":{"name":"2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128703239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Unified Video Summarization for Video Anomalies Through Deep Learning
Pub Date: 2022-07-18 | DOI: 10.1109/ICMEW56448.2022.9859320
K. Muchtar, Muhammad Rizky Munggaran, Adhiguna Mahendra, Khairul Anwar, Chih-Yang Lin
Over the last ten years, integrated video surveillance systems have become increasingly important in protecting public safety. Because a single surveillance camera continuously records events in a specific field of view at all times of day and night, a system is required that can create a summary concisely capturing the key elements of the incoming frames. More specifically, due to time constraints, the enormous amount of video footage cannot be properly examined for analysis. As a result, it is vital to compile a summary of what happened in the scene and to look for anomalous events in the footage. A unified approach for detecting and summarizing anomalous events is proposed. To detect events and compute anomaly scores, a 3D deep learning approach is used. Afterward, the scores are utilized to visualize and localize the anomalous regions. Finally, a blob analysis technique is used to extract the anomalous regions. To verify the results, quantitative and qualitative evaluations are provided. Experiments indicate that the proposed summarization method keeps crucial information while producing competitive results. More qualitative results can be found through our project channel: https://youtu.be/eMPMjiGlCQI
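As an illustration of the final localization step, the sketch below thresholds a per-frame anomaly score map and extracts connected blobs with SciPy; the score map, threshold, and minimum blob area are made-up values, not those used in the paper.

```python
# Threshold an anomaly score map and extract blobs via connected components;
# all numbers here are illustrative placeholders.
import numpy as np
from scipy import ndimage

score_map = np.random.rand(240, 320)                  # per-pixel anomaly scores for one frame
mask = score_map > 0.8                                # keep only high-anomaly pixels
labels, n_blobs = ndimage.label(mask)                 # connected-component (blob) labelling

for blob_id in range(1, n_blobs + 1):
    ys, xs = np.where(labels == blob_id)
    if ys.size < 50:                                  # discard tiny, noisy blobs
        continue
    x0, y0, x1, y1 = xs.min(), ys.min(), xs.max(), ys.max()
    print(f"anomalous region: ({x0},{y0})-({x1},{y1}), area={ys.size}")
```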
{"title":"A Unified Video Summarization for Video Anomalies Through Deep Learning","authors":"K. Muchtar, Muhammad Rizky Munggaran, Adhiguna Mahendra, Khairul Anwar, Chih-Yang Lin","doi":"10.1109/ICMEW56448.2022.9859320","DOIUrl":"https://doi.org/10.1109/ICMEW56448.2022.9859320","url":null,"abstract":"Over the last ten years, integrated video surveillance systems have become increasingly important in protecting public safety. Because a single surveillance camera continuously collects events in a specific field of view at all times of day and night, a system that can create a summary that concisely captures key elements of the incoming frames is required. To be more specific, due to time constraints, the enormous amount of video footage cannot be properly examined for analysis. As a result, it is vital to compile a summary of what happened on the scene and look for anomalous events in the footage. A unified approach for detecting and summarizing anomalous events is proposed. To detect the event and compute the anomaly scores, a 3D deep learning approach is used. Afterward, the scores are utilized to visualize and localize the anomalous regions. Finally, the blob analysis technique is used to extract the anomalous regions. To verify the results, quantitative and qualitative evaluations are provided. Experiments indicate that the proposed summarizing method keeps crucial information while producing competitive results. More qualitative results can be found through our project channel: https://youtu.be/eMPMjiGlCQI","PeriodicalId":106759,"journal":{"name":"2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115990454","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Quantification of Artist Representativity within an Art Movement
Pub Date: 2022-07-18 | DOI: 10.1109/ICMEW56448.2022.9859412
Yu-xin Zhang, Fan Tang, Weiming Dong, Changsheng Xu
Knowing the representative artists can help the public better understand the characteristics of an art movement. In this paper, we propose the concept of artist representativity to assess how well an artist represents the characteristics of an art movement. We begin by presenting a novel approach to learning art-movement-related representations of artworks that enables the style and content features of artworks to be expressed. We then propose an artwork-based artist representation method, which accounts for the importance and quantity imbalance of artworks. Finally, we develop an artist-representativity calculation method based on bi-level graph-based learning. Experiments demonstrate the effectiveness of our approach in predicting artist representativity within an art movement.
{"title":"Quantification of Artist Representativity within an Art Movement","authors":"Yu-xin Zhang, Fan Tang, Weiming Dong, Changsheng Xu","doi":"10.1109/ICMEW56448.2022.9859412","DOIUrl":"https://doi.org/10.1109/ICMEW56448.2022.9859412","url":null,"abstract":"Knowing the representative artists can help the public better understand the characteristics of an art movement. In this paper, we propose the concept of artist representativity to assess how an artist can represent the characteristics of an art movement. We begin by presenting a novel approach to learn art-movement-related representations of artworks that enable the style and content features of artworks to be expressed. We then propose an artwork-based artist representation method, which considers the importance and quantity imbalance of artworks. Finally, we develop an artist representativity calculating method based on bi-level graph-based learning. Experiments demonstrate the effectiveness of our approach in predicting the artist representativity within an art movement.","PeriodicalId":106759,"journal":{"name":"2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126649939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}