多模态表示学习:进展、趋势和挑战

2019 International Conference on Machine Learning and Cybernetics (ICMLC) Pub Date : 2019-07-01 DOI:10.1109/ICMLC48188.2019.8949228

Sufang Zhang, Jun-Hai Zhai, Bo-Jun Xie, Yan Zhan, Xin Wang

{"title":"多模态表示学习:进展、趋势和挑战","authors":"Sufang Zhang, Jun-Hai Zhai, Bo-Jun Xie, Yan Zhan, Xin Wang","doi":"10.1109/ICMLC48188.2019.8949228","DOIUrl":null,"url":null,"abstract":"Representation learning is the base and crucial for consequential tasks, such as classification, regression, and recognition. The goal of representation learning is to automatically learning good features with deep models. Multimodal representation learning is a special representation learning, which automatically learns good features from multiple modalities, and these modalities are not independent, there are correlations and associations among modalities. Furthermore, multimodal data are usually heterogeneous. Due to the characteristics, multimodal representation learning poses many difficulties: how to combine multimodal data from heterogeneous sources; how to jointly learning features from multimodal data; how to effectively describe the correlations and associations, etc. These difficulties triggered great interest of researchers along with the upsurge of deep learning, many deep multimodal learning methods have been proposed by different researchers. In this paper, we present an overview of deep multimodal learning, especially the approaches proposed within the last decades. We provide potential readers with advances, trends and challenges, which can be very helpful to researchers in the field of machine, especially for the ones engaging in the study of multimodal deep machine learning.","PeriodicalId":221349,"journal":{"name":"2019 International Conference on Machine Learning and Cybernetics (ICMLC)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":"{\"title\":\"Multimodal Representation Learning: Advances, Trends and Challenges\",\"authors\":\"Sufang Zhang, Jun-Hai Zhai, Bo-Jun Xie, Yan Zhan, Xin Wang\",\"doi\":\"10.1109/ICMLC48188.2019.8949228\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Representation learning is the base and crucial for consequential tasks, such as classification, regression, and recognition. The goal of representation learning is to automatically learning good features with deep models. Multimodal representation learning is a special representation learning, which automatically learns good features from multiple modalities, and these modalities are not independent, there are correlations and associations among modalities. Furthermore, multimodal data are usually heterogeneous. Due to the characteristics, multimodal representation learning poses many difficulties: how to combine multimodal data from heterogeneous sources; how to jointly learning features from multimodal data; how to effectively describe the correlations and associations, etc. These difficulties triggered great interest of researchers along with the upsurge of deep learning, many deep multimodal learning methods have been proposed by different researchers. In this paper, we present an overview of deep multimodal learning, especially the approaches proposed within the last decades. We provide potential readers with advances, trends and challenges, which can be very helpful to researchers in the field of machine, especially for the ones engaging in the study of multimodal deep machine learning.\",\"PeriodicalId\":221349,\"journal\":{\"name\":\"2019 International Conference on Machine Learning and Cybernetics (ICMLC)\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"8\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 International Conference on Machine Learning and Cybernetics (ICMLC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICMLC48188.2019.8949228\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 International Conference on Machine Learning and Cybernetics (ICMLC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMLC48188.2019.8949228","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 8

摘要

表征学习是分类、回归和识别等后续任务的基础和关键。表示学习的目标是利用深度模型自动学习好的特征。多模态表征学习是一种特殊的表征学习，它自动地从多个模态中学习到好的特征，并且这些模态之间不是独立的，而是相互关联的。此外，多模态数据通常是异构的。由于这些特点，多模态表示学习带来了许多困难:如何组合来自异构源的多模态数据;如何从多模态数据中联合学习特征;如何有效地描述相关性和关联性等。随着深度学习的兴起，这些困难引起了研究者的极大兴趣，不同的研究者提出了许多深度多模态学习方法。在本文中，我们介绍了深度多模态学习的概述，特别是在过去的几十年里提出的方法。我们为潜在的读者提供了一些进展、趋势和挑战，这对机器领域的研究人员，特别是从事多模态深度机器学习研究的研究人员非常有帮助。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Multimodal Representation Learning: Advances, Trends and Challenges

Representation learning is the base and crucial for consequential tasks, such as classification, regression, and recognition. The goal of representation learning is to automatically learning good features with deep models. Multimodal representation learning is a special representation learning, which automatically learns good features from multiple modalities, and these modalities are not independent, there are correlations and associations among modalities. Furthermore, multimodal data are usually heterogeneous. Due to the characteristics, multimodal representation learning poses many difficulties: how to combine multimodal data from heterogeneous sources; how to jointly learning features from multimodal data; how to effectively describe the correlations and associations, etc. These difficulties triggered great interest of researchers along with the upsurge of deep learning, many deep multimodal learning methods have been proposed by different researchers. In this paper, we present an overview of deep multimodal learning, especially the approaches proposed within the last decades. We provide potential readers with advances, trends and challenges, which can be very helpful to researchers in the field of machine, especially for the ones engaging in the study of multimodal deep machine learning.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2019 International Conference on Machine Learning and Cybernetics (ICMLC)

自引率

0.00%

发文量