
Latest Publications: International Journal of Multimedia Information Retrieval

A local representation-enhanced recurrent convolutional network for image captioning
IF 5.6 · CAS Tier 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2022-04-12 · DOI: 10.1007/s13735-022-00231-y · pp. 149-157
Xiaoyi Wang, Jun Huang
{"title":"A local representation-enhanced recurrent convolutional network for image captioning","authors":"Xiaoyi Wang, Jun Huang","doi":"10.1007/s13735-022-00231-y","DOIUrl":"https://doi.org/10.1007/s13735-022-00231-y","url":null,"abstract":"","PeriodicalId":48501,"journal":{"name":"International Journal of Multimedia Information Retrieval","volume":"30 1","pages":"149 - 157"},"PeriodicalIF":5.6,"publicationDate":"2022-04-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78893857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Multimodal Quasi-AutoRegression: forecasting the visual popularity of new fashion products
IF 5.6 · CAS Tier 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2022-04-08 · DOI: 10.48550/arXiv.2204.04014 · pp. 717-729
Stefanos Papadopoulos, C. Koutlis, S. Papadopoulos, Y. Kompatsiaris
Estimating the preferences of consumers is of utmost importance for the fashion industry, as appropriately leveraging this information can be beneficial in terms of profit. Trend detection in fashion is a challenging task due to the fast pace of change in the fashion industry. Moreover, forecasting the visual popularity of new garment designs is even more demanding due to the lack of historical data. To this end, we propose MuQAR, a Multimodal Quasi-AutoRegressive deep learning architecture that combines two modules: (1) a multimodal multilayer perceptron processing categorical, visual and textual features of the product and (2) a Quasi-AutoRegressive neural network modelling the "target" time series of the product's attributes along with the "exogenous" time series of all other attributes. We utilize computer vision models for image classification and image captioning to automatically extract visual features and textual descriptions from the images of new products. Product design in fashion is initially expressed visually, and these features represent the products' unique characteristics without interfering with the creative process of its designers by requiring additional inputs (e.g. manually written texts). We employ the product's target-attribute time series as a proxy for temporal popularity patterns, mitigating the lack of historical data, while the exogenous time series help capture trends among interrelated attributes. We perform an extensive ablation analysis on two large-scale fashion image datasets, Mallzee-P and SHIFT15m, to assess the adequacy of MuQAR, and also use the Amazon Reviews: Home and Kitchen dataset to assess generalization to other domains. A comparative study on the VISUELLE dataset shows that MuQAR is capable of competing with and surpassing the domain's current state of the art by 4.65% and 4.8% in terms of WAPE and MAE, respectively.
Citations: 7
PDS-Net: A novel point and depth-wise separable convolution for real-time object detection
IF 5.6 · CAS Tier 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2022-03-24 · DOI: 10.1007/s13735-022-00229-6 · pp. 171-188
M. Junayed, Md Baharul Islam, H. Imani, Tarkan Aydin
{"title":"PDS-Net: A novel point and depth-wise separable convolution for real-time object detection","authors":"M. Junayed, Md Baharul Islam, H. Imani, Tarkan Aydin","doi":"10.1007/s13735-022-00229-6","DOIUrl":"https://doi.org/10.1007/s13735-022-00229-6","url":null,"abstract":"","PeriodicalId":48501,"journal":{"name":"International Journal of Multimedia Information Retrieval","volume":"8 1","pages":"171 - 188"},"PeriodicalIF":5.6,"publicationDate":"2022-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76468263","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Caption TLSTMs: combining transformer with LSTMs for image captioning
IF 5.6 · CAS Tier 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2022-03-23 · DOI: 10.1007/s13735-022-00228-7 · pp. 111-121
Jie Yan, Yuxiang Xie, Xidao Luan, Yanming Guo, Quanzhi Gong, Suru Feng
{"title":"Caption TLSTMs: combining transformer with LSTMs for image captioning","authors":"Jie Yan, Yuxiang Xie, Xidao Luan, Yanming Guo, Quanzhi Gong, Suru Feng","doi":"10.1007/s13735-022-00228-7","DOIUrl":"https://doi.org/10.1007/s13735-022-00228-7","url":null,"abstract":"","PeriodicalId":48501,"journal":{"name":"International Journal of Multimedia Information Retrieval","volume":"9 1","pages":"111 - 121"},"PeriodicalIF":5.6,"publicationDate":"2022-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79134772","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Few2Decide: towards a robust model via using few neuron connections to decide
IF 5.6 · CAS Tier 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2022-01-30 · DOI: 10.1007/s13735-021-00223-4 · pp. 189-198
Jian Li, Yanming Guo, Songyang Lao, Xiang Zhao, Liang Bai, Haoran Wang
{"title":"Few2Decide: towards a robust model via using few neuron connections to decide","authors":"Jian Li, Yanming Guo, Songyang Lao, Xiang Zhao, Liang Bai, Haoran Wang","doi":"10.1007/s13735-021-00223-4","DOIUrl":"https://doi.org/10.1007/s13735-021-00223-4","url":null,"abstract":"","PeriodicalId":48501,"journal":{"name":"International Journal of Multimedia Information Retrieval","volume":"45 1","pages":"189 - 198"},"PeriodicalIF":5.6,"publicationDate":"2022-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79015916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Enhancing the performance of 3D auto-correlation gradient features in depth action classification
IF 5.6 · CAS Tier 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2022-01-16 · DOI: 10.1007/s13735-021-00226-1 · pp. 61-76
Mohammad Farhad Bulbul, S. Islam, Zannatul Azme, Preksha Pareek, Md. Humaun Kabir, Hazrat Ali
{"title":"Enhancing the performance of 3D auto-correlation gradient features in depth action classification","authors":"Mohammad Farhad Bulbul, S. Islam, Zannatul Azme, Preksha Pareek, Md. Humaun Kabir, Hazrat Ali","doi":"10.1007/s13735-021-00226-1","DOIUrl":"https://doi.org/10.1007/s13735-021-00226-1","url":null,"abstract":"","PeriodicalId":48501,"journal":{"name":"International Journal of Multimedia Information Retrieval","volume":"360 1","pages":"61 - 76"},"PeriodicalIF":5.6,"publicationDate":"2022-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78106335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Generative adversarial networks and its applications in the biomedical image segmentation: a comprehensive survey.
IF 5.6 · CAS Tier 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2022-01-01 · DOI: 10.1007/s13735-022-00240-x · Vol. 11(3), pp. 333-368
Ahmed Iqbal, Muhammad Sharif, Mussarat Yasmin, Mudassar Raza, Shabib Aftab

Recent advances in deep generative models have shown significant potential for image synthesis, detection, segmentation, and classification tasks. Segmenting medical images is considered a primary challenge in the biomedical imaging field, and various GAN-based models have been proposed in the literature to address medical segmentation challenges. Our search identified 151 papers; after a twofold screening, 138 papers were selected for the final survey. We conduct a comprehensive survey of GAN applications in medical image segmentation, focusing primarily on GAN-based model variants, performance metrics, loss functions, datasets, augmentation methods, paper implementations, and source code. Secondly, the paper provides a detailed overview of GAN applications in segmenting different human diseases. We conclude with a critical discussion, the limitations of GANs, and suggestions for future directions. We hope this survey is beneficial and raises awareness of GAN implementations for biomedical image segmentation tasks.

Citations: 17
A review on deep learning in medical image analysis.
IF 5.6 · CAS Tier 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2022-01-01 (Epub 2021-09-04) · DOI: 10.1007/s13735-021-00218-1 · Vol. 11(1), pp. 19-38
S Suganyadevi, V Seethalakshmi, K Balasamy

Ongoing improvements in AI, particularly in deep learning techniques, are helping to identify, classify, and quantify patterns in clinical images. Deep learning is the fastest-developing field in artificial intelligence and has recently been applied effectively in numerous areas, including medicine. A brief outline is given of studies across the regions of application: neuro, brain, retinal, pulmonary, computational pathology, breast, heart, bone, abdominal, and musculoskeletal imaging. For information exploration, knowledge deployment, and knowledge-based prediction, deep learning networks can be successfully applied to big data. This paper presents fundamental information and state-of-the-art deep learning approaches in the field of medical image processing and analysis. Its primary goals are to present research on medical image processing and to define and implement the key guidelines that are identified and addressed.

Citations: 77
A unified approach of detecting misleading images via tracing its instances on web and analyzing its past context for the verification of multimedia content.
IF 5.6 · CAS Tier 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2022-01-01 (Epub 2022-07-11) · DOI: 10.1007/s13735-022-00235-8 · pp. 445-459
Deepika Varshney, Dinesh Kumar Vishwakarma

The verification of multimedia content on social media is one of the challenging and crucial issues of the current landscape, and it is gaining prominence in an age where user-generated content and online social platforms are the leading sources shaping and propagating news stories. As these sources allow users to share their opinions without restriction, opportunistic users often post misleading or unreliable content on social media such as Twitter and Facebook. At present, to lure users toward a news story, the text is often paired with multimedia content (images/videos/audio). Verifying such content to maintain the credibility and reliability of social media information is of paramount importance. Motivated by this, we propose a generalized system that supports the automatic classification of images as credible or misleading. In this paper, we investigate machine learning-based as well as deep learning-based approaches for verifying misleading multimedia content, where the available traces of an image on the web are used to identify the credibility of the content. The experiments are performed on a real-world dataset (the MediaEval 2015 dataset) collected from Twitter and demonstrate the effectiveness of our proposed approach and features using both machine learning and deep learning models (bidirectional LSTM). The results reveal that the Microsoft Bing image search engine is quite effective at retrieving titles and performed better than the Google image search engine in our study. They also show that gathering clues from attached multimedia content (images) is more effective than detecting only posted content-based features.

Citations: 3
Interactive video retrieval evaluation at a distance: comparing sixteen interactive video search systems in a remote setting at the 10th Video Browser Showdown.
IF 3.6 · CAS Tier 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2022-01-01 (Epub 2022-01-26) · DOI: 10.1007/s13735-021-00225-2 · Vol. 11(1), pp. 1-18
Silvan Heller, Viktor Gsteiger, Werner Bailer, Cathal Gurrin, Björn Þór Jónsson, Jakub Lokoč, Andreas Leibetseder, František Mejzlík, Ladislav Peška, Luca Rossetto, Konstantin Schall, Klaus Schoeffmann, Heiko Schuldt, Florian Spiess, Ly-Duyen Tran, Lucia Vadicamo, Patrik Veselý, Stefanos Vrochidis, Jiaxin Wu

The Video Browser Showdown addresses difficult video search challenges through an annual interactive evaluation campaign attracting research teams focusing on interactive video retrieval. The campaign aims to provide insights into the performance of participating interactive video retrieval systems, tested by selected search tasks on large video collections. For the first time in its ten-year history, the Video Browser Showdown 2021 was organized in a fully remote setting and hosted a record number of sixteen scoring systems. In this paper, we describe the competition setting, tasks and results and give an overview of state-of-the-art methods used by the competing systems. By looking at query result logs provided by ten systems, we analyze differences in retrieval model performance and browsing times before a correct submission. Through advances in data gathering methodology and tools, we provide a comprehensive analysis of ad-hoc video search tasks, discuss results, task design and methodological challenges. We highlight that almost all top-performing systems utilize some sort of joint embedding for text-image retrieval and enable specification of temporal context in queries for known-item search. While a combination of these techniques drives the currently top-performing systems, we identify several future challenges for interactive video search engines and the Video Browser Showdown competition itself.

Citations: 0