
International Journal of Multimedia Information Retrieval: Latest Articles

InceptionDepth-wiseYOLOv2: improved implementation of YOLO framework for pedestrian detection
IF 5.6, Zone 3 (Computer Science), Pub Date: 2022-05-11, DOI: 10.1007/s13735-022-00239-4
S. Panigrahi, U. Raju
Citations: 1
RGBD deep multi-scale network for background subtraction
IF 5.6, Zone 3 (Computer Science), Pub Date: 2022-05-10, DOI: 10.1007/s13735-022-00232-x
Ihssane Houhou, A. Zitouni, Y. Ruichek, Salah Eddine Bekhouche, M. Kas, A. Taleb-Ahmed
Citations: 1
Music emotion recognition based on segment-level two-stage learning
IF 5.6, Zone 3 (Computer Science), Pub Date: 2022-04-25, DOI: 10.1007/s13735-022-00230-z
Na He, Sam Ferguson
Citations: 5
DC-GNN: drop channel graph neural network for object classification and part segmentation in the point cloud
IF 5.6, Zone 3 (Computer Science), Pub Date: 2022-04-21, DOI: 10.1007/s13735-022-00236-7
M. Meraz, Md Afzal Ansari, M. Javed, P. Chakraborty
Citations: 1
Multi-sensor human activity recognition using CNN and GRU
IF 5.6, Zone 3 (Computer Science), Pub Date: 2022-04-19, DOI: 10.1007/s13735-022-00234-9
Ohoud Nafea, Wadood Abdul, G. Muhammad
Citations: 8
Siamese coding network and pair similarity prediction for near-duplicate image detection
IF 5.6, Zone 3 (Computer Science), Pub Date: 2022-04-12, DOI: 10.1007/s13735-022-00233-w
M. Fisichella
Citations: 5
A local representation-enhanced recurrent convolutional network for image captioning
IF 5.6, Zone 3 (Computer Science), Pub Date: 2022-04-12, DOI: 10.1007/s13735-022-00231-y
Xiaoyi Wang, Jun Huang
Citations: 0
Multimodal Quasi-AutoRegression: forecasting the visual popularity of new fashion products
IF 5.6, Zone 3 (Computer Science), Pub Date: 2022-04-08, DOI: 10.48550/arXiv.2204.04014
Stefanos Papadopoulos, C. Koutlis, S. Papadopoulos, Y. Kompatsiaris
Estimating consumer preferences is of utmost importance for the fashion industry, as appropriately leveraging this information can be beneficial in terms of profit. Trend detection in fashion is a challenging task because the industry changes at a fast pace. Moreover, forecasting the visual popularity of new garment designs is even more demanding due to the lack of historical data. To this end, we propose MuQAR, a Multimodal Quasi-AutoRegressive deep learning architecture that combines two modules: (1) a multimodal multilayer perceptron that processes categorical, visual and textual features of the product and (2) a Quasi-AutoRegressive neural network that models the “target” time series of the product’s attributes along with the “exogenous” time series of all other attributes. We use computer vision, specifically image classification and image captioning, to automatically extract visual features and textual descriptions from the images of new products. Product design in fashion is initially expressed visually, and these features represent the products’ unique characteristics without interfering with the designers’ creative process by requiring additional inputs (e.g. manually written texts). We employ the time series of the product’s target attributes as a proxy for temporal popularity patterns, mitigating the lack of historical data, while the exogenous time series help capture trends among interrelated attributes. We perform an extensive ablation analysis on two large-scale image fashion datasets, Mallzee-P and SHIFT15m, to assess the adequacy of MuQAR, and also use the Amazon Reviews: Home and Kitchen dataset to assess generalization to other domains. A comparative study on the VISUELLE dataset shows that MuQAR competes with and surpasses the domain’s current state of the art by 4.65% and 4.8% in terms of WAPE and MAE, respectively.
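The abstract reports its results in terms of WAPE and MAE without defining them. A minimal sketch, assuming the commonly used definitions (MAE as the mean absolute error, WAPE as the total absolute error divided by the total absolute actual value, expressed here as a percentage), might look like the following. The function names and sample numbers are illustrative only and are not taken from the paper, whose exact normalization may differ.

```python
from typing import Sequence


def _check(actual: Sequence[float], forecast: Sequence[float]) -> None:
    # Both series must be non-empty and aligned element by element.
    if len(actual) == 0 or len(actual) != len(forecast):
        raise ValueError("series must be non-empty and of equal length")


def mae(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute error: average of |actual - forecast|."""
    _check(actual, forecast)
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def wape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Weighted absolute percentage error (assumed definition):
    100 * sum(|actual - forecast|) / sum(|actual|)."""
    _check(actual, forecast)
    total_actual = sum(abs(a) for a in actual)
    if total_actual == 0:
        raise ValueError("WAPE is undefined when all actual values are zero")
    total_error = sum(abs(a - f) for a, f in zip(actual, forecast))
    return 100.0 * total_error / total_actual


# Hypothetical sales figures, for illustration only.
actual_sales = [10.0, 20.0, 30.0, 40.0]
forecast_sales = [12.0, 18.0, 33.0, 35.0]

mae_value = mae(actual_sales, forecast_sales)
wape_value = wape(actual_sales, forecast_sales)
print(f"MAE = {mae_value:.2f}, WAPE = {wape_value:.2f}%")
```

Under these definitions, lower values are better for both metrics; MAE is expressed in the units of the series, whereas WAPE is scale-free, which makes it easier to compare across products with different sales volumes.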
Citations: 7
PDS-Net: A novel point and depth-wise separable convolution for real-time object detection
IF 5.6, Zone 3 (Computer Science), Pub Date: 2022-03-24, DOI: 10.1007/s13735-022-00229-6
M. Junayed, Md Baharul Islam, H. Imani, Tarkan Aydin
Citations: 3
Caption TLSTMs: combining transformer with LSTMs for image captioning
IF 5.6, Zone 3 (Computer Science), Pub Date: 2022-03-23, DOI: 10.1007/s13735-022-00228-7
Jie Yan, Yuxiang Xie, Xidao Luan, Yanming Guo, Quanzhi Gong, Suru Feng
Citations: 4