Image super-resolution method based on the interactive fusion of transformer and CNN features
Jianxin Wang, Yongsong Zou, Osama Alfarraj, Pradip Kumar Sharma, Wael Said, Jin Wang
Pub Date: 2023-11-03 | DOI: 10.1007/s00371-023-03138-9
Human body construction based on combination of parametric and nonparametric reconstruction methods
Xihang Li, Guiqin Li, Tiancai Li, Peter Mitrouchev
Pub Date: 2023-11-01 | DOI: 10.1007/s00371-023-03122-3
Annotate and retrieve in vivo images using hybrid self-organizing map
Parminder Kaur, Avleen Malhi, Husanbir Pannu
Pub Date: 2023-10-31 | DOI: 10.1007/s00371-023-03126-z
Abstract: Multimodal retrieval has gained much attention lately due to its effectiveness over uni-modal retrieval. Visual features alone often under-constrain the description of an image in content-based retrieval; introducing another modality, such as collateral text, can bridge the semantic gap and make retrieval more efficient. This article applies cross-modal fusion and retrieval to real in vivo gastrointestinal images and linguistic cues, since visual features by themselves are insufficient to describe the images and to assist gastroenterologists. A cross-modal information retrieval approach is therefore proposed that retrieves related images given text, and vice versa, while handling the heterogeneity gap between the modalities. The technique comprises two stages: (1) individual modality feature learning and (2) fusion of the two trained networks. In the first stage, two self-organizing maps (SOMs) are trained separately on images and texts, clustering the samples in their respective maps by similarity. In the second (fusion) stage, the trained SOMs are integrated through an associative network to enable cross-modal retrieval. The associative network is trained with Hebbian learning and Oja learning (a normalized form of Hebbian learning). The framework can annotate images with keywords and illustrate keywords with images, and it can be extended to incorporate further modalities. Extensive experiments were performed on real gastrointestinal images obtained from a known gastroenterologist, each accompanied by collateral keywords. The results demonstrate the efficacy of the algorithm and its value in helping gastroenterologists make quick and pertinent decisions.
Making paper labels smart for augmented wine recognition
Alessia Angeli, Lorenzo Stacchio, Lorenzo Donatiello, Alessandro Giacchè, Gustavo Marfia
Pub Date: 2023-10-27 | DOI: 10.1007/s00371-023-03119-y
Abstract: An invisible layer of knowledge is progressively growing with the emergence of situated visualizations and reality-based information retrieval systems. In essence, digital content will overlap with real-world entities, eventually providing insights into the surrounding environment and useful information for the user. The realization of this vision may appear close, but many subtle details separate us from its fulfillment. Overlaying rendered virtual annotations on the camera's real-world view requires computer vision paradigms for object recognition and tracking that often demand high computing power and large-scale image datasets. These resources are not always available, and in some domains the lack of an appropriate reference dataset can make the task impractical. In this scenario, we consider the problem of wine recognition to support an augmented reading of wine labels. Reference images of wine bottle labels may not be available, as wineries periodically change their designs, product information regulations vary, and specific bottles may be rare, making image-based label recognition hard or even impossible. In this work, we present augmented wine recognition, an augmented reality system that exploits optical character recognition (OCR) to interpret and exploit the text within a wine label without requiring any reference image. Our experiments show that this framework overcomes the limitations of image retrieval-based systems while exhibiting comparable performance.
A nightshade crop leaf disease detection using enhance-nightshade-CNN for ground truth data
Barkha M. Joshi, Hetal Bhavsar
Pub Date: 2023-10-27 | DOI: 10.1007/s00371-023-03127-y