{"title":"Visual Instance Retrieval for Cultural Heritage Artifacts using Feature Pyramid\n Network","authors":"Luepol Pipanmekaporn, Suwatchai Kamonsantiroj","doi":"10.54941/ahfe1002933","DOIUrl":null,"url":null,"abstract":"Digitized photographs are commonly employed by archaeologists to assist in\n uncovering ancient artefacts. However, locating a specific image within a vast\n collection remains a significant obstacle. The metadata associated with images is often\n sparse, marking keyword-based searches difficult. In this paper, we propose a new visual\n search method to improve retrieval performance by utilizing visual descriptors generated\n from a feature pyramid network. This network is a convolutional neural network (CNN)\n model that incorporates additional modules for feature extraction and enhancement. The\n first module encodes an image into regional features through spatial pyramid pooling,\n while the second module emphasizes distinctive spatial features. Additionally, we\n introduce a two-stage feature attention to enhance feature quality and a compact\n descriptor is then formed by aggregating these features for searching the image. We\n tested our proposed method on benchmark datasets and a public vast collection of\n Thailand’s ancient artefacts. Results from our experiments show that the proposed method\n achieves 77.9% of mean average precision, which outperforms existing CNN-based visual\n descriptors.","PeriodicalId":383834,"journal":{"name":"Human Interaction and Emerging Technologies (IHIET-AI 2023): Artificial\n Intelligence and Future Applications","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Human Interaction and Emerging Technologies (IHIET-AI 2023): Artificial\n Intelligence and Future Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.54941/ahfe1002933","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Digitized photographs are commonly employed by archaeologists to assist in uncovering ancient artefacts. However, locating a specific image within a vast collection remains a significant obstacle. The metadata associated with images is often sparse, making keyword-based searches difficult. In this paper, we propose a new visual search method that improves retrieval performance by utilizing visual descriptors generated from a feature pyramid network. This network is a convolutional neural network (CNN) model that incorporates additional modules for feature extraction and enhancement. The first module encodes an image into regional features through spatial pyramid pooling, while the second module emphasizes distinctive spatial features. Additionally, we introduce a two-stage feature attention mechanism to enhance feature quality; a compact descriptor is then formed by aggregating these features for image search. We tested the proposed method on benchmark datasets and a large public collection of Thailand's ancient artefacts. Experimental results show that the proposed method achieves a mean average precision of 77.9%, outperforming existing CNN-based visual descriptors.
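
The abstract outlines a pipeline of regional feature extraction via spatial pyramid pooling, attention-based feature weighting, and aggregation into a compact descriptor. The following is a minimal sketch of that pipeline, not the authors' implementation: the ResNet-50 backbone, the pyramid levels, and the single-stage norm-based attention used here are illustrative assumptions, since the abstract does not specify the network details or the exact form of the two-stage attention.

```python
# Sketch: regional features via spatial pyramid pooling over a CNN feature
# map, attention weighting, and aggregation into a compact L2-normalized
# descriptor for nearest-neighbor image retrieval.
import torch
import torch.nn.functional as F
from torchvision.models import resnet50, ResNet50_Weights

# Stand-in backbone (assumption): ResNet-50 trunk without pooling/classifier.
backbone = torch.nn.Sequential(
    *list(resnet50(weights=ResNet50_Weights.DEFAULT).children())[:-2]
).eval()

def regional_features(fmap: torch.Tensor, levels=(1, 2, 3)) -> torch.Tensor:
    """Spatial pyramid pooling: average-pool the feature map into an
    l x l grid at each pyramid level; return all regions as (R, C)."""
    regions = []
    for l in levels:
        pooled = F.adaptive_avg_pool2d(fmap, l)           # (1, C, l, l)
        regions.append(pooled.flatten(2).squeeze(0).t())  # (l*l, C)
    return torch.cat(regions, dim=0)                      # (R, C)

def describe(image: torch.Tensor) -> torch.Tensor:
    """Aggregate attention-weighted regional features into one descriptor.
    `image` is a preprocessed (3, H, W) tensor (ImageNet normalization)."""
    with torch.no_grad():
        fmap = backbone(image.unsqueeze(0))               # (1, C, H, W)
    regions = regional_features(fmap)                     # (R, C)
    # Illustrative attention (assumption): weight regions by feature energy,
    # standing in for the paper's unspecified two-stage attention.
    weights = torch.softmax(regions.norm(dim=1), dim=0)   # (R,)
    desc = (weights.unsqueeze(1) * regions).sum(dim=0)    # (C,)
    return F.normalize(desc, dim=0)                       # unit-length descriptor

# Retrieval: rank gallery images by cosine similarity to the query descriptor.
# query = describe(query_img)
# gallery = torch.stack([describe(img) for img in gallery_imgs])
# ranking = (gallery @ query).argsort(descending=True)
```

Because the descriptors are unit-normalized, the dot product in the usage comment is exactly cosine similarity, so ranking a gallery reduces to a single matrix-vector product.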