Yu-Chai WAN, Xia-Bi LIU, Fei-Fei HAN, Kun-Qi TONG, Yu LIU
Journal: Acta Automatica Sinica (自动化学报), vol. 40, no. 8, pp. 1699-1708
Published: 2014-08-01
DOI: 10.1016/S1874-1029(14)60018-5
URL: https://www.sciencedirect.com/science/article/pii/S1874102914600185
Online Learning a Binary Classifier for Improving Google Image Search Results
Web image search results can be improved by exploiting their visual content to learn a binary classifier, which is then used to refine each result's relevance to the given query. This paper proposes an algorithmic framework for this problem and investigates the key issue of training data selection within it. Training data selection is divided into two stages: initial selection, which triggers classifier learning, and dynamic selection, which takes place during the iterations of classifier learning. We investigate two main approaches to initial training data selection, clustering-based and ranking-based, and compare automatic selection schemes with manual selection. Support vector machines and the max-min pseudo-probability (MMP) based Bayesian classifier are each employed for image classification. By varying these factors in the framework, we implemented eight algorithms and tested them on keyword-based image search results from the Google search engine. The experimental results confirm that selecting training data from noisy search results is a key issue in this problem, and show that the proposed algorithm effectively improves Google search results, especially at the top ranks, which helps reduce the effort users spend browsing deep into the ranking to find the desired images. Even so, further work is needed to bring automatic training data selection closer to the quality of perfect human annotation.
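The two-stage selection described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: a nearest-centroid scorer stands in for the SVM and MMP classifiers, ranking-based initial selection is assumed to take the top-ranked results as positives and the bottom-ranked results as negatives, and the names `rerank`, `n_pos`, `n_neg`, and `iterations` are hypothetical.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def score_all(features, pos_idx, neg_idx):
    """Stand-in classifier (NOT SVM or MMP): a higher score means the image
    lies closer to the positive centroid than to the negative one."""
    pos_c = centroid([features[i] for i in pos_idx])
    neg_c = centroid([features[i] for i in neg_idx])
    return [math.dist(f, neg_c) - math.dist(f, pos_c) for f in features]

def rerank(features, n_pos=2, n_neg=2, iterations=3):
    """features[i] is the visual feature vector of the i-th search result,
    listed in the search engine's original order."""
    order = list(range(len(features)))
    for _ in range(iterations):
        # First pass: ranking-based initial selection from the original order
        # (assumed rule: top results are positives, bottom results negatives).
        # Later passes: dynamic selection from the current re-ranked order.
        pos_idx = order[:n_pos]
        neg_idx = order[-n_neg:]
        scores = score_all(features, pos_idx, neg_idx)
        order = sorted(range(len(features)), key=lambda i: -scores[i])
    return order

# Toy data: relevant images cluster near (0, 0), irrelevant ones near (5, 5).
toy = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (0.1, 0.3),
       (5.2, 4.9), (0.3, 0.2), (4.8, 5.1), (5.1, 5.3)]
order = rerank(toy)
print(order[:4])  # indices of the four results ranked most relevant
```

On this toy input the four images near the origin move ahead of the off-topic ones that were interleaved with them in the original ranking. A clustering-based initial selection, or manual labels, would replace only the choice of `pos_idx` and `neg_idx` in the first pass.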
Acta Automatica Sinica (自动化学报) | Computer Science: Computer Graphics and Computer-Aided Design
CiteScore: 4.80
Self-citation rate: 0.00%
Articles published: 6655
About the journal:
ACTA AUTOMATICA SINICA is a joint publication of the Chinese Association of Automation and the Institute of Automation, Chinese Academy of Sciences. Its objective is the high-quality, rapid publication of articles, with a strong focus on new trends, original theoretical and experimental research and developments, emerging technology, and industrial standards in automation.