Title: An approach for accessing data from hidden web using intelligent agent technology
Authors: Lohit Singh, Dilip Kumar Sharma
Published in: 2013 3rd IEEE International Advance Computing Conference (IACC)
Publication date: 2013-05-13
DOI: 10.1109/IADCC.2013.6514329 (https://doi.org/10.1109/IADCC.2013.6514329)
Citations: 2
Abstract
A large amount of information on the Web is hidden from users because traditional search engines cannot access or index it. These search engines crawl only by following hypertext links, so they ignore forms that require login or any other authorization process. The hidden web is the deep part of the Web that is not available to traditional web crawlers, and obtaining content from it is a challenging task. Many web sites today contain pages that are dynamic in nature, which makes it difficult for traditional web crawlers to retrieve their information. This paper briefly discusses earlier efforts to solve this problem and presents a comparative study of the previously defined architectures across various parameters. Based on an analysis of these methods, it proposes a framework that uses intelligent agent technology to access the hidden web.
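
The limitation described in the abstract can be illustrated with a minimal sketch. It is not the framework proposed in the paper. It only shows, under stated assumptions, why a crawler that follows nothing but hypertext links never reaches content that sits behind a form. The in-memory `SITE` dictionary, its paths, and the names `LinkAndFormParser` and `crawl_by_links` are all hypothetical and chosen for illustration; no network access is involved.

```python
from collections import deque
from html.parser import HTMLParser


class LinkAndFormParser(HTMLParser):
    """Collects hyperlink targets and form descriptions from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []        # href values of <a> tags
        self.forms = []        # one dict per <form>: action, method, input names
        self._in_form = False  # True while between <form> and </form>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "form":
            self._in_form = True
            self.forms.append({
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").lower(),
                "inputs": [],
            })
        elif tag == "input" and self._in_form and attrs.get("name"):
            self.forms[-1]["inputs"].append(attrs["name"])

    def handle_endtag(self, tag):
        if tag == "form":
            self._in_form = False


# Hypothetical site (assumption): path -> HTML. The "/search?q=crawler" page
# stands in for a dynamic page that exists only as the result of a form query.
SITE = {
    "/": '<a href="/about">About</a>'
         '<form action="/search" method="get"><input name="q"></form>',
    "/about": '<a href="/">Home</a>',
    "/search?q=crawler": "<p>Record generated from a database</p>",
}


def crawl_by_links(site, start):
    """Breadth-first crawl that follows only <a href> links.

    Returns the set of visited paths and the list of forms that were
    seen on those pages but never submitted.
    """
    visited, skipped_forms = set(), []
    queue = deque([start])
    while queue:
        url = queue.popleft()
        if url in visited or url not in site:
            continue
        visited.add(url)
        parser = LinkAndFormParser()
        parser.feed(site[url])
        skipped_forms.extend(parser.forms)  # noticed, but not filled in
        queue.extend(parser.links)
    return visited, skipped_forms


visited, skipped_forms = crawl_by_links(SITE, "/")
hidden = set(SITE) - visited  # pages the link-only crawl never reached
```

In this toy site the form result page ends up in `hidden`, because the only way to reach it is to submit the form recorded in `skipped_forms`. A hidden web crawler would need an extra step that chooses values for the listed inputs and issues the query, which is the kind of task the abstract assigns to an intelligent agent.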