Information extraction based on probing algorithm with Bayesian approach
J. Davidson, I. Jacob, K. G. Srinivasagam
International Conference on Information Communication and Embedded Systems (ICICES2014), pp. 1-4, 2014-02-01
DOI: 10.1109/ICICES.2014.7033761 (https://doi.org/10.1109/ICICES.2014.7033761)
Citation count: 0
Abstract
Document annotation is the task of adding metadata to a document so that information can be extracted from it more easily. It has emerged as a distinct stream within data mining. Most existing algorithms concentrate on the query workload alone. This paper uses a probing algorithm with a Bayesian approach that identifies attributes based on the query workload, text frequency, and the content of previous text annotations, such as content values. The method has been implemented on datasets, where it facilitates data annotation and prioritizes attribute values through a ranking scheme. Its query cost is also lower than that of other approaches. Experimental analysis shows better performance than other methods, which the authors attribute to probability theory providing a principled foundation for reasoning under uncertainty.
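The abstract does not give the scoring formula, so the following is only a minimal sketch of how attributes might be ranked by combining a query-workload prior with text evidence from earlier annotations. It swaps in a plain naive Bayes model with Laplace smoothing, which is not necessarily the authors' probing algorithm. The function names (`tokenize`, `rank_attributes`), the `smoothing` parameter, and the sample data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; the paper names no tokenizer)."""
    return text.lower().split()


def rank_attributes(query_workload, annotated_docs, document_text, smoothing=1.0):
    """Rank candidate attributes for a new document, best first.

    query_workload: list of attribute names, one entry per query that used it.
    annotated_docs: list of (text, set_of_attributes) from previous annotations.
    Returns a list of (attribute, log_score) sorted by descending score.
    Assumes smoothing > 0 so that no probability is zero.
    """
    workload_counts = Counter(query_workload)

    # Word frequencies observed in texts previously annotated with each attribute.
    word_counts = defaultdict(Counter)
    for text, attributes in annotated_docs:
        for attribute in attributes:
            word_counts[attribute].update(tokenize(text))

    candidates = sorted(set(workload_counts) | set(word_counts))
    doc_tokens = tokenize(document_text)
    vocabulary = set(doc_tokens)
    for counts in word_counts.values():
        vocabulary.update(counts)

    total_queries = sum(workload_counts.values())
    ranked = []
    for attribute in candidates:
        # Prior: how often the query workload asks for this attribute.
        prior = (workload_counts[attribute] + smoothing) / (
            total_queries + smoothing * len(candidates)
        )
        score = math.log(prior)

        # Likelihood: how well the document's words match earlier annotated texts.
        counts = word_counts.get(attribute, Counter())
        total_words = sum(counts.values())
        for token in doc_tokens:
            likelihood = (counts[token] + smoothing) / (
                total_words + smoothing * len(vocabulary)
            )
            score += math.log(likelihood)
        ranked.append((attribute, score))

    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    workload = ["location", "location", "date", "magnitude"]
    annotated = [
        ("earthquake struck tokyo japan", {"location"}),
        ("storm hit on monday march", {"date"}),
        ("quake measured magnitude seven", {"magnitude"}),
    ]
    for name, log_score in rank_attributes(workload, annotated, "earthquake near tokyo"):
        print(name, round(log_score, 3))
```

Working in log space avoids numeric underflow on longer documents, and the smoothing term keeps unseen words or never-queried attributes from forcing a score to zero.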