Pub Date: 2011-12-01  DOI: 10.1109/ICDIM.2011.6093344
Expert finding and query answering for Collaborative Inter-Organizational system by using Rule Responder
R. Tang, S. Fong, S. Sarasvady
A collaborative inter-organizational system (C-IOS) is an information technology-based system that engages multiple business partners in achieving common value-added goals. Many papers in the literature have addressed techniques for collaborative agents, ranging from basic information exchange to sophisticated negotiation. For a C-IOS, two tasks are particularly important in collaboration: Expert Finding (EF) and Query Answering (QA). These two tasks facilitate supply-chain mediation and any subsequent procurement negotiation. EF concerns finding, or match-making, the right personnel in an organization to serve as a committee member fulfilling part of the job. QA performs an initial screening of whether the required resources and commitments are potentially available. Both tasks are executed before any further collaboration, and the communication crosses organizational boundaries. This paper contributes a design of a C-IOS that supports EF and QA for inter-organizational collaboration. The underlying technical framework is Rule Responder, a powerful tool for creating virtual organizations as multi-agent systems that support collaborative teams on the Semantic Web. A use case of hosting an academic conference among different organizations illustrates the proposed concepts.
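The abstract does not reproduce any Rule Responder rules; purely as an illustrative sketch of the match-making an Expert Finding agent performs (Rule Responder itself would express this as declarative rules rather than Python), one might rank candidates by how many of the required skills they cover:

```python
# Toy sketch only; names, fields and scoring are hypothetical, not the paper's rule base.
def find_experts(candidates, required_skills):
    """Rank available candidates for a committee role by required-skill coverage."""
    required = set(required_skills)
    scored = []
    for person in candidates:
        coverage = len(required & set(person["skills"])) / len(required)
        if person["available"] and coverage > 0:
            scored.append((coverage, person["name"]))
    return [name for _, name in sorted(scored, reverse=True)]

print(find_experts(
    [{"name": "Alice", "skills": ["semantic web", "reviewing"], "available": True},
     {"name": "Bob", "skills": ["databases"], "available": False}],
    required_skills=["semantic web", "reviewing"]))   # ['Alice']
```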
{"title":"Expert finding and query answering for Collaborative Inter-Organizational system by using Rule Responder","authors":"R. Tang, S. Fong, S. Sarasvady","doi":"10.1109/ICDIM.2011.6093344","DOIUrl":"https://doi.org/10.1109/ICDIM.2011.6093344","url":null,"abstract":"Collaborative inter-organizational system (C-IOS) is defined as information technology-based systems that engage multiple business partners for achieving some common value-added goals. In the past, many papers from the literature addressed a large number of techniques on collaborative agents. The techniques range from basic information exchange to sophisticated negotiation. Specifically, for C-IOS's two important tasks namely Experts Finding (EF) and Query Answering (QA) are required in collaboration. These two specific tasks facilitate supply-chain mediation and may be subsequent procurement negotiation. EF concerns about finding or match-making the right personnel in an organization as a committee member for fulfilling a part of the job. QA screens initially whether the resources and commitments are potentially available. The two tasks supposedly would have executed prior to any further collaboration, and the communication is cross organizations. This paper contributes a design of C-IOS that supports EF and QA for inter-organizational collaboration. The underlying technical framework is by Rule Responder which is a powerful tool for creating virtual organizations as multi-agent systems that support collaborative teams on the Semantic Web. A use case of hosting an academic conference among different organizations is illustrated with our proposed concepts in this paper.","PeriodicalId":355775,"journal":{"name":"2011 Sixth International Conference on Digital Information Management","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130710529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-12-01  DOI: 10.1109/ICDIM.2011.6093332
Web service with criteria: Extending WSDL
N. Parimala, Anu Saini
WSDL is used to describe the interface of a service in XML format. The interface describes functional as well as non-functional properties. We are concerned with specifying 'criteria' as a non-functional property of a web service. For this purpose we have extended WSDL to X-WSDL. To add criteria information, we extend the WSDL (Web Service Definition Language) schema with a new element, 'criteriaservice', which is available in a new namespace. Using this 'criteriaservice' element, the criteria can be specified along with a service in an X-WSDL document. The WSDL document is also extended by adding the new attributes 'criteria name' and 'description' to the service element. The criteria are specified by the user when invoking a service. As a result, we provide support for discovering a service that more closely matches his or her requirements.
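The abstract names the new element and attributes but does not reproduce the X-WSDL schema; the short Python sketch below, with an assumed namespace URI and an assumed 'criteriaName' spelling for the 'criteria name' attribute, only illustrates how such an element could be attached to a WSDL service description:

```python
# Illustrative only: attach a hypothetical 'criteriaservice' element to a WSDL service.
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"
XWSDL_NS = "http://example.org/x-wsdl"               # assumed namespace, not from the paper
ET.register_namespace("wsdl", WSDL_NS)
ET.register_namespace("xwsdl", XWSDL_NS)

tree = ET.parse("service.wsdl")                      # hypothetical input document
service = tree.find(f"{{{WSDL_NS}}}service")

crit = ET.SubElement(service, f"{{{XWSDL_NS}}}criteriaservice")
crit.set("criteriaName", "responseTime")             # the abstract writes this as 'criteria name'
crit.set("description", "maximum acceptable response time in milliseconds")
tree.write("service-x.wsdl", xml_declaration=True, encoding="utf-8")
```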
{"title":"Web service with criteria: Extending WSDL","authors":"N. Parimala, Anu Saini","doi":"10.1109/ICDIM.2011.6093332","DOIUrl":"https://doi.org/10.1109/ICDIM.2011.6093332","url":null,"abstract":"WSDL is used to describe the interface of a service, in XML format. The interface describes the functional properties as well as non functional properties. We are concerned with specifying ‘criteria’ as a non functional property of a web service. For this we have extend WSDL to X-WSDL. In order to add criteria information we extend the WSDL (Web Service Definition Language) schema by adding a new element ‘criteriaservice’ this is available in the new namespace. Using this ‘criteriaservcie’ element it is possible to specify the criteria along with a service in an X-WSDL document. The WSDL document is also extended by adding new attributes ‘criteria name’ and ‘description’ to service element. Using this extension it is possible to specify the criteria along with the service in X-WSDL document. The criteria are specified by the user when invoking a service. As a result, we are providing support to discover a more appropriate service according to his/her requirement.","PeriodicalId":355775,"journal":{"name":"2011 Sixth International Conference on Digital Information Management","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130952448","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-12-01  DOI: 10.1109/ICDIM.2011.6093355
ICT + PBL = holistic learning solution: UTeM's experience
F. Shahbodin, M. Yusoff, C. K. Mohd
This paper highlights how ICT can be integrated into the process of teaching and learning in a Problem Based Learning (PBL) environment. The main focus is on integrating ICT components such as multimedia and internet technologies as tools for the PBL learning environment, and on utilizing the PBL approach for delivering instruction in the teaching and learning process at UTeM. The paper also shares findings on the effectiveness of PBLAssess, a prototype developed in this study. Fifty-six respondents (second-year students) enrolled in the Human Computer Interaction course were selected for the study. Two research instruments were developed for evaluating students' performance and preferences: a questionnaire and the PBLAssess prototype. Some of the current work on integrating ICT and the PBL learning environment is also discussed. Understanding both the current state of the art of PBL and its future prospects is key to setting an agenda for future research and development in PBL.
{"title":"ICT + PBL = holistic learning solution: UTeM's experience","authors":"F. Shahbodin, M. Yusoff, C. K. Mohd","doi":"10.1109/ICDIM.2011.6093355","DOIUrl":"https://doi.org/10.1109/ICDIM.2011.6093355","url":null,"abstract":"This paper highlights how ICT could be integrated in the process of teaching and learning in the Problem Based Learning (PBL) environment. The main focus is integrating the ICT components such as multimedia and internet technologies as a tool for PBL learning environment, and utilizing the PBL approach for the delivering instructions in the teaching and learning process at UTeM. This paper also shares findings on the effectiveness of PBLAssess which have been developed in this study. Fifty-six respondents (second year students) enrolled for the Human Computer Interaction course are selected for this study. Two research instruments are developed for the purpose of evaluating students' performances and preferences which include a set of questionnaire and prototype known as PBLAssess. Further some of the current work on integrating ICT and PBL learning environment are also shared. Understanding both the current state of art for PBL and future prospects are the key issues in setting an agenda for future research and development in PBL.","PeriodicalId":355775,"journal":{"name":"2011 Sixth International Conference on Digital Information Management","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129989252","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-12-01  DOI: 10.1109/ICDIM.2011.6093334
Programming for evaluating strip layout of progressive dies
A. C. Lin, Ho Minh Tuan, Dean K. Sheu
A progressive die is an effective tool for efficient and economical production of sheet metal parts in large quantities. Nowadays, progressive die designers still spend much of their time choosing better layouts among the feasible ones. This study employs Pro/Web.Link, Hyper Text Markup Language (HTML) and JavaScript to develop an application that automatically evaluates strip layouts within the Pro/Engineer software environment. The paper proposes solutions for calculating the total evaluation score of a strip layout based on four factors: station number, moment balancing, strip stability and feed height.
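The abstract lists the four factors but not the scoring formula itself; as a hedged sketch, one plausible form of such a total evaluation score is a weighted sum of normalized factor scores:

```python
# Illustrative sketch only: combine the four strip-layout factors named in the abstract
# into a total score. Weights and scales here are assumptions, not the paper's formula.

def total_evaluation_score(station_number, moment_balance, strip_stability, feed_height,
                           weights=(0.25, 0.25, 0.25, 0.25)):
    """Each factor score is expected in [0, 1]; returns a weighted total in [0, 1]."""
    factors = (station_number, moment_balance, strip_stability, feed_height)
    return sum(w * f for w, f in zip(weights, factors))

# Example: a layout that balances moments well but uses many stations
print(total_evaluation_score(0.6, 0.9, 0.8, 0.7))   # 0.75
```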
{"title":"Programming for evaluating strip layout of progressive dies","authors":"A. C. Lin, Ho Minh Tuan, Dean K. Sheu","doi":"10.1109/ICDIM.2011.6093334","DOIUrl":"https://doi.org/10.1109/ICDIM.2011.6093334","url":null,"abstract":"A progressive die is an effective tool for efficient and economical production of sheet metal parts in large quantities. Nowadays, progressive die designers still spend much of their time on choosing better layouts among feasible ones. This study employs Pro/Web.Link, Hyper Text Markup Language (HTML) and JavaScript to develop an application which helps evaluate automatically strip layouts in Pro/Engineer software environment. This paper proposes solutions for calculating total evaluation score of the strip layout based on four factors: station number factor, moment balancing factor, strip stability factor and feed height factor.","PeriodicalId":355775,"journal":{"name":"2011 Sixth International Conference on Digital Information Management","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125586860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-12-01  DOI: 10.1109/ICDIM.2011.6093327
Context-aware SQA e-learning system
Nada Bajnaid, R. Benlamri, B. Cogan
In this paper, we propose an ontological design for a context-aware e-learning system that supports learners in developing Software Quality Assurance (SQA) compliant software. The learning process is driven by the type of software product the learner is dealing with, as well as its SQA requirements and the corresponding SQA techniques and procedures. The paper presents a global ontology design that embeds knowledge related to the learner, the SQA domain in general, and product-specific SQA requirements and procedures. Reasoning tools are provided to infer knowledge that delivers modular, just-in-time contextual SQA resources for the task at hand. A learning scenario illustrates the system's ability to deal with the SQA requirements facing the learner in the software development process.
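The paper's ontology and reasoning machinery are not reproduced in the abstract; the toy rdflib sketch below (the namespace, property names and resources are all assumptions) only hints at how product-specific knowledge could drive the selection of contextual SQA resources:

```python
# Minimal sketch under assumptions: a tiny RDF graph stands in for the paper's global
# ontology and is traversed to fetch SQA resources relevant to the learner's product.
from rdflib import Graph, Namespace, Literal

EX = Namespace("http://example.org/sqa#")   # hypothetical namespace
g = Graph()
g.add((EX.WebApp, EX.requiresTechnique, EX.SecurityTesting))
g.add((EX.SecurityTesting, EX.hasResource, Literal("OWASP testing guide")))

product = EX.WebApp                          # the learner's current product context
for technique in g.objects(product, EX.requiresTechnique):
    for resource in g.objects(technique, EX.hasResource):
        print(f"Suggested SQA resource: {resource}")
```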
{"title":"Context-aware SQA e-learning system","authors":"Nada Bajnaid, R. Benlamri, B. Cogan","doi":"10.1109/ICDIM.2011.6093327","DOIUrl":"https://doi.org/10.1109/ICDIM.2011.6093327","url":null,"abstract":"In this paper, we propose an ontological design for developing a context-aware e-learning system that supports learners developing Software Quality Assurance (SQA) compliant software. The learning process is driven by the type of software product the learner is dealing with, as well as, its SQA requirements and corresponding SQA techniques and procedures. The paper shows a global ontology design to embed knowledge related to the learner, SQA domain in general, and product-based SQA requirement and procedures. Reasoning tools are provided to infer knowledge that can provide more-modular and just-in-time contextual SQA resources for the task in hand. A learning scenario is shown to illustrate the system's ability to deal with SQA requirements facing the learner in the software development process.","PeriodicalId":355775,"journal":{"name":"2011 Sixth International Conference on Digital Information Management","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124763406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-12-01  DOI: 10.1109/ICDIM.2011.6093335
The Framy user interface for visually-impaired users
Gabriele Di Chiara, L. Paolino, Marco Romano, M. Sebillo, G. Tortora, G. Vitiello, A. Ginige
We have developed a multimodal interface, Framy, to effectively display large two-dimensional data sets, such as geographical data, on a mobile interface. We have now extended it for use by visually impaired users. A pilot study that we conducted, together with interviews with a group of potential stakeholders, helped us detect some critical problems with the current interface, derive further requirements specific to visually impaired mobile users, and re-design Framy accordingly.
{"title":"The Framy user interface for visually-impaired users","authors":"Gabriele Di Chiara, L. Paolino, Marco Romano, M. Sebillo, G. Tortora, G. Vitiello, A. Ginige","doi":"10.1109/ICDIM.2011.6093335","DOIUrl":"https://doi.org/10.1109/ICDIM.2011.6093335","url":null,"abstract":"We have developed a multimodal interface; Framy to effectively display large 2-dimenetional data sets such as geographical data on a mobile interface. We have now extended this to be used by visually-impaired users. A pilot study that we conducted, and the interviews with a group of potential stakeholders, helped us to detect some critical problems with the current interface, derive further requirements specific for visually impaired mobile users and re-design Framy accordingly.","PeriodicalId":355775,"journal":{"name":"2011 Sixth International Conference on Digital Information Management","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123537426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-12-01  DOI: 10.1109/ICDIM.2011.6093361
MHPSO: A new method to enhance the Particle Swarm Optimizer
Bafrin Zarei, R. Ghanbarzadeh, Poorya Khodabande, Hadi Toofani
The widespread and increasing application of the Particle Swarm Optimizer (PSO) algorithm in both theoretical and practical fields motivates further work on improving its efficiency. To this end, this paper introduces a new method that enhances the convergence rate and reduces the computational time of PSO by combining PSO with a mutation concept (MPSO) and the Hierarchical Particle Swarm Optimizer (HPSO). The new approach is therefore called MHPSO: a composition of MPSO and HPSO that act simultaneously during the optimization process. Several benchmark examples are analyzed using the presented method, and the results are compared with other procedures, illustrating the better outcomes and high performance of MHPSO.
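The abstract does not give MHPSO's update equations; the sketch below shows only a standard PSO velocity/position update plus a simple mutation step (the MPSO ingredient), with the hierarchical HPSO structure omitted and all parameter values chosen purely for illustration:

```python
# Minimal sketch, not the paper's MHPSO: one PSO iteration with a mutation step.
import random

def pso_step(positions, velocities, pbest, gbest,
             w=0.7, c1=1.5, c2=1.5, pm=0.05, bounds=(-5.0, 5.0)):
    """Update every particle in place; pbest/gbest are personal/global best positions."""
    lo, hi = bounds
    for i, x in enumerate(positions):
        for d in range(len(x)):
            r1, r2 = random.random(), random.random()
            velocities[i][d] = (w * velocities[i][d]
                                + c1 * r1 * (pbest[i][d] - x[d])
                                + c2 * r2 * (gbest[d] - x[d]))
            x[d] += velocities[i][d]
            if random.random() < pm:            # mutation: random reset of one coordinate
                x[d] = random.uniform(lo, hi)
            x[d] = max(lo, min(hi, x[d]))       # keep particles inside the search bounds
    return positions, velocities
```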
{"title":"MHPSO: A new method to enhance the Particle Swarm Optimizer","authors":"Bafrin Zarei, R. Ghanbarzadeh, Poorya Khodabande, Hadi Toofani","doi":"10.1109/ICDIM.2011.6093361","DOIUrl":"https://doi.org/10.1109/ICDIM.2011.6093361","url":null,"abstract":"The widespread and increasing application of Particle Swarm Optimizer (PSO) algorithms in both theoretical and practical fields leads to further considerations and new developments for improving its efficiency. To achieve this purpose in this paper a new method is introduced to enhance the convergence rate and reduce the computational time of PSO by combining the PSO including mutation concept (MPSO) and the Hierarchical Particle Swarm Optimizer (HPSO). Therefore the new approach is called MHPSO: a composition of MPSO and HPSO which act simultaneously in the optimization process. In addition some benchmark examples are analyzed using the presented method; consequently, the results are compared to other procedures which illustrate better outcomes and high performance of MHPSO.","PeriodicalId":355775,"journal":{"name":"2011 Sixth International Conference on Digital Information Management","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125406574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-12-01  DOI: 10.1109/ICDIM.2011.6093356
Classification of Privacy-preserving Distributed Data Mining protocols
Zhuojia Xu, X. Yi
Recently, a new research area named Privacy-preserving Distributed Data Mining (PPDDM) has emerged. It aims at solving the following problem: a number of participants want to jointly conduct a data mining task based on the private data sets held by each participant. This problem setting has captured the attention and interest of researchers, practitioners and developers from both the data mining and information security communities, who have made great progress in designing and developing solutions for this scenario. However, researchers and practitioners now face the challenge of devising a standard for synthesizing and evaluating the various PPDDM protocols, because the excessive number of techniques developed so far has become confusing. In this paper, we put forward a framework to synthesize and characterize existing PPDDM protocols so as to provide a standard, systematic approach to understanding PPDDM-related problems, analyzing PPDDM requirements and designing effective and efficient PPDDM protocols.
{"title":"Classification of Privacy-preserving Distributed Data Mining protocols","authors":"Zhuojia Xu, X. Yi","doi":"10.1109/ICDIM.2011.6093356","DOIUrl":"https://doi.org/10.1109/ICDIM.2011.6093356","url":null,"abstract":"Recently, a new research area, named Privacy-preserving Distributed Data Mining (PPDDM) has emerged. It aims at solving the following problem: a number of participants want to jointly conduct a data mining task based on the private data sets held by each of the participants. This problem setting has captured attention and interests of researchers, practitioners and developers from the communities of both data mining and information security. They have made great progress in designing and developing solutions to address this scenario. However, researchers and practitioners are now faced with a challenge on how to devise a standard on synthesizing and evaluating various PPDDM protocols, because they have been confused by the excessive number of techniques developed so far. In this paper, we put forward a framework to synthesize and characterize existing PPDDM protocols so as to provide a standard and systematic approach of understanding PPDDM-related problems, analyzing PPDDM requirements and designing effective and efficient PPDDM protocols.","PeriodicalId":355775,"journal":{"name":"2011 Sixth International Conference on Digital Information Management","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129930638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-12-01  DOI: 10.1109/ICDIM.2011.6093368
Minimal dataset for Network Intrusion Detection Systems via dimensionality reduction
Jean-Pierre Nziga
Network Intrusion Detection Systems (NIDS) monitor internet traffic to detect malicious activities including, but not limited to, denial of service attacks, network access by unauthorized users, attempts to gain additional privileges, and port scans. The amount of data that must be analyzed by a NIDS is very large. Prior studies developed feature selection and feature extraction techniques to reduce the size of the data, but none has focused on finding exactly how much the dataset can be reduced. Dimensionality reduction is a field in machine learning that consists of mapping high-dimensional data into a lower dimension while preserving the important features of the original dataset. Dimensionality reduction techniques have been used to reduce the amount of data in applications such as speech signals, digital photographs, fMRI scans, DNA microarrays and hyperspectral data. The purpose of this paper is to find the minimal amount of data required for successful intrusion detection. This evaluation is necessary to improve the efficiency of NIDS in identifying existing attack patterns and recognizing new intrusions in real time. Two dimensionality reduction techniques are used: one linear technique (Principal Component Analysis) and one non-linear technique (Multidimensional Scaling). The reduced data are then submitted to two classification algorithms, J48 (C4.5) and Naïve Bayes. The study was conducted using the KDD Cup 99 data. Experimental results show optimal performance with reduced datasets of 4 dimensions for J48 and 12 dimensions for Naïve Bayes.
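As a rough, hedged sketch of this kind of pipeline (scikit-learn standing in for the Weka-style J48/Naïve Bayes classifiers evaluated in the paper, with X and y assumed to be a preprocessed numeric feature matrix and label vector derived from KDD Cup 99):

```python
# Sketch only: reduce dimensionality with PCA, then score a Naive Bayes classifier.
from sklearn.decomposition import PCA
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def evaluate_reduced(X, y, n_components):
    """Project the data onto n_components PCA dimensions and report test accuracy."""
    X_red = PCA(n_components=n_components).fit_transform(X)
    X_tr, X_te, y_tr, y_te = train_test_split(X_red, y, test_size=0.3, random_state=0)
    clf = GaussianNB().fit(X_tr, y_tr)
    return accuracy_score(y_te, clf.predict(X_te))

# e.g. sweep the dimension count to find the smallest one that keeps detection accuracy high
# for k in (4, 8, 12, 16): print(k, evaluate_reduced(X, y, k))
```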
{"title":"Minimal dataset for Network Intrusion Detection Systems via dimensionality reduction","authors":"Jean-Pierre Nziga","doi":"10.1109/ICDIM.2011.6093368","DOIUrl":"https://doi.org/10.1109/ICDIM.2011.6093368","url":null,"abstract":"Network Intrusion Detection Systems (NIDS) monitor internet traffic to detect malicious activities including but not limited to denial of service attacks, network accesses by unauthorized users, attempts to gain additional privileges and port scans. The amount of data that must be analyzed by NIDS is too large. Prior studies developed feature selection and feature extraction techniques to reduce the size of data. None has focused on finding exactly by how much the dataset should be reduced. Dimensionality reduction is a field in machine learning that consists on mapping high dimensional data into lower dimension while preserving important features of the original dataset. Dimensionality reduction techniques have been used to reduce the amount of data in applications such as speech signals, digital photographs, fMRI scans, DNA microarrays, Hyper spectral data. The purpose of this paper is to find the finite amount of data required for successful intrusion detection. This evaluation is necessary to improve the efficiency of NIDS in identifying existing attack patterns and recognizing new intrusion in real-time. Two dimensionality reduction techniques are used one linear technique (Principal Component Analysis) and one non-linear technique (Multidimensional Scaling). Data is then submitted to two classification algorithms J48 (C.45) and Naïve Bayes. This study was conducted using the KDD Cup 99 data. Experimental results show optimal performance with reduced datasets of 4 dimensions for J48 and 12 dimensions for Naïve Bayes.","PeriodicalId":355775,"journal":{"name":"2011 Sixth International Conference on Digital Information Management","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128215672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-12-01  DOI: 10.1109/ICDIM.2011.6093329
Automatic text classification and focused crawling
Sameendra Samarawickrama, L. Jayaratne
A focused crawler is a web crawler that traverses the web to explore information related to a particular topic of interest only. Generic web crawlers, on the other hand, try to search the entire web, which is impossible due to the size and complexity of the WWW. In this paper we survey some of the latest focused web crawling approaches, discussing each along with its experimental results. We categorize them as focused crawling based on content analysis, focused crawling based on link analysis, and focused crawling based on both content and link analysis. We also give insight into future research and draw overall conclusions.
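For concreteness, a minimal sketch of the content-analysis flavour of focused crawling is given below; the fetching and link-extraction functions are caller-supplied placeholders, and the relevance scoring is deliberately crude:

```python
# Illustrative sketch of content-based focused crawling (no real network calls here):
# frontier URLs are prioritized by the topical relevance of the page that linked to them.
import heapq

def topic_similarity(text, topic_terms):
    """Crude relevance score: fraction of topic terms that occur in the page text."""
    words = set(text.lower().split())
    return sum(term in words for term in topic_terms) / len(topic_terms)

def focused_crawl(seed_urls, topic_terms, fetch, extract_links, max_pages=100):
    """fetch(url) -> page text, extract_links(text) -> urls; both supplied by the caller."""
    frontier = [(-1.0, url) for url in seed_urls]     # max-heap via negated scores
    heapq.heapify(frontier)
    visited, results = set(), []
    while frontier and len(results) < max_pages:
        _, url = heapq.heappop(frontier)
        if url in visited:
            continue
        visited.add(url)
        text = fetch(url)
        score = topic_similarity(text, topic_terms)
        results.append((url, score))
        for link in extract_links(text):              # links inherit the parent page's relevance
            if link not in visited:
                heapq.heappush(frontier, (-score, link))
    return results
```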
{"title":"Automatic text classification and focused crawling","authors":"Sameendra Samarawickrama, L. Jayaratne","doi":"10.1109/ICDIM.2011.6093329","DOIUrl":"https://doi.org/10.1109/ICDIM.2011.6093329","url":null,"abstract":"A focused crawler is a web crawler that traverse the web to explore information that is related to a particular topic of interest only. On the other hand, generic web crawlers try to search the entire web, which is impossible due to the size and the complexity of WWW. In this paper we make a survey of some of the latest focused web crawling approaches discussing each with their experimental results. We categorize them as focused crawling based on content analysis, focused crawling based on link analysis and focused crawling based on both the content and link analysis. We also give an insight to the future research and draw the overall conclusions.","PeriodicalId":355775,"journal":{"name":"2011 Sixth International Conference on Digital Information Management","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114289671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}