Web Harvesting for Data Retrieval on Scientific Journal Sites
Authors: I. G. S. Rahayuda, N. P. L. Santiari
Journal: Jurnal Informatika Universitas Pamulang
Publication date: 2021-03-31
Publication type: Journal Article
DOI: 10.32493/informatika.v6i1.10077 (https://doi.org/10.32493/informatika.v6i1.10077)
Citation count: 1
Abstract
Publishing articles in online scientific journals is a must for researchers and academics. When choosing a target journal, a researcher must review key information on the journal's website, such as indexing, scope, fees, and quartile. This information is generally not gathered on a single page but spread across several pages of the journal's website. The task becomes more complicated when a researcher has to compare several journals, and the information in these journals may change at any time. This research designs a web harvesting technique to retrieve information from journal websites. With web harvesting, information spread across several pages can be collected in one place, and researchers do not need to worry about changes, because the harvested information is the latest available. The harvesting technique takes three inputs: the URL of a page, the point in the page's source code where retrieval starts, and the point in the source code where retrieval ends. The technique was implemented successfully using the Bootstrap web framework. The test data were taken from several scientific journal websites. The information collected includes name, description, accreditation, indexing, scope, publication rate, publication charge, template, and quartile. Black box testing showed that all implemented features work as expected.
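The harvesting step described in the abstract (a page URL plus a start and an end position in the page source) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `fetch_page`, `extract_between`, and `harvest`, the rule layout, and the sample HTML are all assumptions made for illustration, and the markers are treated as plain substrings of the page source.

```python
import re
from html import unescape
from urllib.request import urlopen


def fetch_page(url: str, timeout: float = 10.0) -> str:
    """Download a page and return its source as text (UTF-8 is assumed)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_between(source: str, start_marker: str, end_marker: str):
    """Return the text between the first start_marker and the next end_marker.

    Returns None when either marker is missing from the source.
    """
    start = source.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)
    end = source.find(end_marker, start)
    if end == -1:
        return None
    fragment = source[start:end]
    text = re.sub(r"<[^>]+>", " ", fragment)  # drop any tags inside the fragment
    return " ".join(unescape(text).split())   # decode entities, collapse whitespace


def harvest(pages: dict, rules: dict) -> dict:
    """Collect fields scattered over several pages into one record.

    pages: page key -> page source (for example, the result of fetch_page)
    rules: field name -> (page key, start marker, end marker)
    """
    return {
        field: extract_between(pages[page_key], start, end)
        for field, (page_key, start, end) in rules.items()
    }


# Made-up page sources standing in for two pages of a hypothetical journal site.
sample_pages = {
    "home": '<h1 class="title">Example Journal</h1>'
            '<div id="scope"><p>Computer science &amp; informatics</p></div>',
    "fees": '<span id="apc">No charge</span>',
}

# Hypothetical harvesting rules: one (page, start, end) triple per field.
sample_rules = {
    "name": ("home", '<h1 class="title">', "</h1>"),
    "scope": ("home", '<div id="scope">', "</div>"),
    "publication_charge": ("fees", '<span id="apc">', "</span>"),
}

record = harvest(sample_pages, sample_rules)
```

For a live site, `sample_pages` would be filled by calling `fetch_page` on each URL at request time, which is what keeps the collected record current. Substring markers are simple but fragile: if a journal changes its page template, the corresponding rule returns `None` and has to be updated.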