Dongcheng Peng, Tie-shan Li, Yang Wang, C. L. Philip Chen
2018 Eighth International Conference on Information Science and Technology (ICIST). Published 2018-06-01. DOI: 10.1109/ICIST.2018.8426183 (https://doi.org/10.1109/ICIST.2018.8426183)
Research on Information Collection Method of Shipping Job Hunting Based on Web Crawler
In recent years, with the rapid development of artificial intelligence, big data, and cloud computing, the volume of information on the Internet has grown sharply, and obtaining target information efficiently and quickly has become an urgent problem. This article addresses the collection and acquisition of shipping job-hunting information in a networked environment. Two web-crawler-based methods for collecting shipping job-hunting information are proposed. Based on the Python standard libraries and the Scrapy crawling framework, corresponding crawler programs are designed and implemented to scrape the target information from the target website and store the collected data in a local file. A comparative analysis of the amount of data crawled and the time consumed shows that the method based on the Scrapy framework is simple to operate, easily extensible, well targeted, and fast and efficient at collecting shipping job-hunting information. The collected data can support subsequent data mining analysis by researchers and can also supply data for a future shipping job-hunting information database.
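The abstract does not give the crawler code, so the following is only a minimal sketch of what a standard-library crawler of this kind might look like. The HTML markup (`li class="job"` with `title` and `company` spans), the field names, the sample data, and the output file name `shipping_jobs.csv` are all assumptions made for illustration, not details from the paper. The page is parsed from an in-memory string so the example runs without network access; a real crawler would first download the page, for example with `urllib.request.urlopen`.

```python
import csv
import io
from html.parser import HTMLParser


class JobListingParser(HTMLParser):
    """Collect job postings from <li class="job"> items (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.jobs = []        # one dict per completed posting
        self._current = None  # posting currently being built, if any
        self._field = None    # field whose text is currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "job":
            self._current = {"title": "", "company": ""}
        elif tag == "span" and self._current is not None:
            # Only spans whose class matches a known field are recorded.
            if attrs.get("class") in self._current:
                self._field = attrs["class"]

    def handle_data(self, data):
        if self._current is not None and self._field is not None:
            self._current[self._field] += data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current is not None:
            self.jobs.append(self._current)
            self._current = None


def parse_jobs(html):
    """Return a list of {"title", "company"} dicts found in the HTML text."""
    parser = JobListingParser()
    parser.feed(html)
    parser.close()
    return parser.jobs


def jobs_to_csv(jobs):
    """Serialize the postings as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["title", "company"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(jobs)
    return buffer.getvalue()


# Made-up sample page standing in for a downloaded job-listing page.
SAMPLE_HTML = """
<ul>
  <li class="job"><span class="title">Second Officer</span><span class="company">Example Shipping Co.</span></li>
  <li class="job"><span class="title">Chief Engineer</span><span class="company">Sample Marine Ltd.</span></li>
</ul>
"""

if __name__ == "__main__":
    jobs = parse_jobs(SAMPLE_HTML)
    # Store the collected data in a local file, as the abstract describes.
    with open("shipping_jobs.csv", "w", newline="", encoding="utf-8") as f:
        f.write(jobs_to_csv(jobs))
```

In a Scrapy-based version, the framework would take over the downloading, scheduling, and export steps that a standard-library crawler has to handle by hand, which is one plausible reading of the abstract's claim that the Scrapy method is simpler to operate and extend.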