Research on Information Collection Method of Shipping Job Hunting Based on Web Crawler

2018 Eighth International Conference on Information Science and Technology (ICIST). Pub Date: 2018-06-01. DOI: 10.1109/ICIST.2018.8426183

Dongcheng Peng, Tie-shan Li, Yang Wang, C. L. Philip Chen

Citations: 11

Research on Information Collection Method of Shipping Job Hunting Based on Web Crawler

In recent years, with the rapid development of artificial intelligence, big data, and cloud computing, the volume of information on the Internet has grown sharply, so obtaining target information efficiently and quickly has become an urgent problem. This paper addresses the collection and acquisition of shipping job-hunting information in a networked environment. Two web-crawler-based methods for collecting shipping job-hunting information are proposed. Based on the Python standard libraries and the Scrapy crawling framework, corresponding web crawler programs are designed and implemented to scrape the target information from the target website and store the collected data in a local file. A comparative analysis of the amount of data crawled and the time consumed shows that the collection method based on the Scrapy framework is simple to operate, easily extensible, well targeted, and fast and efficient at collecting shipping job-hunting information. The collected data can support subsequent data-mining analysis by researchers and can also supply data for a future shipping job-hunting information database.
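The abstract gives no implementation details, so the following is only a minimal sketch of what a standard-library collection step of this kind might look like: parse a listing page, extract job records, and write them to a local file format. The markup, the class names `job-title` and `company`, the sample records, and the helper names `JobListingParser`, `parse_jobs`, and `write_csv` are all assumptions made for illustration; they are not taken from the paper or from any real website.

```python
import csv
import io
from html.parser import HTMLParser


class JobListingParser(HTMLParser):
    """Collects text from elements whose class is one of FIELDS (hypothetical markup)."""

    FIELDS = ("job-title", "company")

    def __init__(self):
        super().__init__()
        self.rows = []       # completed job records
        self._current = {}   # record currently being assembled
        self._field = None   # field whose text we are inside, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if cls in self.FIELDS:
            self._field = cls

    def handle_data(self, data):
        if self._field and data.strip():
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if self._field:
            self._field = None
            # A record is complete once every expected field has been seen.
            if len(self._current) == len(self.FIELDS):
                self.rows.append(self._current)
                self._current = {}


def parse_jobs(html):
    """Return a list of job records (dicts) extracted from an HTML string."""
    parser = JobListingParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def write_csv(rows, stream):
    """Write job records as CSV to any text stream (a file or a StringIO)."""
    writer = csv.DictWriter(stream, fieldnames=JobListingParser.FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Made-up sample page standing in for a fetched listing (assumption).
SAMPLE_HTML = """
<ul>
  <li><span class="job-title">Third Officer</span> <span class="company">Example Shipping Co.</span></li>
  <li><span class="job-title">Marine Engineer</span> <span class="company">Sample Lines Ltd.</span></li>
</ul>
"""

rows = parse_jobs(SAMPLE_HTML)
buffer = io.StringIO()
write_csv(rows, buffer)
```

In an actual crawler the HTML would come from the network, for example via `urllib.request.urlopen`, and the stream would be a file opened with `newline=""`. The sketch keeps both in memory so that it runs without network access.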
