Research on Information Collection Method of Shipping Job Hunting Based on Web Crawler

Dongcheng Peng, Tie-shan Li, Yang Wang, C. L. Philip Chen
DOI: 10.1109/ICIST.2018.8426183 (https://doi.org/10.1109/ICIST.2018.8426183)
Published in: 2018 Eighth International Conference on Information Science and Technology (ICIST)
Publication date: 2018-06-01
Citations: 11

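Abstract

In recent years, with the rapid development of artificial intelligence, big data, and cloud computing, the volume of information on the Internet has grown sharply, and obtaining target information efficiently has become a pressing problem. This paper addresses the collection of shipping job-hunting information from the web. Two web-crawler-based collection methods are proposed. Crawler programs based on the Python standard libraries and on the Scrapy crawling framework are designed and implemented to scrape the target information from the target website and store the collected data in a local file. A comparison of the amount of data crawled and the time consumed shows that the Scrapy-based method is simple to operate, easily extensible, well targeted, and fast and efficient at collecting shipping job-hunting information. The collected data can support subsequent data-mining analysis and provide data for a future shipping job-hunting information database.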
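The abstract does not show either crawler, so the following is only a minimal sketch of what a standard-library approach could look like. Everything in it is an assumption made for illustration: the page markup (`li class="job"` with `title` and `company` spans), the names `JobListingParser`, `parse_jobs`, and `write_csv`, and the made-up sample postings. It parses an in-memory HTML string rather than fetching a real site; in practice the HTML could be downloaded first, for example with `urllib.request.urlopen`.

```python
import csv
import io
from html.parser import HTMLParser


class JobListingParser(HTMLParser):
    """Collect title/company pairs from a hypothetical job-listing page.

    Assumed markup (not taken from the paper): each posting looks like
    <li class="job"><span class="title">..</span><span class="company">..</span></li>
    """

    def __init__(self):
        super().__init__()
        self.rows = []        # completed postings
        self._current = None  # posting currently being assembled
        self._field = None    # name of the field whose text is being read

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "li" and cls == "job":
            self._current = {"title": "", "company": ""}
        elif tag == "span" and self._current is not None and cls in self._current:
            self._field = cls

    def handle_data(self, data):
        # Only keep text that sits inside a recognized field of a posting.
        if self._current is not None and self._field is not None:
            self._current[self._field] += data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current is not None:
            self.rows.append(self._current)
            self._current = None


def parse_jobs(html):
    """Return a list of {'title', 'company'} dicts found in the HTML text."""
    parser = JobListingParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def write_csv(rows, stream):
    """Write the postings to any text stream (a file or an in-memory buffer)."""
    writer = csv.DictWriter(stream, fieldnames=["title", "company"])
    writer.writeheader()
    writer.writerows(rows)


# Made-up sample page used in place of a real target website.
SAMPLE_HTML = """
<ul>
  <li class="job"><span class="title">Second Officer</span><span class="company">Example Shipping Co.</span></li>
  <li class="job"><span class="title">Marine Engineer</span><span class="company">Sample Lines</span></li>
</ul>
"""

if __name__ == "__main__":
    buffer = io.StringIO()
    write_csv(parse_jobs(SAMPLE_HTML), buffer)
    print(buffer.getvalue())
```

To store the results in a local file, as the abstract describes, `write_csv` can be given a file opened with `open("jobs.csv", "w", newline="", encoding="utf-8")` instead of the in-memory buffer. A framework such as Scrapy takes over the request scheduling, parsing hooks, and export steps that this sketch handles by hand, which is consistent with the ease of use and extensibility the abstract attributes to it.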