Authors: Junhua Liu, Yung Chuen Ng, Kwan Hui Lim
Venue: CSCW '20 Companion: Conference Companion Publication of the 2020 Conference on Computer Supported Cooperative Work and Social Computing, October 17-21, 2020, Virtual Event, USA
Published: 2020-10-17
DOI: https://doi.org/10.1145/3406865.3418329
IPOD: A Large-scale Industrial and Professional Occupation Dataset
In today's job market, occupational data mining and analysis is growing in importance because it enables companies to predict employee turnover, model career trajectories, screen resumes and perform other human resource tasks. Interest in these techniques has grown accordingly, and they all depend on the availability of an occupation-related dataset. However, most studies either use proprietary datasets or do not make their datasets publicly available, which impedes progress in this area. To address this issue, we present the Industrial and Professional Occupation Dataset (IPOD), which comprises 475,073 job titles belonging to 192,295 LinkedIn users. In addition to making IPOD publicly available, we also: (i) manually annotate each job title with its associated level of seniority, domain of work and location; and (ii) provide embeddings for job titles and discuss various use cases. The dataset is publicly available at https://github.com/junhua/ipod.
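
To make the annotation scheme in (i) concrete, the following is a minimal sketch of how one annotated job title could be represented and queried. Everything in it is an assumption for illustration: the class name `AnnotatedTitle`, the label strings and the sample title are hypothetical and are not taken from the IPOD files, whose actual format and tag set should be checked in the repository.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical label names for the three annotated aspects named in the
# abstract; the dataset's real tag set may differ.
SENIORITY, DOMAIN, LOCATION, OTHER = "SENIORITY", "DOMAIN", "LOCATION", "O"


@dataclass
class AnnotatedTitle:
    """One job title split into tokens, with one label per token."""

    tokens: List[str]
    labels: List[str]

    def __post_init__(self) -> None:
        # Reject records where tokens and labels cannot be paired up.
        if len(self.tokens) != len(self.labels):
            raise ValueError("tokens and labels must have the same length")

    def group_by_label(self) -> Dict[str, List[str]]:
        """Collect tokens under their label, skipping unlabelled tokens."""
        groups: Dict[str, List[str]] = defaultdict(list)
        for token, label in zip(self.tokens, self.labels):
            if label != OTHER:
                groups[label].append(token)
        return dict(groups)


# Made-up example title, not a record from IPOD.
example = AnnotatedTitle(
    tokens=["Senior", "Software", "Engineer", ",", "Singapore"],
    labels=[SENIORITY, DOMAIN, DOMAIN, OTHER, LOCATION],
)
grouped = example.group_by_label()
print(grouped)
```

Keeping one label per token, rather than one label per title, lets a single title carry seniority, domain and location information at the same time, which is the situation the abstract describes.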