Topic modelling applied on innovation studies of Flemish companies

Journal of Business Analytics (IF 1.7; JCR Q3, Computer Science, Information Systems). Pub Date: 2023-03-03. DOI: 10.1080/2573234X.2023.2186274
Annelien Crijns, Victor Vanhullebusch, Manon Reusens, Michael Reusens, B. Baesens
Pages: 243-254
Citations: 1

Abstract

Mapping innovation in companies for the purpose of official statistics is usually done through business surveys. However, this traditional approach has several drawbacks, such as a lack of responses, response bias, low frequency, and high costs. As an alternative, text-based models trained on web-scraped text from company websites have been developed to complement or replace traditional business surveys. This paper uses web scraping and text-based models to map business innovation in Flanders, with a focus on identifying different types of innovation through topic modelling. More specifically, the scraped web texts are used to identify innovative economic sectors or topics, and to classify firms into these topics using Top2Vec and Lbl2Vec. We conclude that the two models can be successfully combined to discover topics (or sectors) and to classify companies into these topics, which yields an additional parameter for mapping innovation in different regions.
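The classification step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline and does not call the Top2Vec or Lbl2Vec libraries: it assumes toy word-count vectors in place of learned document embeddings, and the topic names, keywords, and company texts are all hypothetical. It only shows the general idea of assigning each scraped text to the topic whose keyword vector it is most similar to.

```python
import math
from collections import Counter


def embed(text, vocab):
    """Toy embedding (assumption): word counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocab]


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(text, topic_keywords, vocab):
    """Return the topic whose keyword vector is closest to the text vector."""
    text_vec = embed(text, vocab)
    scores = {
        topic: cosine(text_vec, embed(" ".join(keywords), vocab))
        for topic, keywords in topic_keywords.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical topics and keywords, standing in for discovered topics.
topic_keywords = {
    "biotech": ["vaccine", "clinical", "protein"],
    "software": ["cloud", "platform", "api"],
}
vocab = sorted({w for kws in topic_keywords.values() for w in kws})

# Hypothetical scraped website texts.
company_texts = {
    "company_a": "our cloud platform exposes an api for logistics",
    "company_b": "we develop protein based vaccine candidates for clinical trials",
}

labels = {
    name: classify(text, topic_keywords, vocab)[0]
    for name, text in company_texts.items()
}
print(labels)
```

In practice, learned embeddings would replace the count vectors, since word counts cannot match a text to a topic unless the exact keywords appear in it.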
Source journal: Journal of Business Analytics (Business, Management and Accounting: Management Information Systems)
CiteScore: 2.50
Self-citation rate: 0.00%
Articles published: 13