Unlocking NACE Classification Embeddings with OpenAI for Enhanced Analysis and Processing

Andrea Vidali, Nicola Jean, Giacomo Le Pera
Published in arXiv - ECON - General Economics, 2024-09-17. DOI: arxiv-2409.11524 (https://doi.org/arxiv-2409.11524)

Abstract

The Statistical Classification of Economic Activities in the European Community (NACE) is the standard system for classifying economic and industrial activities within the European Union. This paper proposes a novel approach to transform the NACE classification into low-dimensional embeddings, using state-of-the-art models and dimensionality reduction techniques. The primary challenge is preserving the hierarchical structure of the original NACE classification while reducing the number of dimensions. To address this, we introduce custom metrics that quantify how well hierarchical relationships are retained through the embedding and reduction steps. Evaluating these metrics shows that the proposed methodology retains the structural information essential for analysis. The approach facilitates visual exploration of relationships between economic activities and improves downstream tasks such as clustering, classification, and integration with other classifications. Experimental validation demonstrates that the framework preserves hierarchical structure within the NACE classification, giving researchers and policymakers a tool for understanding and leveraging hierarchical data.
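The abstract does not spell out the custom metrics, so the following is only a minimal sketch of one way such a hierarchy-retention score could be computed, not the authors' method. It assumes NACE class codes of the form "01.11" whose digit prefixes give division, group, and class, and it uses hand-made 2-D vectors as stand-ins for real embeddings. The names `ancestors`, `tree_distance`, and `triplet_agreement` are hypothetical.

```python
from itertools import permutations
from math import dist


def ancestors(code: str) -> list[str]:
    """Prefix chain of a code: '01.11' -> ['01', '011', '0111'] (assumed layout)."""
    digits = code.replace(".", "")
    return [digits[:n] for n in range(2, len(digits) + 1)]


def tree_distance(code_a: str, code_b: str) -> int:
    """Number of hops between two codes through their deepest shared prefix."""
    path_a, path_b = ancestors(code_a), ancestors(code_b)
    shared = 0
    for node_a, node_b in zip(path_a, path_b):
        if node_a != node_b:
            break
        shared += 1
    return (len(path_a) - shared) + (len(path_b) - shared)


def triplet_agreement(embeddings: dict[str, tuple[float, ...]]) -> float:
    """Share of triplets (a, b, c) in which b is closer to a than c is in the
    hierarchy AND also closer in embedding space (Euclidean distance)."""
    agree = total = 0
    for a, b, c in permutations(embeddings, 3):
        if tree_distance(a, b) < tree_distance(a, c):
            total += 1
            if dist(embeddings[a], embeddings[b]) < dist(embeddings[a], embeddings[c]):
                agree += 1
    return agree / total if total else float("nan")


# Hypothetical 2-D vectors, standing in for reduced embeddings.
toy = {
    "01.11": (0.0, 0.0),
    "01.12": (0.1, 0.0),
    "01.21": (0.5, 0.2),
    "10.11": (3.0, 3.0),
}
# Same codes with two positions swapped, which breaks the hierarchy.
scrambled = dict(toy, **{"01.12": toy["10.11"], "10.11": toy["01.12"]})

score_toy = triplet_agreement(toy)
score_scrambled = triplet_agreement(scrambled)
print(score_toy, score_scrambled)
```

A score of 1.0 means every hierarchical ordering is respected by the embedding distances. Computing the score once on the full-dimensional embeddings and again after reduction would indicate how much hierarchical structure the reduction step loses, under the assumptions stated above.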