Unlocking NACE Classification Embeddings with OpenAI for Enhanced Analysis and Processing
Andrea Vidali, Nicola Jean, Giacomo Le Pera
arXiv - ECON - General Economics, 2024-09-17
DOI: arxiv-2409.11524
Citations: 0
Abstract
The Statistical Classification of Economic Activities in the European
Community (NACE) is the standard classification system for the categorization
of economic and industrial activities within the European Union. This paper
proposes a novel approach for transforming the NACE classification into
low-dimensional embeddings, using state-of-the-art models and dimensionality
reduction techniques. The primary challenge is the preservation of the
hierarchical structure inherent within the original NACE classification while
reducing the number of dimensions. To address this issue, we introduce custom
metrics designed to quantify the retention of hierarchical relationships
throughout the embedding and reduction processes. The evaluation of these
metrics demonstrates the effectiveness of the proposed methodology in retaining
the structural information essential for insightful analysis. This approach not
only facilitates the visual exploration of economic activity relationships, but
also increases the efficacy of downstream tasks, including clustering,
classification, and integration with other classifications. Experimental
validation shows that the proposed framework preserves hierarchical structures
within the NACE classification, providing a valuable tool for researchers and
policymakers to understand and leverage hierarchical data.
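
To make the idea of quantifying hierarchy retention concrete, the following is a minimal sketch and not the metrics defined in the paper. It assumes synthetic vectors in place of real OpenAI embeddings, a plain SVD-based PCA as the reduction step, and a hypothetical score, `sibling_separation`, that compares distances between codes sharing a parent with distances between codes under different parents. The NACE-style codes are illustrative only.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto their top principal components (SVD-based PCA)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T


def sibling_separation(X, parents):
    """Hypothetical hierarchy score (an assumption, not the paper's metric).

    Returns mean distance between codes with different parents divided by
    mean distance between codes with the same parent. Values above 1 mean
    siblings sit closer together than unrelated codes.
    """
    parents = np.asarray(parents)
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    same_parent = parents[:, None] == parents[None, :]
    off_diagonal = ~np.eye(len(X), dtype=bool)
    intra = dist[same_parent & off_diagonal].mean()
    inter = dist[~same_parent].mean()
    return inter / intra


# Synthetic stand-in for embeddings: each code is its parent's centre plus noise.
rng = np.random.default_rng(0)
codes = ["01.11", "01.12", "01.13", "10.11", "10.12", "10.13",
         "62.01", "62.02", "62.03"]
parents = [code.split(".")[0] for code in codes]  # division = first two digits
centres = {p: rng.normal(size=64) for p in sorted(set(parents))}
high_dim = np.stack([centres[p] + 0.1 * rng.normal(size=64) for p in parents])

low_dim = pca_reduce(high_dim, n_components=2)

score_high = sibling_separation(high_dim, parents)
score_low = sibling_separation(low_dim, parents)
print(f"separation in 64 dimensions: {score_high:.2f}")
print(f"separation in 2 dimensions:  {score_low:.2f}")
```

Comparing the score before and after reduction gives a rough indication of whether the parent-child grouping survives the projection. With real embeddings, the same comparison could be repeated at each level of the hierarchy (section, division, group, class).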