Using Transformer-based Models for Taxonomy Enrichment and Sentence Classification

Parag Dakle, Shrikumar Patil, Sai Krishna Rallabandi, Chaitra V. Hegde, Preethi Raghavan
Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP)
DOI: 10.18653/v1/2022.finnlp-1.34 (https://doi.org/10.18653/v1/2022.finnlp-1.34)

Abstract

In this paper, we present a system that addresses two problems posed by FinSim4-ESG, a shared task of the FinNLP workshop at IJCAI-2022: taxonomy enrichment for Environment, Social and Governance (ESG) issues in the financial domain, and classification of sentences as sustainable or unsustainable. For taxonomy enrichment, we first created a derived dataset by applying a sentence-BERT-based paraphrase detector (Reimers and Gurevych, 2019) to the train set to create positive and negative term-concept pairs. We then fine-tuned the same paraphrase detector on this derived dataset and used it as the encoder, with a Logistic Regression classifier as the decoder, obtaining a test accuracy of 0.6 and an average rank of 1.97. For the sentence classification task, the best-performing classifier (accuracy: 0.92) consists of a pre-trained RoBERTa model (Liu et al., 2019a) as the encoder and a Feed Forward Neural Network classifier as the decoder.
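
The term-concept pairing and the two reported metrics for taxonomy enrichment can be illustrated with a minimal sketch. It is not the authors' implementation. The character-trigram `similarity` function is a dependency-free stand-in for the sentence-BERT paraphrase score, and the rule for choosing negatives (the `n_neg` most similar wrong concepts) is an assumption, since the abstract does not say how pairs were selected. The toy terms and concepts are invented for illustration, and the average rank is taken here to be the mean 1-based position of the gold concept in the ranked list.

```python
from typing import Callable, Dict, List, Tuple


def trigrams(text: str) -> set:
    """Character trigrams of a lower-cased, padded string."""
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a: str, b: str) -> float:
    """Jaccard overlap of trigrams. Stand-in (assumption) for the
    similarity score a sentence-BERT paraphrase detector would give."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


def make_pairs(train: Dict[str, str], concepts: List[str],
               n_neg: int = 2) -> List[Tuple[str, str, int]]:
    """Build (term, concept, label) pairs from a term -> gold concept map.
    The gold concept is the positive; the n_neg most similar wrong
    concepts are used as negatives (hypothetical selection rule)."""
    pairs = []
    for term, gold in train.items():
        pairs.append((term, gold, 1))
        wrong = [c for c in concepts if c != gold]
        wrong.sort(key=lambda c: similarity(term, c), reverse=True)
        pairs.extend((term, c, 0) for c in wrong[:n_neg])
    return pairs


def rank_concepts(term: str, concepts: List[str],
                  score: Callable[[str, str], float] = similarity) -> List[str]:
    """Order candidate concepts from most to least likely for a term."""
    return sorted(concepts, key=lambda c: score(term, c), reverse=True)


def evaluate(gold: Dict[str, str], concepts: List[str],
             score: Callable[[str, str], float] = similarity) -> Tuple[float, float]:
    """Return (accuracy, average rank of the gold concept, 1-based)."""
    hits, ranks = 0, []
    for term, label in gold.items():
        ranked = rank_concepts(term, concepts, score)
        hits += ranked[0] == label
        ranks.append(ranked.index(label) + 1)
    return hits / len(gold), sum(ranks) / len(ranks)


# Toy data, invented for illustration only.
concepts = ["Energy efficiency", "Waste management", "Board diversity"]
train = {
    "Energy consumption": "Energy efficiency",
    "Hazardous waste": "Waste management",
    "Board gender diversity": "Board diversity",
}

pairs = make_pairs(train, concepts)
acc, mean_rank = evaluate(train, concepts)
print(len(pairs), acc, mean_rank)
```

In a system like the one described, `similarity` would be replaced by the fine-tuned encoder followed by the Logistic Regression classifier, and `pairs` would serve as the fine-tuning data; the ranking and metric code would stay the same.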