aiai at the FinSBD-3 task: Structure Boundary Detection of Noisy Financial Texts in English and French Using Data Augmentation and Hybrid Deep Learning Model
Authors: Ke Tian, Hua Chen
Published in: Companion Proceedings of the Web Conference 2021
Publication date: 2021-04-19
DOI: 10.1145/3442442.3451380 (https://doi.org/10.1145/3442442.3451380)
Citations: 0
Abstract
Both authors contributed equally to this research. This paper presents our method for the FinSBD-3 shared task (structure boundary detection), which aims to extract the boundaries of sentences, lists, and items, as well as structure elements such as footers, headers, and tables, from noisy unstructured English and French financial texts. We propose a hybrid deep learning model that combines a deep attention model, based on word embeddings and trained with data augmentation, with a BERT model. The hybrid model detects the boundaries of sentences, list-items, footers, headers, and tables in noisy English and French texts, and the deep attention model classifies the list-item sentences into lists and different item types. The experiments show that the proposed method could be an effective solution for the FinSBD-3 shared task. The submitted result ranks first on the final leaderboard according to the task metrics.
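To make the notion of boundary detection concrete, the following is a minimal sketch of how per-token predictions could be turned into sentence boundaries. It is not the authors' implementation: the label scheme ("B" for a begin token, "E" for an end token, "O" for anything else), the function name spans_from_labels, and the example tokens are all assumptions chosen for illustration, and the labels are hard-coded where a trained model would normally supply them.

```python
from typing import List, Tuple


def spans_from_labels(labels: List[str]) -> List[Tuple[int, int]]:
    """Pair each 'B' (begin) label with the next 'E' (end) label.

    Hypothetical label scheme, assumed for illustration only:
    'B' = first token of a segment, 'E' = last token, 'O' = other.
    Returns inclusive (start, end) token indices.
    """
    spans: List[Tuple[int, int]] = []
    start = None
    for i, label in enumerate(labels):
        if label == "B":
            # A new 'B' before any 'E' replaces the earlier, unclosed start.
            start = i
        elif label == "E" and start is not None:
            spans.append((start, i))
            start = None
    return spans


# Toy input: the stray "12" stands in for page-number noise between sentences.
tokens = ["Fees", "apply", ".", "12", "The", "fund", "invests", "."]
labels = ["B", "O", "E", "O", "B", "O", "O", "E"]

spans = spans_from_labels(labels)
sentences = [" ".join(tokens[s:e + 1]) for s, e in spans]
```

In this sketch the noise token falls outside every span, so it is excluded from the recovered sentences. A single-token segment, which would need both a begin and an end label on the same token, is not handled by this simplified scheme.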