Luciano Barbosa, P. R. Cavalin, Victor Guimarães, Matthias Kormaksson
Blue Man Group no ASSIN: Usando Representações Distribuídas para Similaridade Semântica e Inferência Textual (Blue Man Group at ASSIN: Using Distributed Representations for Semantic Similarity and Textual Inference)
Linguamatica, pages 15-22, published 2016-12-31. DOI: 10.21814/LM.8.2.231 (https://doi.org/10.21814/LM.8.2.231)
Citations: 8
Abstract
In this paper, we present the methodology and the results obtained by our team, dubbed Blue Man Group, in the ASSIN competition (from the Portuguese Avaliação de Similaridade Semântica e Inferência Textual), held at PROPOR 2016. Our strategy was to evaluate methods based on semantic word vectors, following two distinct directions: 1) low-dimensional, compact feature sets, and 2) deep learning-based strategies that deal with high-dimensional feature vectors. Evaluation showed the first strategy to be more promising, so we discarded the results of the second. Considering the best run of each of the six participating teams, we achieved the best accuracy and F1 values in entailment recognition on the Brazilian Portuguese set, and the best F1 score when the European Portuguese set is also considered. In the semantic similarity task, our team ranked second on the Brazilian Portuguese set and third considering both sets.
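The abstract does not list the features in the compact set, so the following is only a minimal sketch of what a low-dimensional feature vector built from word vectors could look like, not the team's actual method. It assumes two illustrative features (cosine similarity between averaged word vectors, and token overlap) and uses hand-made toy vectors; the names `TOY_VECTORS`, `sentence_vector`, and `pair_features` are hypothetical.

```python
import math

# Toy 3-dimensional "word vectors" with made-up values (assumption:
# a real system would load trained embeddings instead).
TOY_VECTORS = {
    "o": [0.1, 0.0, 0.1],
    "menino": [0.9, 0.1, 0.0],
    "garoto": [0.8, 0.2, 0.1],
    "corre": [0.0, 0.9, 0.1],
    "dorme": [0.0, 0.1, 0.9],
}


def sentence_vector(tokens, vectors):
    """Average the vectors of the known tokens; None if no token is known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pair_features(sentence_a, sentence_b, vectors):
    """Return a compact [cosine, token_overlap] feature vector for a pair."""
    tokens_a = sentence_a.lower().split()
    tokens_b = sentence_b.lower().split()
    vec_a = sentence_vector(tokens_a, vectors)
    vec_b = sentence_vector(tokens_b, vectors)
    cos = cosine(vec_a, vec_b) if vec_a is not None and vec_b is not None else 0.0
    set_a, set_b = set(tokens_a), set(tokens_b)
    union = set_a | set_b
    # Jaccard overlap of the two token sets.
    overlap = len(set_a & set_b) / len(union) if union else 0.0
    return [cos, overlap]


if __name__ == "__main__":
    print(pair_features("o menino corre", "o garoto corre", TOY_VECTORS))
    print(pair_features("o menino corre", "o menino dorme", TOY_VECTORS))
```

In a setup like this, the short feature vector for each sentence pair would be passed to a standard regressor (for the similarity score) or classifier (for entailment); which learners and features the team actually used is not stated in the abstract.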