Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020: Latest Articles

Creativity Embedding: A Vector to Characterise and Classify Plausible Triples in Deep Learning NLP Models

Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020

Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8768

Isabeau Oliveri, Luca Ardito, Giuseppe Rizzo, M. Morisio

In this paper we define the creativity embedding of a text based on four self-assessment creativity metrics (diversity, novelty, serendipity and magnitude), knowledge graphs, and neural networks. We use the triple (head, relation, tail) as the basic unit. We investigate whether additional information about creativity improves natural language processing tasks. In this work, we focus on the triple plausibility task, using the BERT model and a sample of the WordNet11 dataset. Contrary to our hypothesis, we do not detect an increase in performance.

Citations: 0

L'impatto emotivo della comunicazione istituzionale durante la pandemia di COVID-19: uno studio di Twitter Sentiment Analysis [The emotional impact of institutional communication during the COVID-19 pandemic: a Twitter Sentiment Analysis study]

Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020

Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8575

Gloria Gagliardi, Lorenzo Gregori, Alice Suozzi

This paper investigates the impact of institutional communications during the health crisis caused by the Covid-19 pandemic in Italy, through the analysis of micro-blogging activity on Twitter by means of NLP techniques. We performed a Sentiment Analysis on the TWITA corpus to pinpoint potential correlations between the opinion polarity (positive or negative) of users and public speeches during the outbreak. Our findings show changes in sentiment polarity related to three institutional speeches delivered by the Italian Prime Minister Giuseppe Conte on March 4, March 9, and April 26, 2020. Copyright © 2020 for this paper by its authors.

Citations: 1

Predicting Movie-elicited Emotions from Dialogue in Screenplay Text: A Study on "Forrest Gump"

Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020

Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8610

Benedetta Iavarone, F. Dell’Orletta

We present a new dataset of sentences extracted from the movie Forrest Gump, annotated with the emotions perceived by a group of subjects while watching the movie. We run experiments to predict these emotions using two classifiers, one based on a Support Vector Machine with linguistic and lexical features, the other based on BERT. The experiments showed that contextual embeddings are effective in predicting human-perceived emotions. Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Citations: 0

Hate Speech Detection with Machine-Translated Data: The Role of Annotation Scheme, Class Imbalance and Undersampling

Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020

Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8345

Camilla Casula, Sara Tonelli

While using machine-translated data for supervised training can alleviate data sparseness problems when dealing with less-resourced languages, it is important that the source data are not only correctly translated, but also follow the same annotation scheme and possibly the same class balance as the smaller dataset in the target language. We therefore present an evaluation of hate speech detection in Italian using machine-translated data from English and compare three settings, in order to understand the impact of training size, class distribution and annotation scheme.

Citations: 3

Automatic Induction of FrameNet lexical units in Italian

Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020

Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8300

Silvia Brambilla, D. Croce, F. Tamburini, R. Basili

In this paper we investigate the applicability of automatic methods for frame induction to improve the coverage of IFrameNet, a novel lexical resource based on Frame Semantics in Italian. The experimental evaluations show that the adopted methods based on neural word embeddings pave the way for the assisted development of a large scale lexical resource for

Citations: 1

UDante: First Steps Towards the Universal Dependencies Treebank of Dante's Latin Works

Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020

Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8653

F. M. Cecchini, R. Sprugnoli, Giovanni Moretti, M. Passarotti

This paper presents the early stages of the development of a new treebank containing all of Dante Alighieri’s Latin works. In particular, it describes the conversion of the original TEI-XML files to CoNLL-U, the creation of a gold standard, the process of training four annotators and the evaluation of the syntactic annotation in terms of inter-annotator agreement and LA, UAS and LAS. The aim is to release a new resource, in view of the celebrations for the 700th anniversary of Dante’s death, which can support the development of the Vocabolario Dantesco.

Citations: 18

Valutazione umana di DeepL a livello di frase per le traduzioni di testi specialistici dall'inglese verso l'italiano [Sentence-level human evaluation of DeepL for translations of specialized texts from English into Italian]

Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020

Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8924

Mirko Tavosanis, S. Papa

The paper presents an evaluation of the performance of DeepL in the translation of specialized texts from English to Italian. The evaluation was carried out at sentence level, on a sample of 108 sentences taken from texts relating to the environment, energy, bio-medicine and drug science, and the translations produced were evaluated by translators in training with disciplinary skills. The translation by DeepL was statistically rated at the same level as human translation in terms of adequacy and slightly lower in terms of fluency. Machine translation of these texts also received a higher score than that obtained in another analysis, carried out in a similar way, for machine translation of journalistic texts. Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). Authors' note: the text was conceived jointly by the authors, but for the purpose of dividing the work it is declared that Sirio Papa wrote sections 4, 7 and 8 and Mirko Tavosanis wrote the remaining sections.

Citations: 1

The AEREST Reading Database

Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020

Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8558

Marcello Ferro, Sara Giulivi, Claudia Cappa

Aerest is a reading assessment protocol for the concurrent evaluation of a child’s decoding and comprehension skills. Reading data complying with the Aerest protocol were automatically collected and structured with the ReadLet web-based platform in a pilot study, to form the Aerest Reading Database. The content, structure and potential of the database are described here, together with the main directions of current and future developments.

Citations: 0

Multiword Expressions We Live by: A Validated Usage-based Dataset from Corpora of Written Italian

Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020

Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8710

F. Masini, M. Micheli, Andrea Zaninello, S. Castagnoli, M. Nissim

The paper describes the creation of a manually validated dataset of Italian multiword expressions, building on candidates automatically extracted from corpora of written Italian. The main features of the resource, such as POS-pattern and lemma distribution, are also discussed, together with possible applications.

Citations: 0

Distributional Semantics: Yesterday, Today, and Tomorrow

Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020

Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.9030

Alessandro Lenci

Distributional semantics is undoubtedly the mainstream approach to meaning representation in computational linguistics today. It has also become an important paradigm of semantic analysis in cognitive science, and even linguists have started looking at it with growing interest. The popularity of distributional semantics has boomed in the era of Deep Learning, when “word embeddings” have become the basic ingredient to “cook” any NLP task. The era of BERT & co. has brought new types of contextualized representations that have often generated hasty claims of incredible breakthroughs in the natural language understanding capability of deep learning models. Unfortunately, these claims are not always supported by the improved semantic abilities of the last generation of embeddings. Models like BERT are still rooted in the principles of distributional learning, but at the same time their goal is more ambitious than generating corpus-based representations of meaning. On the one hand, the embeddings they produce encode much more than lexical meaning, but on the other hand we are still largely uncertain about what semantic properties of natural language they actually capture. Distributional semantics has surely benefited from the successes of deep learning, but this might even jeopardize the very essence of distributional models of meaning, by making their goals and foundations unclear.

Citations: 0
