Pub Date: 1900-01-01
DOI: 10.4000/books.aaccademia.8768
Isabeau Oliveri, Luca Ardito, Giuseppe Rizzo, M. Morisio
In this paper we define the creativity embedding of a text based on four self-assessment creativity metrics (diversity, novelty, serendipity and magnitude), knowledge graphs, and neural networks. We use the triple (head, relation, tail) as the basic unit. We investigate whether additional information about creativity improves natural language processing tasks. In this work, we focus on the triple plausibility task, using the BERT model and a sample of the WordNet11 dataset. Contrary to our hypothesis, we do not detect any increase in performance.
{"title":"Creativity Embedding: A Vector to Characterise and Classify Plausible Triples in Deep Learning NLP Models","DOIUrl":"https://doi.org/10.4000/books.aaccademia.8768","url":null,"PeriodicalId":300279,"journal":{"name":"Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020","volume":"122 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114518887","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1900-01-01
DOI: 10.4000/books.aaccademia.8345
Camilla Casula, Sara Tonelli
While using machine-translated data for supervised training can alleviate data sparseness problems when dealing with less-resourced languages, it is important that the source data are not only correctly translated, but also follow the same annotation scheme and, where possible, the same class balance as the smaller dataset in the target language. We therefore present an evaluation of hate speech detection in Italian using data machine-translated from English, comparing three settings in order to understand the impact of training size, class distribution and annotation scheme.
{"title":"Hate Speech Detection with Machine-Translated Data: The Role of Annotation Scheme, Class Imbalance and Undersampling","DOIUrl":"https://doi.org/10.4000/books.aaccademia.8345","url":null,"PeriodicalId":300279,"journal":{"name":"Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130934947","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1900-01-01
DOI: 10.4000/books.aaccademia.8300
Silvia Brambilla, D. Croce, F. Tamburini, R. Basili
In this paper we investigate the applicability of automatic methods for frame induction to improve the coverage of IFrameNet, a novel lexical resource based on Frame Semantics in Italian. The experimental evaluations show that the adopted methods based on neural word embeddings pave the way for the assisted development of a large scale lexical resource for
{"title":"Automatic Induction of FrameNet lexical units in Italian","DOIUrl":"https://doi.org/10.4000/books.aaccademia.8300","url":null,"PeriodicalId":300279,"journal":{"name":"Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131322863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1900-01-01
DOI: 10.4000/books.aaccademia.8653
F. M. Cecchini, R. Sprugnoli, Giovanni Moretti, M. Passarotti
This paper presents the early stages of the development of a new treebank containing all of Dante Alighieri’s Latin works. In particular, it describes the conversion of the original TEI-XML files to CoNLL-U, the creation of a gold standard, the training of four annotators, and the evaluation of the syntactic annotation in terms of inter-annotator agreement and of label accuracy (LA), unlabeled attachment score (UAS) and labeled attachment score (LAS). The aim is to release a new resource, in view of the celebrations for the 700th anniversary of Dante’s death, which can support the development of the Vocabolario Dantesco.
{"title":"UDante: First Steps Towards the Universal Dependencies Treebank of Dante's Latin Works","DOIUrl":"https://doi.org/10.4000/books.aaccademia.8653","url":null,"PeriodicalId":300279,"journal":{"name":"Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020","volume":"39 7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123272858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1900-01-01
DOI: 10.4000/books.aaccademia.8558
Marcello Ferro, Sara Giulivi, Claudia Cappa
Aerest is a reading assessment protocol for the concurrent evaluation of a child’s decoding and comprehension skills. Reading data complying with the Aerest protocol were automatically collected and structured with the ReadLet web-based platform in a pilot study, to form the Aerest Reading Database. The content, structure and potential of the database are described here, together with the main directions of current and future developments.
{"title":"The AEREST Reading Database","DOIUrl":"https://doi.org/10.4000/books.aaccademia.8558","url":null,"PeriodicalId":300279,"journal":{"name":"Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133503911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1900-01-01
DOI: 10.4000/books.aaccademia.8710
F. Masini, M. Micheli, Andrea Zaninello, S. Castagnoli, M. Nissim
The paper describes the creation of a manually validated dataset of Italian multiword expressions, building on candidates automatically extracted from corpora of written Italian. The main features of the resource, such as POS-pattern and lemma distribution, are also discussed, together with possible applications.
{"title":"Multiword Expressions We Live by: A Validated Usage-based Dataset from Corpora of Written Italian","DOIUrl":"https://doi.org/10.4000/books.aaccademia.8710","url":null,"PeriodicalId":300279,"journal":{"name":"Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020","volume":"64 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132187094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1900-01-01
DOI: 10.4000/books.aaccademia.8889
Giulia Speranza, Raffaele Manna, M. Buono, J. Monti
In this paper, we present the Archaeo-Term Project, along with one of its first efforts to enhance multilingual access to archaeological data: a resource of archaeological terms made available within the framework of the YourTerm CULT project. To promote the use of a common terminological ground across languages, the Archaeo-Term multilingual glossary is intended for scholars, experts in the field, translators and the general public. Its first release contains terms in Italian, English, German, Spanish and Dutch, together with PoS, definitions and other linguistic information. This paper presents the data and the methodology adopted to create the glossary, as well as the evaluation of the first results.
{"title":"The Archaeo-Term Project: Multilingual Terminology in Archaeology","DOIUrl":"https://doi.org/10.4000/books.aaccademia.8889","url":null,"PeriodicalId":300279,"journal":{"name":"Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122406824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}