Pub Date: 2022-01-01 DOI: 10.18653/v1/2022.case-1.26
Diogo Fernandes Costa Silva, A. Junior, Gabriel Marques, A. Soares, A. R. G. Filho
This paper summarizes our work on the document classification subtask of Multilingual Protest News Detection at the CASE @ ACL-IJCNLP 2022 workshop. We investigate the performance of monolingual and multilingual transformer-based models in low-resource settings, taking Portuguese as an example and evaluating language models on document classification. Our approach was the winning solution for Portuguese document classification, achieving an F1 score of 0.8007 on the test set. The experimental results show that multilingual models achieve the best results when few samples are available in a specific language, because they can be trained on datasets in other languages from the same task and domain.
{"title":"CEIA-NLP at CASE 2022 Task 1: Protest News Detection for Portuguese","journal":{"name":"The Case manager","volume":"16 1","pages":"184-188"},"publicationDate":"2022-01-01","publicationTypes":"Journal Article"}
Pub Date: 2022-01-01 DOI: 10.18653/v1/2022.case-1.6
Juhyeon Kim, Yesong Choe, Sanghack Lee
Finding causal relations in text is challenging because it requires methods ranging from defining event ontologies to developing suitable algorithmic approaches. In this paper, we develop a framework that classifies whether a given sentence contains a causal event. We exploit an external corpus with causal labels to overcome the small size of the original corpus (Causal News Corpus) provided by the task organizers. Further, we employ a data augmentation technique based on Part-Of-Speech (POS) tags, motivated by our observation that some parts of speech are more (or less) relevant to causality. Our approach especially improved recall in detecting causal events in sentences.
{"title":"SNU-Causality Lab @ Causal News Corpus 2022: Detecting Causality by Data Augmentation via Part-of-Speech tagging","journal":{"name":"The Case manager","volume":"25 1","pages":"44-49"},"publicationDate":"2022-01-01","publicationTypes":"Journal Article"}
Pub Date: 2022-01-01 DOI: 10.18653/v1/2022.case-1.19
Abdul Aziz, Md. Akram Hossain, Abu Nowshed Chy
Identifying cause-effect relationships in sentences is a formidable task in natural language inference and understanding. The diversity of word semantics and sentence structure makes it challenging to determine causal relationships effectively. To address these challenges, CASE-2022 Shared Task 3 introduced a task on event causality identification with the Causal News Corpus. This paper presents our participation in this task, specifically in Subtask 1, the causal event classification task. We propose a unified neural model that exploits two fine-tuned transformer models, RoBERTa and Twitter-RoBERTa. For score fusion, we combine the prediction scores of the component models using a weighted arithmetic mean to generate the probability score for class label identification. The experimental results show that our proposed method achieved the top performance (ranked 1st) among the participants.
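The score fusion step described in this abstract can be illustrated with a minimal sketch. The function names, the example probabilities, and the weight of 0.6 are all hypothetical; the abstract does not state the weights the authors used or how they chose them.

```python
from typing import List, Sequence


def fuse_scores(prob_a: Sequence[float],
                prob_b: Sequence[float],
                weight_a: float = 0.5) -> List[float]:
    """Weighted arithmetic mean of two per-class probability vectors.

    weight_a is the (assumed) weight of the first model; the second
    model receives 1 - weight_a so the weights sum to one.
    """
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    if len(prob_a) != len(prob_b):
        raise ValueError("both models must score the same classes")
    weight_b = 1.0 - weight_a
    return [weight_a * a + weight_b * b for a, b in zip(prob_a, prob_b)]


def predict_label(fused: Sequence[float]) -> int:
    """Return the index of the class with the highest fused score."""
    return max(range(len(fused)), key=lambda i: fused[i])


# Hypothetical class probabilities [non-causal, causal] from two models.
roberta_probs = [0.30, 0.70]
twitter_roberta_probs = [0.60, 0.40]

fused = fuse_scores(roberta_probs, twitter_roberta_probs, weight_a=0.6)
label = predict_label(fused)
```

In this toy case the two models disagree, and the larger weight on the first model tips the fused prediction towards its class. In practice the weight would presumably be tuned on held-out data.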
{"title":"CSECU-DSG @ Causal News Corpus 2022: Fusion of RoBERTa Transformers Variants for Causal Event Classification","journal":{"name":"The Case manager","volume":"15 1","pages":"138-142"},"publicationDate":"2022-01-01","publicationTypes":"Journal Article"}
Pub Date: 2022-01-01 DOI: 10.1109/CASE49997.2022.9926501
Diego Tristán-Rodríguez, R. Garrido, E. Mezura-Montes
{"title":"Optimization of a state feedback controller using a PSO algorithm","journal":{"name":"The Case manager","volume":"19 1","pages":"729-734"},"publicationDate":"2022-01-01","publicationTypes":"Journal Article"}
Pub Date: 2022-01-01 DOI: 10.18653/v1/2022.case-1.20
Guneet Singh Kohli, Prabsimran Kaur, Jatin Bedi
Causality (a cause-effect relationship between two arguments) has become integral to various NLP domains such as question answering, summarization, and event prediction. To study causality in detail, Event Causality Identification with Causal News Corpus (CASE-2022) organized shared tasks. This paper describes our participation in Subtask 1, which focuses on classifying event causality. We used sentence-level augmentation based on contextualized word embeddings from DistilBERT to construct new data. We then trained models on this data using two approaches: the first used the DeBERTa language model, and the second used the RoBERTa language model in combination with cross-attention. We obtained the second-best F1 score (0.8610) in the competition with the Contextually Augmented DeBERTa model.
{"title":"ARGUABLY @ Causal News Corpus 2022: Contextually Augmented Language Models for Event Causality Identification","journal":{"name":"The Case manager","volume":"7 1","pages":"143-148"},"publicationDate":"2022-01-01","publicationTypes":"Journal Article"}
Pub Date: 2022-01-01 DOI: 10.18653/v1/2022.case-1.18
Anik Saha, Alex Gittens, Jian Ni, Oktie Hassanzadeh, B. Yener, Kavitha Srinivas
Understanding causal relationships is an important part of natural language processing. We address the causal information extraction problem with different neural models built on top of pre-trained transformer-based language models for identifying Cause, Effect and Signal spans in news data sets. We use the Causal News Corpus Subtask 2 training data set to train span-based and sequence tagging models. Our span-based model, built on pre-trained BERT base weights, achieves an F1 score of 47.48 on the test set with an accuracy score of 36.87, and obtained 3rd place in the Causal News Corpus 2022 shared task.
{"title":"SPOCK @ Causal News Corpus 2022: Cause-Effect-Signal Span Detection Using Span-Based and Sequence Tagging Models","journal":{"name":"The Case manager","volume":"1 1","pages":"133-137"},"publicationDate":"2022-01-01","publicationTypes":"Journal Article"}
Pub Date: 2022-01-01 DOI: 10.18653/v1/2022.case-1.7
H. Adibhatla, Manish Shrivastava
Causality detection and identification is centered on identifying semantic and cognitive connections in a sentence. In this paper, we describe the effort of team LTRC for Causal News Corpus - Event Causality Shared Task 2022 at the 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2022). The shared task consisted of two subtasks: 1) identifying if a sentence contains a causality relation, and 2) identifying spans of text that correspond to cause, effect and signals. We fine-tuned transformer-based models with adapters for both subtasks. Our best-performing models obtained a binary F1 score of 0.853 on held-out data for subtask 1 and a macro F1 score of 0.032 on held-out data for subtask 2. Our approach is ranked third in subtask 1 and fourth in subtask 2. The paper describes our experiments, solutions, and analysis in detail.
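The two metrics reported in this abstract differ in how they aggregate over classes. The sketch below shows the label-level definitions only, on hypothetical toy labels. It is not the shared task's official scorer, and span-level scoring for Subtask 2 involves matching details that the abstract does not describe.

```python
from typing import Hashable, Sequence


def f1_for_class(y_true: Sequence[Hashable],
                 y_pred: Sequence[Hashable],
                 positive: Hashable) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), treating `positive` as the target class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Hypothetical toy labels: 1 = causal, 0 = non-causal.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]

binary = f1_for_class(y_true, y_pred, positive=1)  # F1 of the causal class only
macro = macro_f1(y_true, y_pred)                   # average over both classes
```

Binary F1 scores only the positive (causal) class, while macro F1 gives every class equal weight regardless of how many examples it has, so the two can differ noticeably on imbalanced data.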
{"title":"LTRC @ Causal News Corpus 2022: Extracting and Identifying Causal Elements using Adapters","journal":{"name":"The Case manager","volume":"19 1","pages":"50-55"},"publicationDate":"2022-01-01","publicationTypes":"Journal Article"}
Pub Date: 2022-01-01 DOI: 10.18653/v1/2022.case-1.3
Theresa Krumbiegel, Sophie Decher
We present our submission to Subtask 1 of the CASE-2022 Shared Task 3: Event Causality Identification with Causal News Corpus as part of the 5th Workshop on Challenges and Applications of Automated Extraction of Sociopolitical Events from Text (CASE 2022) (Tan et al., 2022a). The task focuses on causal event classification on the sentence level and involves differentiating between sentences that include a cause-effect relation and sentences that do not. We approached this as a binary text classification task and experimented with multiple training sets augmented with additional linguistic information. Our best model was generated by training roberta-base on a combination of data from both Subtasks 1 and 2 with the addition of named entity annotations. During the development phase we achieved a macro F1 of 0.8641 with this model on the development set provided by the task organizers. When testing the model on the final test data, we achieved a macro F1 of 0.8516.
{"title":"NLP4ITF @ Causal News Corpus 2022: Leveraging Linguistic Information for Event Causality Classification","journal":{"name":"The Case manager","volume":"8 1","pages":"16-20"},"publicationDate":"2022-01-01","publicationTypes":"Journal Article"}
Pub Date: 2022-01-01 DOI: 10.18653/v1/2022.case-1.29
Vanni Zavarella, Hristo Tanev, Ali Hürriyetoǧlu, Peratham Wiriyathammabhum, Bertrand De Longueville
The goal of Shared Task 2 is to evaluate state-of-the-art event detection systems by comparing the spatio-temporal distribution of the events they detect with existing event databases. The task focuses on some usability requirements of event detection systems in real-world scenarios. Namely, it aims to measure the ability of such a system to: (i) detect socio-political event mentions in news and social media, (ii) properly find their geographical locations, (iii) de-duplicate reports extracted from multiple sources referring to the same actual event. Building an annotated corpus for jointly training and evaluating these sub-tasks is highly time-consuming. One possible way to indirectly evaluate a system’s output without an annotated corpus is to measure its correlation with human-curated event data sets. In the last three years, the COVID-19 pandemic motivated restrictions and anti-pandemic measures on a world scale. This has triggered a wave of reactions and citizen actions in many countries. Shared Task 2 challenges participants to identify COVID-19 related protest actions from large unstructured data sources, from both mainstream and social media. We assess each system’s ability to model the evolution of protest events both temporally and spatially by using a number of correlation metrics with respect to a comprehensive and validated data set of COVID-related protest events (Raleigh et al., 2010).
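The idea of evaluating a system by correlating its output with a curated event data set can be sketched as follows. This is an illustration only: the abstract does not name the correlation metrics used, so Pearson correlation over daily event counts is an assumption here, and the dates and counts are invented.

```python
from collections import Counter
from math import sqrt
from typing import List, Sequence


def daily_counts(event_dates: Sequence[str], days: Sequence[str]) -> List[int]:
    """Number of detected events on each day of the evaluation window."""
    counts = Counter(event_dates)
    return [counts.get(day, 0) for day in days]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical evaluation window and event dates (one entry per event).
days = ["2020-04-15", "2020-04-16", "2020-04-17", "2020-04-18"]
system_events = ["2020-04-15"] + ["2020-04-16"] * 2 + ["2020-04-18"] * 3
reference_events = ["2020-04-15"] * 2 + ["2020-04-16"] * 4 + ["2020-04-18"] * 6

system_series = daily_counts(system_events, days)
reference_series = daily_counts(reference_events, days)
r = pearson(system_series, reference_series)
```

Here the system finds half as many events as the reference on every day, yet the correlation is perfect, which shows that this kind of metric rewards matching the shape of the distribution and not the absolute counts. The same counting could be done per geographic cell to compare spatial distributions.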
{"title":"Tracking COVID-19 protest events in the United States. Shared Task 2: Event Database Replication, CASE 2022","journal":{"name":"The Case manager","volume":"20 1","pages":"209-216"},"publicationDate":"2022-01-01","publicationTypes":"Journal Article"}