Language Resources and Evaluation: Latest Articles

Fine-tuning language models to recognize semantic relations

IF 2.7; Zone 3, Computer Science; Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Language Resources and Evaluation

Pub Date : 2023-07-23 DOI: 10.1007/s10579-023-09677-w

D. Roussinov, S. Sharoff, Nadezhda Puchnina

Citations: 0

Assessment of pragmatic abilities and cognitive substrates (APACS) brief remote: a novel tool for the rapid and tele-evaluation of pragmatic skills in Italian

IF 2.7; Zone 3, Computer Science; Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Language Resources and Evaluation

Pub Date : 2023-07-23 DOI: 10.1007/s10579-023-09667-y

L. Bischetti, C. Pompei, Biagio Scalingi, F. Frau, M. Bosia, G. Arcara, V. Bambini

Citations: 0

MarIA and BETO are sexist: evaluating gender bias in large language models for Spanish

IF 2.7; Zone 3, Computer Science; Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Language Resources and Evaluation

Pub Date : 2023-07-23 DOI: 10.1007/s10579-023-09670-3

Ismael Garrido-Muñoz, F. Martínez-Santiago, Arturo Montejo-Ráez

Citations: 1

FullStop: punctuation and segmentation prediction for Dutch with transformers

IF 2.7; Zone 3, Computer Science; Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Language Resources and Evaluation

Pub Date : 2023-07-14 DOI: 10.1007/s10579-023-09676-x

Vincent Vandeghinste, Oliver Guhr

When automated speech recognition (ASR) is applied to Belgian Dutch, the output is an unsegmented stream of words without any punctuation. A next step is to perform segmentation and insert punctuation, making the ASR output more readable and easier to correct manually. We present what is, as far as we know, the first publicly available punctuation insertion system for Dutch that functions at a usable level. The model we present here extends the approach of Guhr et al. (In: Swiss Text Analytics Conference. Shared task on Sentence End and Punctuation Prediction in NLG Text, 2021) to Dutch: we fine-tuned the Dutch language model RobBERT on a punctuation prediction sequence classification task. The model was fine-tuned on two datasets: the Dutch side of Europarl and the SoNaR corpus. For every word in the input sequence, the model predicts the punctuation marker that follows the word. For cases where the language is unknown or where code switching applies, we have extended an existing multilingual model with Dutch. Previous work showed that such a multilingual model, based on “xlm-roberta-base”, performs on par with, or sometimes even better than, the monolingual models. The system was evaluated on in-domain data as a classifier and on out-of-domain data as a sentence segmentation system through full stop prediction. The sentence segmentation evaluations on out-of-domain data show that models fine-tuned on SoNaR give the best results, which can be attributed to SoNaR being a reference corpus containing different language registers. The multilingual models show even better precision (at the cost of lower recall) than the monolingual models.
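The abstract describes two steps that are easy to picture in code: a label is predicted for every word (the punctuation mark that follows it), and sentence boundaries are then read off the predicted full stops. The following is a minimal sketch of that post-processing and of a position-wise precision/recall computation for full stops. It does not call the authors' model: the label inventory (`"0"` for "no punctuation"), the helper names `restore` and `full_stop_precision_recall`, and the toy Dutch input are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed label inventory: "0" means "no punctuation after this word".
# Marks that close a sentence (an illustrative choice, not taken from the paper).
SENTENCE_END = {".", "?"}


def restore(words: List[str], labels: List[str]) -> List[str]:
    """Attach each predicted mark to its word and split at sentence-final marks."""
    if len(words) != len(labels):
        raise ValueError("one label per word is required")
    sentences: List[str] = []
    current: List[str] = []
    for word, label in zip(words, labels):
        current.append(word if label == "0" else word + label)
        if label in SENTENCE_END:
            sentences.append(" ".join(current))
            current = []
    if current:  # trailing words without a sentence-final mark
        sentences.append(" ".join(current))
    return sentences


def full_stop_precision_recall(gold: List[str], pred: List[str]) -> Tuple[float, float]:
    """Precision and recall of the "." label, compared position by position."""
    true_pos = sum(g == "." and p == "." for g, p in zip(gold, pred))
    n_pred = sum(p == "." for p in pred)
    n_gold = sum(g == "." for g in gold)
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    return precision, recall


# Hypothetical example: unpunctuated ASR-like input with made-up labels.
words = "dit is een test werkt het goed".split()
pred = ["0", "0", "0", ".", "0", "0", "?"]
gold = ["0", "0", "0", ".", "0", "0", "."]

print(restore(words, pred))                    # ['dit is een test.', 'werkt het goed?']
print(full_stop_precision_recall(gold, pred))  # (1.0, 0.5)
```

In this toy case the single predicted full stop is correct (precision 1.0), but one of the two gold full stops is missed (recall 0.5), which is the kind of precision/recall trade-off the abstract reports for the multilingual models.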

Citations: 0

adaptNMT: an open-source, language-agnostic development environment for neural machine translation

IF 2.7; Zone 3, Computer Science; Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Language Resources and Evaluation

Pub Date : 2023-07-14 DOI: 10.1007/s10579-023-09671-2

Séamus Lankford, Haithem Afli, Andy Way

Citations: 2

The Visual Language Research Corpus (VLRC): an annotated corpus of comics from Asia, Europe, and the United States

IF 2.7; Zone 3, Computer Science; Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Language Resources and Evaluation

Pub Date : 2023-07-14 DOI: 10.1007/s10579-023-09673-0

Neil Cohn, Bruno Cardoso, Bien Klomberg, Irmak Hacımusaoğlu

Citations: 2

Evaluation of a rule-based approach to automatic factual question generation using syntactic and semantic analysis

IF 2.7; Zone 3, Computer Science; Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Language Resources and Evaluation

Pub Date : 2023-07-10 DOI: 10.1007/s10579-023-09672-1

A. Gašpar, Ani Grubišić, Ines Šarić-Grgić

Citations: 1

Sentiment analysis in Portuguese tweets: an evaluation of diverse word representation models

IF 2.7; Zone 3, Computer Science; Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Language Resources and Evaluation

Pub Date : 2023-06-28 DOI: 10.1007/s10579-023-09661-4

Daniela Vianna, Fernando Carneiro, Jonnathan Carvalho, Alexandre Plastino, A. Paes

Citations: 0

Clinical Profile of Cerebrovascular Disease Population in Sorsogon: A Hospital-based Study.

Zone 3, Computer Science; Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Language Resources and Evaluation

Pub Date : 2023-06-28 eCollection Date: 2023-01-01 DOI: 10.47895/amp.vi0.4975

Frances Jane Hermo-Aganon, John Jerusalem Tiongson

Objectives: In the Philippines, an estimated half a million people are affected by stroke annually. It is the third most common cause of mortality among Filipinos. Data on the epidemiology of stroke in the country are limited. This study aimed to examine cerebrovascular disease in a rural setting in the country, primarily exploring the demographic characteristics, risk factors, clinical profile, and outcomes of patients assessed with cerebrovascular disease in the province of Sorsogon.

Methods: This was a retrospective study of all adult patients admitted to two tertiary hospitals in Sorsogon between February 1, 2020, and January 31, 2021, with a stroke diagnosis (International Classification of Diseases, Revision 10). Charts were reviewed manually, and demographics, risk factors, clinical presentation, neuroimaging findings, and outcomes were recorded.

Results: A total of 721 cases with a mean age of 63.06 ± 13.96 years were included in the analysis. Of all the stroke cases, 64.7% were ischemic and 29.7% were hemorrhagic. The most common risk factors for stroke were hypertension (65%), history of stroke (16.2%), and diabetes (11.4%). The most common reasons for seeking consultation were one-sided weakness (41.3%) and slurred speech (14.2%).

Conclusion: In a third-class province in the Philippines, the most common type of stroke was ischemic stroke. Analysis showed that diabetes was more associated with ischemic stroke, while hypertension was significantly associated with hemorrhagic stroke. A mortality rate of 26.8% was seen in this cerebrovascular disease population.

Citations: 0

The C-ORAL-ESQ project: a corpus for the study of spontaneous speech of individuals with schizophrenia

IF 2.7; Zone 3, Computer Science; Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Language Resources and Evaluation

Pub Date : 2023-06-27 DOI: 10.1007/s10579-023-09675-y

Tommaso Raso, Bruno Neves Rati de Melo Rocha, J. Salgado, B. Cruz, Lucas Machado Mantovani, Heliana Mello

Citations: 0
