Brazilian Portuguese corpora for teaching and translation: the CoMET project

Language Resources and Evaluation (IF 1.8; CAS Zone 3, Computer Science; Q3, Computer Science, Interdisciplinary Applications) Pub Date: 2023-11-16 DOI: 10.1007/s10579-023-09690-z

Stella E. O. Tagnin


Abstract

This paper begins with an overview of the corpora available for Brazilian Portuguese and then focuses on the CoMET Project, developed at the University of São Paulo. CoMET consists of three corpora: a comparable Portuguese-English technical corpus (CorTec), a Portuguese-English parallel (translation) corpus (CorTrad), and a multilingual learner corpus (CoMAprend), all available for online queries with specific tools. CorTec offers over fifty corpora in a variety of domains, from Health Sciences to the Olympic Games. CorTrad is divided into three parts: Popular Science, Technical-Scientific, and Literary. Each of CoMET’s corpora is presented in detail, with examples.




Source journal

Language Resources and Evaluation (Engineering and Technology; Computer Science: Interdisciplinary Applications)

CiteScore: 6.50

Self-citation rate: 3.70%

Review time: >12 weeks

Journal description: Language Resources and Evaluation is the first publication devoted to the acquisition, creation, annotation, and use of language resources, together with methods for evaluation of resources, technologies, and applications. Language resources include language data and descriptions in machine-readable form used to assist and augment language processing applications, such as written or spoken corpora and lexica, multimodal resources, grammars, terminology or domain-specific databases and dictionaries, ontologies, multimedia databases, etc., as well as basic software tools for their acquisition, preparation, annotation, management, customization, and use. Evaluation of language resources concerns assessing the state of the art for a given technology, comparing different approaches to a given problem, assessing the availability of resources and technologies for a given application, benchmarking, and assessing system usability and user satisfaction.