This article reviews The Human Factor in Machine Translation (ISBN 978-1-138-55121-3; 978-1-315-14753-6).
{"title":"Chan, Sin-wai (ed.). 2018. The Human Factor in Machine Translation","authors":"Hui Liu","doi":"10.1075/TERM.00030.LIU","DOIUrl":"https://doi.org/10.1075/TERM.00030.LIU","url":null,"abstract":"This article reviews The Human Factor in Machine Translation 978-1-138-55121-3978-1-315-14753-6","PeriodicalId":44429,"journal":{"name":"Terminology","volume":" ","pages":""},"PeriodicalIF":0.8,"publicationDate":"2019-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48485927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Frame Semantics provides a powerful cross-lingual model to describe the conceptual structure underlying specialized language. Building specialized frames is challenging because of the complex nature of predicate-argument structures, and because of the domain-specific uses of general-language predicates. Our semi-automatic method elicits semantic frames from specialized corpora. It aims to discover lexical patterns that reveal the structure of specialized frames and to populate them with corpus-based data. Firstly, we automatically extracted verb-noun triples from corpora using bootstrapping to identify noun-verb-noun phraseological patterns. Secondly, we annotated each noun-verb-noun triple with the lexical domain of the verb and the semantic class and role of the noun filling each argument slot. We then used these annotations and patterns to classify similar triples. Thus, the structure and the types of lexical units that belong to each specialized frame were inferred. An analysis of specialized corpora of environmental science texts in English and in Spanish illustrates our methodology.
{"title":"Eliciting specialized frames from corpora using argument-structure extraction techniques","authors":"Beatriz Sánchez Cárdenas, Carlos Ramisch","doi":"10.1075/TERM.00026.SAN","DOIUrl":"https://doi.org/10.1075/TERM.00026.SAN","url":null,"abstract":"\u0000 Frame Semantics provides a powerful cross-lingual model to describe the conceptual structure underlying\u0000 specialized language. Building specialized frames is challenging because of the complex nature of predicate-argument structures,\u0000 and because of the domain-specific uses of general-language predicates. Our semi-automatic method elicits semantic frames from\u0000 specialized corpora. It aims to discover lexical patterns that reveal the structure of specialized frames and to populate them\u0000 with corpus-based data. Firstly, we automatically extracted verb-noun triples from corpora using bootstrapping to identify\u0000 noun-verb-noun phraseological patterns. Secondly, we annotated each noun-verb-noun triple with the lexical domain of the verbs and\u0000 the semantic class and role of the noun filling each argument slot. We then used these annotations and patterns to classify\u0000 similar triples. Thus, the structure and the types of lexical units that belong to each specialized frames were inferred.\u0000 Specialized corpora analysis of environmental science texts in English and in Spanish illustrate our methodology.","PeriodicalId":44429,"journal":{"name":"Terminology","volume":" ","pages":""},"PeriodicalIF":0.8,"publicationDate":"2019-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45333884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper focuses on the study of word combinations “of common usage” which are “lexicalized”, have “syntactic and semantic stability, may be idiomatized and carry connotations, and have an emphatic or intensifying function” (Gläser 1994/1995, 45). Following previous research on Languages for Specific Purposes (LSP) and legal phraseology, we will define, identify and classify these units in English and Spanish according to their form and meaning, using a comparable corpus of sales contracts. To carry out our study, we will focus on a number of descriptors that are commonly used within these units, on the basis of the headwords they collocate with, in order to determine how specific or general they are in their form, use and meaning, since this issue poses translation problems. As genres determine matters such as terminology and phraseology, the results will be useful for specialized translators and legal drafters.
{"title":"Lexical chunks in English and Spanish sales contracts","authors":"Belén López Arroyo, Leticia Moreno Pérez","doi":"10.1075/TERM.00027.LOP","DOIUrl":"https://doi.org/10.1075/TERM.00027.LOP","url":null,"abstract":"\u0000This paper focuses on the study of word combinations “of common usage” which are “lexicalized”, have “syntactic and semantic stability, may be idiomatized and carry connotations, and have an emphatic or intensifying function.” (Gläser 1994/1995, 45). Following previous research on Languages for Specific Purposes (LSP) and legal phraseology, we will define, identify and classify these units in English and Spanish according to their form and meaning, using a comparable corpus of sales contracts. To carry out our study, we will focus on a number of descriptors that are commonly used within these units on the basis of the headwords they collocate with, in order to determine how specific or general they are in their form, use and meaning since this issue poses translation problems. As genres determine matters such as or terminology and phraseology, the results will be useful for specialized translators and legal drafters.","PeriodicalId":44429,"journal":{"name":"Terminology","volume":"1 1","pages":""},"PeriodicalIF":0.8,"publicationDate":"2019-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42534715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This article explores terminological variation, in particular denominative variation, in popular science discourse. Our objective is to analyse the terminological complexity and instability of terms referring to the controversial notion of biocontrol (lutte biologique in French) in two types of publications. The analysis is based on the identification of the different denominations used in this interdisciplinary subject field, both in a journal specialized in plant protection and in the most popular French daily newspapers. Our study aims to give an in-depth linguistic and cognitive analysis of French terms and explain the reasons for the observed variation. Like most scientific research topics that gain public attention, biological control (or biocontrol) is prone to a multiplication of terms. In this study, we show how the profusion of terms, in a domain with important scientific and societal implications, can maintain or even exacerbate terminological and conceptual confusion.
{"title":"Vulgarisation scientifique et médiatisation de la science","authors":"Hélène Ledouble","doi":"10.1075/TERM.00028.LED","DOIUrl":"https://doi.org/10.1075/TERM.00028.LED","url":null,"abstract":"\u0000 This article explores terminological variation, in particular denominative variation, in popular science\u0000 discourse. Our objective is to analyse the terminological complexity and instability of terms referring to the controversial\u0000 notion of biocontrol (lutte biologique in French) in two types of publications. The analysis is based on the\u0000 identification of the different denominations used in this interdisciplinary subject field, both in a Journal specialized in plant\u0000 protection and in the most popular French daily newspapers. Our study aims to give an in-depth linguistic and cognitive analysis\u0000 of French terms and explain the reasons for the observed variation. As most scientific research topics gaining public attention,\u0000 biological control (or biocontrol) is prone to a multiplication of terms. In this study, we show how the profusion of terms, in a domain with\u0000 important scientific and societal implications, can maintain or even exacerbate terminological and conceptual confusion.","PeriodicalId":44429,"journal":{"name":"Terminology","volume":" ","pages":""},"PeriodicalIF":0.8,"publicationDate":"2019-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46268836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Andraz Repar, V. Podpečan, Anze Vavpetic, N. Lavrač, Senja Pollak
This paper describes TermEnsembler, a bilingual term extraction and alignment system utilizing a novel ensemble learning approach to bilingual term alignment. In the proposed system, the processing starts with monolingual term extraction from a language industry standard file type containing aligned English and Slovenian texts. The two separate term lists are then automatically aligned using an ensemble of seven bilingual alignment methods, which are first executed separately and then merged using the weights learned with an evolutionary algorithm. In the experiments, the weights were learned on one domain and tested on two other domains. When evaluated on the top 400 aligned term pairs, the precision of term alignment is over 96%, while the proportion of correctly aligned multi-word unit terms exceeds 30%.
{"title":"TermEnsembler: An ensemble learning approach to bilingual term extraction and alignment","authors":"Andraz Repar, V. Podpečan, Anze Vavpetic, N. Lavrač, Senja Pollak","doi":"10.1075/TERM.00029.REP","DOIUrl":"https://doi.org/10.1075/TERM.00029.REP","url":null,"abstract":"Abstract This paper describes TermEnsembler, a bilingual term extraction and alignment system utilizing a novel ensemble learning approach to bilingual term alignment. In the proposed system, the processing starts with monolingual term extraction from a language industry standard file type containing aligned English and Slovenian texts. The two separate term lists are then automatically aligned using an ensemble of seven bilingual alignment methods, which are first executed separately and then merged using the weights learned with an evolutionary algorithm. In the experiments, the weights were learned on one domain and tested on two other domains. When evaluated on the top 400 aligned term pairs, the precision of term alignment is over 96%, while the number of correctly aligned multi-word unit terms exceeds 30% when evaluated on the top 400 term pairs.","PeriodicalId":44429,"journal":{"name":"Terminology","volume":"25 1","pages":"93-120"},"PeriodicalIF":0.8,"publicationDate":"2019-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49482799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
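The merging step described in the TermEnsembler abstract (several alignment methods run separately, then combined with learned weights) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names `merge_alignments` and `precision_at_k`, the toy scores, and the fixed weights are hypothetical, and the evolutionary algorithm that learns the weights in the paper is not reproduced here.

```python
from collections import defaultdict


def merge_alignments(method_scores, weights):
    """Combine per-method scores for candidate term pairs into one ranking.

    method_scores: list of dicts mapping (source_term, target_term) -> score,
                   one dict per alignment method (hypothetical input format).
    weights:       one weight per method; assumed to be learned elsewhere.
    """
    if len(method_scores) != len(weights):
        raise ValueError("one weight per method is required")
    combined = defaultdict(float)
    for scores, weight in zip(method_scores, weights):
        for pair, score in scores.items():
            combined[pair] += weight * score
    # Highest combined score first; ties broken alphabetically for determinism.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


def precision_at_k(ranked_pairs, gold_pairs, k):
    """Share of the top-k ranked pairs that appear in the gold standard."""
    top = [pair for pair, _ in ranked_pairs[:k]]
    if not top:
        return 0.0
    return sum(pair in gold_pairs for pair in top) / len(top)


# Toy data: two hypothetical methods scoring English-Slovenian candidates.
method_a = {("term", "izraz"): 1.0, ("term", "beseda"): 0.5}
method_b = {("term", "beseda"): 1.0}
ranked = merge_alignments([method_a, method_b], weights=[0.75, 0.25])
gold = {("term", "izraz")}
print(ranked[0][0], precision_at_k(ranked, gold, k=1))
```

A weighted sum is only one possible way to merge method outputs; it is used here because it makes the role of the per-method weights easy to see.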
{"title":"Gautier, Laurent (ed.). 2018. Figement et discours spécialisés. Forum für Fachsprachen-Forschung","authors":"Johannes Dahm","doi":"10.1075/TERM.00031.DAH","DOIUrl":"https://doi.org/10.1075/TERM.00031.DAH","url":null,"abstract":"","PeriodicalId":44429,"journal":{"name":"Terminology","volume":" ","pages":""},"PeriodicalIF":0.8,"publicationDate":"2019-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49488122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Apart from the importance of accurate meaning transference, the key to English translation of Traditional Chinese Medicine (TCM) terms lies in the proper translation of term forms, in particular, of long term structures and length. This article reports on an empirical study of the English translation of long TCM terms by the following procedures: (1) collecting 1220 TCM terms and their English translations from dictionaries, journals and official websites related to TCM terminology translation; (2) segmenting and POS-tagging with ICTCLAS to obtain 823 long TCM terms with 3 or more Chinese words; (3) selecting 150 out of the 823 long TCM terms through random sampling; (4) POS-tagging the 150 English translations with CLAWS5; (5) systematically discussing, on the basis of the parallel corpus, the structures, term length change, translation techniques and translation regularities generalized from the English translation of long TCM terms. The results show nominalization, shift of some pre-modifiers into post-modifiers, and amplification of a predicate in the 9 kinds of structural features, and some translation techniques like literal translation, paraphrase, adaptation, amplification and simplification employed in the English translation of long TCM terms.
{"title":"English translation of long Traditional Chinese Medicine terms","authors":"Yaru Chen, Wei Chen","doi":"10.1075/TERM.00018.CHE","DOIUrl":"https://doi.org/10.1075/TERM.00018.CHE","url":null,"abstract":"Apart from the importance of accurate meaning transference, the key to English translation of Traditional Chinese Medicine (TCM) terms lies in the proper translation of term forms, in particular, of long term structures and length. This article reports on an empirical study of the English translation of long TCM terms by the following procedures: (1) collecting 1220 TCM terms and their English translations from dictionaries, journals and official websites related to TCM terminology translation; (2) segmenting and POS-tagging with ICTCLAS to obtain 823 long TCM terms with 3 or more Chinese words; (3) selecting 150 out of the 823 long TCM terms through random sampling; (4) POS-tagging the 150 English translations with CLAWS5; (5) basing on the parallel corpus and systematically discussing the structures, term length change, translation techniques and translation regularities generalized from the English translation of long TCM terms. The result shows nominalization, shift of some pre-modifiers into post-modifiers, and amplification of a predicate in the 9 kinds of structural features, and some translation techniques like literal translation, paraphrase, adaptation, amplification and simplification employed in the English translation of long TCM terms.","PeriodicalId":44429,"journal":{"name":"Terminology","volume":"1 1","pages":""},"PeriodicalIF":0.8,"publicationDate":"2018-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41800957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In recent decades, the scope of terminology research has been extended. The peculiarities and complexities of terminology are further ascertained by the exploration into the practice of terminology translation in the field of humanities and social sciences. The cultural functions that terminology in this research field (H&SS terms) fulfills and the intrinsic difficulties involved in translating it are worth further investigation. Based on some reflections on the development of the NUTermBank, this paper discusses the legitimacy of terminology translation as independent research and makes an initial attempt to theorize about how the research can actually be carried out. A holistic view of terminology translation is taken and a 3-M research model is proposed in this paper.
{"title":"Conceptualization and theorization of terminology translation in humanities and social sciences","authors":"Xiangqing Wei","doi":"10.1075/TERM.00021.WEI","DOIUrl":"https://doi.org/10.1075/TERM.00021.WEI","url":null,"abstract":"\u0000 In recent decades, the scope of terminology research has been extended. The peculiarities and complexities of terminology are\u0000 further ascertained by the exploration into the practice of terminology translation in the field of humanities and social\u0000 sciences. The cultural functions that terminology in this research field (H&SS terms) fulfill and the intrinsic difficulties\u0000 involved in translating them are worth further investigation. This paper based on some reflections of the development of the\u0000 NUTermBank discusses the legitimacy of terminology translation as independent research and makes an initial attempt to theorize\u0000 about how the research can be actually carried out. A holistic view of terminology translation is taken and a 3-M research model\u0000 is proposed in this paper.","PeriodicalId":44429,"journal":{"name":"Terminology","volume":" ","pages":""},"PeriodicalIF":0.8,"publicationDate":"2018-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44896835","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This article examines the status of constructed controlled terminologies from the perspective of the coverage of terms/concepts. To facilitate controlled authoring of Japanese texts of the municipal domain and promote machine translatability into English, we constructed terminologies in the following way: (1) Japanese-English term pairs are extracted from aligned texts; (2) term variations are controlled by defining preferred and proscribed terms for both languages. To assess the coverage of the constructed terminologies, we propose a quantitative extrapolation method that estimates the potential vocabulary size. The coverage estimations show that the coverage of terms for Japanese is higher than that for English by about 10%, which reflects the greater diversity of the translated English terms. The coverage of concepts reaches around 60% for both Japanese and English. The method also enables us to quantitatively estimate how much effort is needed to further increase the coverage.
{"title":"Building controlled bilingual terminologies for the municipal domain and evaluating them using a coverage estimation\u0000 approach","authors":"Rei Miyata, K. Kageura","doi":"10.1075/TERM.00017.MIY","DOIUrl":"https://doi.org/10.1075/TERM.00017.MIY","url":null,"abstract":"\u0000 This article examines the status of constructed controlled terminologies from the perspective of the coverage of terms/concepts. To\u0000 facilitate controlled authoring of Japanese texts of the municipal domain and promote machine translatability into English, we\u0000 constructed terminologies in the following way: (1) Japanese-English term pairs are extracted from aligned texts; (2) term\u0000 variations are controlled by defining preferred and proscribed terms for both languages. To assess the coverage of the constructed\u0000 terminologies, we propose a quantitative extrapolation method that estimates the potential vocabulary size. The coverage\u0000 estimations show that the coverage of terms for Japanese is higher than that for English by about 10%, which\u0000 reflects the greater diversity of the translated English terms. The coverage of concepts reaches around 60% for\u0000 both Japanese and English. The method also enables us to quantitatively estimate how much effort is needed to further increase the\u0000 coverage.","PeriodicalId":44429,"journal":{"name":"Terminology","volume":" ","pages":""},"PeriodicalIF":0.8,"publicationDate":"2018-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46955730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
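The idea of estimating coverage against a potential vocabulary size can be sketched as follows. The abstract does not specify the extrapolation method, so this sketch swaps in the Chao1 richness estimator, a well-known technique that infers the number of unseen types from the counts of types observed once and twice. It should not be read as the authors' method; the function names and the toy data are hypothetical.

```python
from collections import Counter


def chao1_vocabulary_size(term_occurrences):
    """Estimate the potential number of distinct terms with the Chao1 estimator.

    term_occurrences: iterable of term tokens as they occur in the texts.
    Chao1 adds f1^2 / (2 * f2) unseen types to the observed count, where f1 and
    f2 are the numbers of terms seen exactly once and exactly twice. When f2 is
    zero, the bias-corrected form f1 * (f1 - 1) / 2 is used instead.
    """
    freq = Counter(term_occurrences)
    observed = len(freq)
    f1 = sum(1 for count in freq.values() if count == 1)
    f2 = sum(1 for count in freq.values() if count == 2)
    if f2 > 0:
        unseen = f1 * f1 / (2 * f2)
    else:
        unseen = f1 * (f1 - 1) / 2
    return observed + unseen


def coverage(term_occurrences):
    """Observed distinct terms divided by the estimated potential vocabulary."""
    occurrences = list(term_occurrences)
    observed = len(set(occurrences))
    if observed == 0:
        return 0.0
    return observed / chao1_vocabulary_size(occurrences)


# Toy sample: 4 distinct terms, 2 seen once and 1 seen twice.
sample = ["a", "a", "a", "b", "b", "c", "d"]
print(chao1_vocabulary_size(sample), round(coverage(sample), 3))
```

Many singletons relative to doubletons push the estimated vocabulary up and the coverage down, which is the intuition behind reading lower coverage as a sign of greater term diversity.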