Paraphrase and parallel treebank for the comparison of French and Chinese syntax
Rafaël Poiret, Simon Mille, Haitao Liu
Languages in Contrast, vol. 23, no. 1. Journal Article. Published 2021-04-20.
DOI: 10.1075/LIC.20002.POI (https://doi.org/10.1075/LIC.20002.POI)
Citations: 0
Abstract
This paper proposes to study the contrastive syntax of French and Chinese through the lens of syntactic mismatches, using parallel treebanks. A syntactic mismatch is a divergence between the syntactic structure of a linguistic unit and that of its translation. Syntactic mismatches are formalized using the notion of paraphrase from Meaning-Text Theory, which makes it possible to capture mismatches at different levels of linguistic description (e.g. Semantic, Deep-Syntactic, and Surface-Syntactic). We report in detail on the types of paraphrases found in the seed corpus, showing that Deep-Syntactic paraphrases constitute the best starting point for our study. We then show how, starting from the seed corpus, we semi-automatically constructed a multi-layer parallel treebank in which paraphrases are aligned and annotated.
About the journal:
Languages in Contrast aims to publish contrastive studies of two or more languages. Any aspect of language may be covered, including vocabulary, phonology, morphology, syntax, semantics, pragmatics, text and discourse, stylistics, sociolinguistics and psycholinguistics. Languages in Contrast welcomes interdisciplinary studies, particularly those that make links between contrastive linguistics and translation, lexicography, computational linguistics, language teaching, literary and linguistic computing, literary studies and cultural studies.