{"title":"Australian English Bilingual Corpus: Automatic forced-alignment accuracy in Russian and English","authors":"Ksenia Gnevsheva, S. Gonzalez, R. Fromont","doi":"10.1080/07268602.2020.1737507","DOIUrl":null,"url":null,"abstract":"ABSTRACT This paper introduces the Australian English Bilingual Corpus, a Russian–English spoken corpus, and uses it for a comparison of automatic time alignment between two different languages. Automatic forced alignment is gaining popularity in corpus research as it allows for time-efficient processing of phonetic information. The Language, Brain and Behaviour: Corpus Analysis Tool is one aligner which compares well with others in terms of alignment accuracy. Most of the forced-alignment work has been done with different varieties of English. This paper compares alignment accuracy between Russian and English and discusses aligner settings and data characteristics that affect it. The results suggest higher alignment accuracy for English than Russian. For Russian, alignment accuracy improves with stress specification; that is, when stressed and unstressed vowels are treated as separate categories.","PeriodicalId":44988,"journal":{"name":"Australian Journal of Linguistics","volume":"40 1","pages":"182 - 193"},"PeriodicalIF":0.4000,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/07268602.2020.1737507","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Australian Journal of Linguistics","FirstCategoryId":"98","ListUrlMain":"https://doi.org/10.1080/07268602.2020.1737507","RegionNum":3,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"0","JCRName":"LANGUAGE & LINGUISTICS","Score":null,"Total":0}
Citation count: 3
Abstract
This paper introduces the Australian English Bilingual Corpus, a Russian–English spoken corpus, and uses it for a comparison of automatic time alignment between two different languages. Automatic forced alignment is gaining popularity in corpus research as it allows for time-efficient processing of phonetic information. The Language, Brain and Behaviour: Corpus Analysis Tool is one aligner which compares well with others in terms of alignment accuracy. Most of the forced-alignment work has been done with different varieties of English. This paper compares alignment accuracy between Russian and English and discusses aligner settings and data characteristics that affect it. The results suggest higher alignment accuracy for English than Russian. For Russian, alignment accuracy improves with stress specification; that is, when stressed and unstressed vowels are treated as separate categories.