Sigita Rackevičienė, A. Utka, Agnė Bielinskienė, A. Rokas
{"title":"在标注立陶宛网络安全语料库中的跨体裁术语分布","authors":"Sigita Rackevičienė, A. Utka, Agnė Bielinskienė, A. Rokas","doi":"10.15388/respectus.2022.41.46.105","DOIUrl":null,"url":null,"abstract":"The paper provides results of the frequential distribution analysis of cybersecurity terms used in the Lithuanian cybersecurity corpus composed of texts of different genres. The research focuses on the following aspects: overall distribution of cybersecurity terms (their density and diversity) across genres, distribution of English and English-Lithuanian terms and their usage patterns in Lithuanian sentences, and, finally, the most frequent cybersecurity terms and their thematic groups in each genre. The research was performed in several stages: compilation of a cybersecurity corpus and its subdivision into genre-specific subcorpora, manual annotation of cybersecurity terms, automatic lemmatisation of annotated terms and, finally, quantitative analysis of the distribution of the terms across the subcorpora. The results reveal the similarities and differences of the use of cybersecurity terminology across genres which are important to consider to get a complete picture of terminology usage trends in this domain.","PeriodicalId":36933,"journal":{"name":"Respectus Philologicus","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2022-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Distribution of Terms Across Genres in the Annotated Lithuanian Cybersecurity Corpus\",\"authors\":\"Sigita Rackevičienė, A. Utka, Agnė Bielinskienė, A. Rokas\",\"doi\":\"10.15388/respectus.2022.41.46.105\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The paper provides results of the frequential distribution analysis of cybersecurity terms used in the Lithuanian cybersecurity corpus composed of texts of different genres. The research focuses on the following aspects: overall distribution of cybersecurity terms (their density and diversity) across genres, distribution of English and English-Lithuanian terms and their usage patterns in Lithuanian sentences, and, finally, the most frequent cybersecurity terms and their thematic groups in each genre. The research was performed in several stages: compilation of a cybersecurity corpus and its subdivision into genre-specific subcorpora, manual annotation of cybersecurity terms, automatic lemmatisation of annotated terms and, finally, quantitative analysis of the distribution of the terms across the subcorpora. The results reveal the similarities and differences of the use of cybersecurity terminology across genres which are important to consider to get a complete picture of terminology usage trends in this domain.\",\"PeriodicalId\":36933,\"journal\":{\"name\":\"Respectus Philologicus\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-04-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Respectus Philologicus\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.15388/respectus.2022.41.46.105\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"Arts and Humanities\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Respectus Philologicus","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.15388/respectus.2022.41.46.105","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"Arts and Humanities","Score":null,"Total":0}
Distribution of Terms Across Genres in the Annotated Lithuanian Cybersecurity Corpus
The paper provides results of the frequential distribution analysis of cybersecurity terms used in the Lithuanian cybersecurity corpus composed of texts of different genres. The research focuses on the following aspects: overall distribution of cybersecurity terms (their density and diversity) across genres, distribution of English and English-Lithuanian terms and their usage patterns in Lithuanian sentences, and, finally, the most frequent cybersecurity terms and their thematic groups in each genre. The research was performed in several stages: compilation of a cybersecurity corpus and its subdivision into genre-specific subcorpora, manual annotation of cybersecurity terms, automatic lemmatisation of annotated terms and, finally, quantitative analysis of the distribution of the terms across the subcorpora. The results reveal the similarities and differences of the use of cybersecurity terminology across genres which are important to consider to get a complete picture of terminology usage trends in this domain.