Pub Date: 2021-12-29. DOI: 10.4312/slo2.0.2021.2.100-125
Mojca Stritar Kučuk
Full-time foreign students at the University of Ljubljana who learn Slovene in their first year of study within the Leto plus module are introduced in more detail to online language resources and technologies for Slovene at a dedicated workshop in the second semester. The paper describes how this workshop was run in the 2019/20 academic year, when, because of the coronavirus pandemic, it took place remotely, in the form of interactive videos with tasks for checking comprehension of the material. The second part of the paper focuses on students' opinions of such language resources. Using an online survey, I analysed the attitudes and experiences of two generations of students: the 2018/19 generation learned about the online tools in the classroom, the 2019/20 generation remotely. Judging by the survey results, the younger generation of students uses online language resources more often. Students in both groups most often use Google Translate, followed by Sloleks, the inflection tool Besana, Fran and Pons. As reasons for using these resources they cite mainly speed or ease of use and being accustomed to a particular resource.
Title: Online tools for Slovene and foreign students of the University of Ljubljana (original: Spletna orodja za slovenščino in tuji študenti Univerze v Ljubljani). Journal: Slovenščina 2.0: empirical, applied and interdisciplinary research.
Pub Date: 2021-12-29. DOI: 10.4312/slo2.0.2021.2.71-99
Eva Trivunović
The paper gives an overview of the variants and modifications of seven biblical (Bible-derived) idioms in contemporary Slovene and of their presence in present-day language. The findings are compared with the treatment of these idioms in existing dictionaries, which reveals a large gap between the dictionary description and the situation shown by corpus data. To determine more reliably in which cases one can speak of already established variation, the study used three corpora of different genres: Gigafida 2.0, Janes and slWaC. In addition to established variants, non-established modifications are presented, with special emphasis on creative renewals (prenovitve). However, the clearly defined typology proved too rigid in places, because in some borderline cases it was not possible to distinguish unambiguously between established variants and non-renewal modifications, or between non-renewal and renewal modifications. All the selected idioms and their renewals are most frequent in the Janes corpus, which demonstrates the need to include a larger number of diverse corpora in linguistic research.
Title: Stability, variation and modified use of idioms in the Slovene language and dictionaries (original: Stalnost, variantnost in modificirana raba frazemov v slovenskem jeziku in slovarjih).
Pub Date: 2021-12-29. DOI: 10.4312/slo2.0.2021.2.1-40
Maja Bitenc, Marko Stabej, Nataša Gliha Komac, Matejka Grgič, Monika Kalin Golob, K. Kenda-Jež, Albina Nečak Lük, Sonja Novak Lukanovič, Krištof Savski
A record of the consultation on current sociolinguistic challenges and priority research topics, organised by Asst. Prof. Dr. Maja Bitenc and Prof. Dr. Marko Stabej of the Department of Slovene Studies, which took place on Monday, 27 September 2021, at the Faculty of Arts, University of Ljubljana, and was streamed via Zoom. In the first part, the invited experts presented their views on the initial questions; in the second, a discussion among all participants followed. The speakers edited the transcript of the recording at their own discretion, in principle with as few interventions as possible; from the discussion, the more substantive contributions were adapted for reading and published.
Title: Sociolinguistic consultation: current sociolinguistic challenges and priority research topics (original: Sociolingvistični posvet: aktualni sociolingvistični izzivi in prednostne raziskovalne tematike).
Pub Date: 2021-12-29. DOI: 10.4312/slo2.0.2021.2.41-70
Nikola Ljubesic, N. Logar, Iztok Kosem
Collocations play a very important role in language description, especially in identifying the meanings of words. In modern lexicography, lists of collocates ranked by a statistical measure are an indispensable part of deducing meaning. In this paper, we compare two approaches to ranking collocates: (a) the logDice method, which is frequency-based and the dominant choice, and (b) the fastText word embeddings method, which is new and semantics-based. The comparison was made on two Slovene datasets, one representing general-language headwords and their collocates, and the other representing headwords and their collocates extracted from a language-for-special-purposes corpus. The evaluation had two parts: in the quantitative part, we used supervised machine learning with a support-vector machine (SVM) classifier scored by the area under the ROC curve (AUC), and in the qualitative part, lexicographers evaluated the rankings produced by the two methods. The results were somewhat inconsistent: while the quantitative evaluation confirmed that the machine-learning-based approach produced better collocate rankings than the frequency-based one, lexicographers in most cases considered the collocate lists of the two methods very similar.
Title: Collocation ranking: frequency vs semantics.
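The frequency-based side of the comparison can be illustrated with a small sketch. It assumes the usual definition of logDice, 14 + log2(2·f_xy / (f_x + f_y)), where f_xy is the co-occurrence count and f_x, f_y are the individual frequencies. The function names and all counts below are hypothetical and are not taken from the paper's datasets.

```python
import math


def log_dice(f_xy: int, f_x: int, f_y: int) -> float:
    """logDice association score: 14 + log2(2 * f_xy / (f_x + f_y))."""
    if f_xy <= 0 or f_x <= 0 or f_y <= 0:
        raise ValueError("frequencies must be positive")
    return 14 + math.log2(2 * f_xy / (f_x + f_y))


def rank_collocates(headword_freq: int, candidates: dict) -> list:
    """Rank collocates of one headword by logDice, highest score first.

    `candidates` maps a collocate to (co-occurrence count, collocate frequency).
    """
    scored = [
        (collocate, log_dice(f_xy, headword_freq, f_y))
        for collocate, (f_xy, f_y) in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical counts for a headword that occurs 1000 times.
toy_candidates = {"a": (100, 1000), "b": (10, 20)}
ranking = rank_collocates(1000, toy_candidates)
for collocate, score in ranking:
    print(f"{collocate}\t{score:.2f}")
```

Because the score depends only on three counts, it needs no training data, which is one reason frequency-based measures of this kind are easy to apply to any corpus.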
Pub Date: 2021-12-29. DOI: 10.4312/slo2.0.2021.2.126-129
Magdalena Gapsa
A report on two important lexicographic conferences: the seventh biennial conference Electronic lexicography in the 21st century (eLex for short), held from 5 to 7 July 2021, and the nineteenth biennial conference of the European Association for Lexicography (EURALEX), held from 7 to 9 September 2021.
Title: The international conferences eLex (5–7 July 2021) and EURALEX (7–9 September 2021) (original: Mednarodni konferenci eLex (5.–7. julij 2021) in EURALEX (7.–9. september 2021)).
Pub Date: 2021-07-06. DOI: 10.4312/SLO2.0.2021.1.26-59
Matej Ulčar, Anka Supej, M. Robnik-Sikonja, Senja Pollak
In recent years, the use of deep neural networks and dense vector embeddings for text representation has led to excellent results in computational natural language understanding. It has also been shown that word embeddings often capture gender, racial and other types of bias. The article focuses on evaluating Slovene and Croatian word embeddings in terms of gender bias using word analogy calculations. We compiled a list of masculine and feminine nouns for occupations in Slovene and evaluated the gender bias of fastText, word2vec and ELMo embeddings with different configurations and different approaches to analogy calculations. The lowest occupational gender bias was observed with the fastText embeddings. Similarly, we compared different fastText embeddings on Croatian occupational analogies.
Title: Slovene and Croatian word embeddings in terms of gender occupational analogies.
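A word analogy calculation of the kind mentioned in the abstract can be sketched as follows. This is only an illustration of one common formulation (vector offset with cosine similarity, often called 3CosAdd); the abstract says several approaches were compared and does not specify them here. The three-dimensional vectors are invented toy values, not real fastText, word2vec or ELMo embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def solve_analogy(embeddings, a, b, c):
    """Answer 'a is to b as c is to ?' with the vector-offset method.

    The target vector is b - a + c; the three query words are excluded
    from the candidates, as is customary for this method.
    """
    target = [
        vb - va + vc
        for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])
    ]
    candidates = (w for w in embeddings if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine(embeddings[w], target))


# Toy vectors (hypothetical): axis 0 ~ gender, axes 1-2 ~ occupation.
toy_embeddings = {
    "moški": [1.0, 0.0, 0.0],
    "ženska": [-1.0, 0.0, 0.0],
    "učitelj": [1.0, 1.0, 0.0],
    "učiteljica": [-1.0, 1.0, 0.0],
    "zdravnik": [1.0, 0.0, 1.0],
    "zdravnica": [-1.0, 0.0, 1.0],
}

answer = solve_analogy(toy_embeddings, "moški", "ženska", "učitelj")
print(answer)
```

With real embeddings, an occupational gender bias would show up when the expected feminine form is not returned, or when the returned word is a stereotypically associated occupation instead.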
Pub Date: 2021-07-06. DOI: 10.4312/SLO2.0.2021.1.123-144
Dolores Lemmenmeier-Batinić
This paper describes the procedure of building a TEI-XML corpus of spoken Serbian starting from raw transcripts. The corpus consists of semi-structured interviews, which were gathered with the aim of investigating forms of address in Serbian. The interviews were thoroughly transcribed according to GAT transcription conventions. However, the transcription was carried out without tools that would check the validity of the GAT syntax or align the transcript with the audio recordings. In order to offer this resource to a broader audience, we resolved the inconsistencies in the original transcripts, normalised the semi-orthographic transcriptions and converted the corpus into a TEI format for transcriptions of speech. Further, we enriched the corpus by tagging and lemmatising the data. Lastly, we aligned the corpus turns to the corresponding audio segments using a forced-alignment tool. In addition to presenting the main steps involved in converting the corpus to the XML format, this paper also discusses current challenges in the processing of spoken data and the implications of data re-use for transcriptions of speech. This corpus can be used for studying Serbian from the perspective of interactional linguistics, for investigating the morphosyntax, grammar, lexicon and phonetics of spoken Serbian, for studying disfluencies, and for testing models for automatic speech recognition and forced alignment. The corpus is freely available for research purposes.
Title: Converting raw transcripts into an annotated and turn-aligned TEI-XML corpus: the example of the Corpus of Serbian Forms of Address.
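The conversion step can be pictured with a very small sketch. It assumes a hypothetical raw format of one turn per line, written as `SPEAKER: text`, which is much simpler than real GAT transcripts, and it emits only TEI `<u>` (utterance) elements with a `who` attribute. The paper's actual pipeline, including normalisation, tagging and audio alignment, is not reproduced here.

```python
import xml.etree.ElementTree as ET


def transcript_to_tei(lines):
    """Turn 'SPEAKER: text' lines into a <body> of TEI <u> elements.

    Lines without a speaker label or without text are skipped.
    """
    body = ET.Element("body")
    for line in lines:
        speaker, sep, text = line.partition(":")
        if not sep or not speaker.strip() or not text.strip():
            continue
        u = ET.SubElement(body, "u", {"who": f"#{speaker.strip()}"})
        u.text = text.strip()
    return body


# Hypothetical two-turn transcript; speaker codes are invented.
raw = ["INT: dobar dan", "S1: dobar dan", "no speaker label here"]
xml_text = ET.tostring(transcript_to_tei(raw), encoding="unicode")
print(xml_text)
```

A full TEI document would also need a header and, for alignment, time anchors that the turns can point to; those parts are left out of this sketch.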
Pub Date: 2021-07-06. DOI: 10.4312/SLO2.0.2021.1.60-89
Lucija Gril, Mirjam Sepesy Maučec, Gregor Donaj, Andrej Žgank
Automatic speech recognition is one of the key building blocks of speech and language technologies. The paper presents the development of an automatic speech recogniser for Slovene in the domain of daily news broadcasts. The system architecture is based on deep neural networks. Taking the available speech resources into account, we carried out modelling with different activation functions. During development we also examined the effect of lossy speech codecs on recognition results. To train the recogniser we used the UMB BNSI Broadcast News and IETK-TV databases. The speech recordings totalled 66 hours. Alongside introducing deep neural networks, we enlarged the recognition vocabulary to 250,000 words, which lowered the out-of-vocabulary rate to 1.33%. On the test set, the best word error rate (WER) we achieved was 15.17%. While evaluating the results we also carried out a more detailed analysis of recognition errors based on lemmas and F-classes, which to some extent show how demanding Slovene is for such usage scenarios.
Title: Automatic recognition of Slovene speech for daily news broadcasts (original: Avtomatsko razpoznavanja slovenskega govora za dnevnoinformativne oddaje).
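The two figures reported for this system, the word error rate and the out-of-vocabulary share, can be computed as in the sketch below. It uses the standard definition of WER as the word-level edit distance (substitutions, deletions and insertions) divided by the number of reference words. The example sentences and vocabulary are invented placeholders, and the paper's own scoring tools are not known from the abstract.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level Levenshtein distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = cur[j - 1] + 1
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


def oov_rate(tokens, vocabulary) -> float:
    """Share of running words that are missing from the recognition vocabulary."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    return sum(1 for t in tokens if t not in vocabulary) / len(tokens)


# Placeholder data: one substitution and one deletion against four words.
wer = word_error_rate("a b c d", "a x c")
oov = oov_rate(["a", "b", "c", "d"], {"a", "b", "c"})
print(f"WER = {wer:.2%}, OOV = {oov:.2%}")
```

The out-of-vocabulary share matters because a word missing from the vocabulary cannot be recognised correctly, so each such word contributes at least one error to the WER.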
Pub Date: 2021-07-06. DOI: 10.4312/SLO2.0.2021.1.181-215
Darinka Verdonik, Simona Majhenič, Špela Antloga, Sandi Majninger, Marko Ferme, Kaja Dobrovoljc, Simona Pulko, Mira Krajnc Ivič, Natalija Ulčnik
The paper starts from three challenges that we observe in teaching Slovene in the upper grades of primary school and in secondary schools: how to eliminate errors against the standard-language norm that persist in pupils' written work; how to improve phraseological competence; and how to improve communicative language competence. These challenges are the focal point of the development of the modern e-learning environment Slovenščina na dlani, which is based on language technologies and information and communication technologies. It supports flexible forms of teaching and distance teaching, eases the teacher's work, and also allows pupils to be motivated through gamification elements. In the paper we present the design and implementation of each of the four content sections of the e-environment: orthography, grammar, phraseology and texts.
Title: The e-learning environment Slovenščina na dlani: challenges and solutions (original: Učno E-okolje Slovenščina na dlani: izzivi in rešitve).
Pub Date: 2021-07-06. DOI: 10.4312/SLO2.0.2021.1.90-122
Lucia Vlášková, Hana Strachoňová
As a growing field of study within sign language linguistics, sign language lexicography faces many challenges that have already been answered for audio-oral language material. In this paper, we present some of these challenges and the methods developed to help navigate the complex field of lexical classification. The described methods and strategies are implemented in the first Czech sign language (ČZJ) online dictionary, a part of the platform Dictio, developed at Masaryk University in Brno. We cover the topic of lemmatisation and how to decide what constitutes a lexeme in sign language. We introduce four types of expressions that qualify for a dictionary entry: a simple lexeme, a compound, a derivative, and a set phrase. We address the question of the place of classifier constructions and size and shape specifiers in a dictionary, given their peculiar semantic status. We maintain the standard classification of classifiers (whole entity and holding classifiers) and size and shape specifiers (SASSes; static and tracing specifiers). We provide arguments for separating the category of specifiers from the category of classifiers. We discuss the proper treatment of mouthings and mouth gestures with respect to citation forms, derivation and translation. We show why it is difficult in sign language to distinguish synonyms from variants and how our proposed phonological criteria can help. We explain how to construct a semantic definition in a sign language and how to handle multiple meanings of one form. We offer simple guidelines for forming proper examples of use in a sign language. Finally, we briefly comment on the process of translation between sign and spoken languages. We conclude the paper with a summary of the roles that Dictio plays in the ČZJ-signing community.
Title: Sign language lexicography: a case study of an online dictionary.