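Development of real size IT systems with language competence as a challenge for a Less-Resourced Language: a methodological proposal for Indo-Aryan languages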

Journal of Information and Telecommunication (IF 2.7, Q2, Computer Science, Information Systems) | Pub Date: 2021-10-02 | DOI: 10.1080/24751839.2021.1966236
Z. Vetulani, Grazyna Vetulani, P. Mohanty
Citations: 1

Development of real size IT systems with language competence as a challenge for a Less-Resourced Language: a methodological proposal for Indo-Aryan languages
Abstract

In this paper, based on the example of our early works for Polish, we want to share our experience in the challenging task of developing NLP-based technologies in the situation of initial scarcity of digital language resources that ranked Polish among the Less-Resourced Languages. We present some of our projects aiming at language resources and tools we had to create in order to be able to process texts in Polish and develop real-scale systems with language understanding competence. The case study we present here is the rule-based system POLINT-112-SMS for improving information management in emergency situations. We argue in favour of the lexicon-grammar approach to the formal description of inflecting languages and present our current work on this grammatical paradigm. Our current work is on the implementation of the ideas presented in the first part of the paper on three prominent Indian languages, that is, Hindi, Odia, and Bengali.
Source journal
CiteScore: 7.50
Self-citation rate: 0.00%
Articles published: 18
Review time: 27 weeks