A Systematic Mapping Study of Language Features Identification from Large Text Collection

D. Mati, Jaumin Ajdari, Bujar Raufi, Mentor Hamiti, B. Selimi
Published in: 2019 8th Mediterranean Conference on Embedded Computing (MECO)
Publication date: 2019-06-10
DOI: 10.1109/MECO.2019.8760042 (https://doi.org/10.1109/MECO.2019.8760042)
Citations: 5

Abstract

Natural Language Processing (NLP) is an emerging research area. NLP resources are particularly useful for building machine translation systems between language pairs, a solution that aims to overcome language barriers. On this premise, we focus our research on feature identification from large text collections in the Albanian language. Rule-based or statistical Part-of-Speech (POS) taggers could be used for this purpose, but they require either considerable time for rule development or a sufficient amount of manually labelled data. In light of this, this research explores numerous cases that are conducive to the progress and further development of the field. One goal of this paper is to conduct a systematic review that explores and analyzes existing research targeting low-resource languages such as Albanian. Based on a prior survey of research published since 2015, we focus on studies in areas relevant to Natural Language Processing. Given the considerable volume of related research, it is essential to conduct a review and outline the state of research and current developments in this specific but important field.