自然语言处理与大数据——基于本体的跨语言信息检索方法

J. Monti, Mario Monteleone, Maria Pia di Buono, Federica Marano
{"title":"自然语言处理与大数据——基于本体的跨语言信息检索方法","authors":"J. Monti, Mario Monteleone, Maria Pia di Buono, Federica Marano","doi":"10.1109/SocialCom.2013.108","DOIUrl":null,"url":null,"abstract":"Extracting relevant information in multilingual context from massive amounts of unstructured, structured and semi-structured data is a challenging task. Various theories have been developed and applied to ease the access to multicultural and multilingual resources. This papers describes a methodology for the development of an ontology-based Cross-Language Information Retrieval (CLIR) application and shows how it is possible to achieve the translation of Natural Language (NL) queries in any language by means of a knowledge-driven approach which allows to semi-automatically map natural language to formal language, simplifying and improving in this way the human-computer interaction and communication. The outlined research activities are based on Lexicon-Grammar (LG), a method devised for natural language formalization, automatic textual analysis and parsing. Thanks to its main characteristics, LG is independent from factors which are critical for other approaches, i.e. interaction type (voice or keyboard-based), length of sentences and propositions, type of vocabulary used and restrictions due to users' idiolects. The feasibility of our knowledge-based methodological framework, which allows mapping both data and metadata, will be tested for CLIR by implementing a domain-specific early prototype system.","PeriodicalId":129308,"journal":{"name":"2013 International Conference on Social Computing","volume":"84 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":"{\"title\":\"Natural Language Processing and Big Data - An Ontology-Based Approach for Cross-Lingual Information Retrieval\",\"authors\":\"J. Monti, Mario Monteleone, Maria Pia di Buono, Federica Marano\",\"doi\":\"10.1109/SocialCom.2013.108\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Extracting relevant information in multilingual context from massive amounts of unstructured, structured and semi-structured data is a challenging task. Various theories have been developed and applied to ease the access to multicultural and multilingual resources. This papers describes a methodology for the development of an ontology-based Cross-Language Information Retrieval (CLIR) application and shows how it is possible to achieve the translation of Natural Language (NL) queries in any language by means of a knowledge-driven approach which allows to semi-automatically map natural language to formal language, simplifying and improving in this way the human-computer interaction and communication. The outlined research activities are based on Lexicon-Grammar (LG), a method devised for natural language formalization, automatic textual analysis and parsing. Thanks to its main characteristics, LG is independent from factors which are critical for other approaches, i.e. interaction type (voice or keyboard-based), length of sentences and propositions, type of vocabulary used and restrictions due to users' idiolects. The feasibility of our knowledge-based methodological framework, which allows mapping both data and metadata, will be tested for CLIR by implementing a domain-specific early prototype system.\",\"PeriodicalId\":129308,\"journal\":{\"name\":\"2013 International Conference on Social Computing\",\"volume\":\"84 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-09-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"9\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2013 International Conference on Social Computing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SocialCom.2013.108\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 International Conference on Social Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SocialCom.2013.108","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 9

摘要

从大量的非结构化、结构化和半结构化数据中提取多语言环境下的相关信息是一项具有挑战性的任务。各种理论已经发展并应用于简化多元文化和多语言资源的获取。本文描述了一种基于本体的跨语言信息检索(CLIR)应用程序的开发方法,并展示了如何通过知识驱动的方法实现任何语言的自然语言(NL)查询的翻译,该方法允许半自动地将自然语言映射到形式语言,以这种方式简化和改进人机交互和通信。本文概述的研究活动是基于词典语法(Lexicon-Grammar, LG),这是一种用于自然语言形式化、自动文本分析和解析的方法。由于它的主要特点,LG独立于其他方法的关键因素,即交互类型(语音或基于键盘),句子和命题的长度,使用的词汇类型和用户的习惯限制。我们基于知识的方法框架的可行性,它允许映射数据和元数据,将通过实现一个领域特定的早期原型系统来测试CLIR。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Natural Language Processing and Big Data - An Ontology-Based Approach for Cross-Lingual Information Retrieval
Extracting relevant information in multilingual context from massive amounts of unstructured, structured and semi-structured data is a challenging task. Various theories have been developed and applied to ease the access to multicultural and multilingual resources. This papers describes a methodology for the development of an ontology-based Cross-Language Information Retrieval (CLIR) application and shows how it is possible to achieve the translation of Natural Language (NL) queries in any language by means of a knowledge-driven approach which allows to semi-automatically map natural language to formal language, simplifying and improving in this way the human-computer interaction and communication. The outlined research activities are based on Lexicon-Grammar (LG), a method devised for natural language formalization, automatic textual analysis and parsing. Thanks to its main characteristics, LG is independent from factors which are critical for other approaches, i.e. interaction type (voice or keyboard-based), length of sentences and propositions, type of vocabulary used and restrictions due to users' idiolects. The feasibility of our knowledge-based methodological framework, which allows mapping both data and metadata, will be tested for CLIR by implementing a domain-specific early prototype system.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
A Novel Group Recommendation Algorithm with Collaborative Filtering Access Control Policy Extraction from Unconstrained Natural Language Text Stock Market Manipulation Using Cyberattacks Together with Misinformation Disseminated through Social Media Friendship Prediction on Social Network Users An Empirical Comparison of Graph Databases
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1