Information system for extraction of information from open web resources

Vìsnik Nacìonalʹnogo unìversitetu "Lʹvìvsʹka polìtehnìka". Serìâ Ìnformacìjnì sistemi ta merežì
Pub Date: 2022-12-15
DOI: 10.23939/sisn2022.12.141

Petro Zdebskyi, A. Berko, L. Chyrun


Abstract

The purpose of the work is to design an information and reference system that finds answers to questions posed in the superlative degree (the highest degree of comparison), using text content from open English-language web resources. Examples of such questions are “What is the best book ever?” and “What is the most popular IDE for Python?”. The system outputs a list of answer options ranked by how frequently each option appears. Each element of the list also carries a numerical estimate of the probability that the given answer is preferred over the others, and the results are ranked by this metric. The system works with questions that have no unequivocal answer, which distinguishes it from classic question-answering (QA) systems. The latter assume that a question has exactly one true answer and often work with well-known facts, such as the date of birth of a famous person or the population of a certain country. The proposed system instead answers subjective questions, for example “What is the best book in the fantasy genre?” or “What is the best programming language?”, and bases its result on the popularity of each answer. Proper names, identified through N-gram analysis, also serve as keywords for forming the answer to the question.
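The frequency-based ranking described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how candidates are extracted or how the preference probability is computed. The sketch assumes that candidate answers are runs of up to three capitalized words (a crude stand-in for N-gram analysis of proper names), that words from the question and a small hypothetical stop list are not answers, and that an answer's share of all counted mentions can serve as a naive preference score. The names `candidate_names`, `rank_answers`, and `STOPWORDS` are illustrative only.

```python
import re
from collections import Counter

# Hypothetical stop list: capitalized sentence openers that are not names.
STOPWORDS = {"i", "the", "my", "it", "a", "an", "but", "most", "many"}


def candidate_names(text, excluded, max_n=3):
    """Return runs of 1..max_n consecutive capitalized tokens as candidates."""
    candidates = []
    for chunk in re.split(r"[.,;:!?]", text):  # punctuation ends a run
        run = []
        # The trailing "" is a sentinel that flushes the last run of the chunk.
        for tok in re.findall(r"[A-Za-z][\w+#]*", chunk) + [""]:
            if tok[:1].isupper() and tok.lower() not in excluded:
                run.append(tok)
                continue
            if 0 < len(run) <= max_n:
                candidates.append(" ".join(run))
            run = []
    return candidates


def rank_answers(question, snippets, max_n=3):
    """Rank candidates by frequency; share = count / total (assumed metric)."""
    question_words = {w.lower() for w in re.findall(r"[A-Za-z][\w+#]*", question)}
    excluded = STOPWORDS | question_words
    counts = Counter()
    for snippet in snippets:
        # Count each candidate once per snippet so one page cannot dominate.
        counts.update(set(candidate_names(snippet, excluded, max_n)))
    total = sum(counts.values())
    if total == 0:
        return []
    return [(name, n, n / total) for name, n in counts.most_common()]


if __name__ == "__main__":
    # Made-up snippets standing in for text retrieved from web resources.
    snippets = [
        "I think PyCharm is the best IDE for Python.",
        "Most developers use VS Code, but PyCharm is great too.",
        "PyCharm and VS Code are both popular.",
        "I prefer PyCharm.",
    ]
    question = "What is the most popular IDE for Python?"
    for name, count, share in rank_answers(question, snippets):
        print(f"{name}: {count} mentions, share {share:.2f}")
```

Counting each candidate once per snippet is a design choice of this sketch, so that a single page repeating one name does not dominate the ranking. A real system would likely need proper named-entity recognition and a better-founded probability estimate.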


