Zheni Mincheva, Nikola Vasilev, Ventsislav Nikolov, A. Antonov
Title: Extracting Structured Data from Text in Natural Language
DOI: https://doi.org/10.11648/J.IJIIS.20211004.16
Journal: International Journal of Intelligent Information and Database Systems
Publication date: 2021-08-31
Publication type: Journal Article
Citation count: 0
Abstract
The amount of information on the web is tremendous. A large part of it is presented as articles, descriptions, posts, and comments, i.e., free text in natural language, which is hard to use in that form. In structured form, by contrast, it could serve many purposes. This paper therefore proposes an approach for extracting data given as free text in natural language into structured data, for example a table. Structured information is easy to search and analyze. Structured data is quantitative, while unstructured data is qualitative. A tool that converts text into structured data will not only provide an automatic mechanism for data extraction but will also save considerable resources in processing and storing the extracted data. Extracting data from text will also automate the derivation of useful insights from data that is usually processed by people. The efficiency and accuracy of the process will increase, the probability of human error will be minimized, and the amount of data processed will no longer be limited by human resources.
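As a rough illustration of the idea in the abstract (not the authors' method, which this page does not describe), the following minimal Python sketch turns free text into table rows using a hand-written regular expression. The pattern, the field names `name`, `age`, and `city`, the helper functions `extract_rows` and `to_csv`, and the sample text are all hypothetical and chosen only for this example.

```python
import csv
import io
import re
from typing import Dict, List

# Hypothetical pattern, assumed only for this illustration:
# "<Name> is <age> years old and lives in <City>"
PATTERN = re.compile(
    r"(?P<name>[A-Z][a-z]+) is (?P<age>\d+) years old "
    r"and lives in (?P<city>[A-Z][a-z]+)"
)


def extract_rows(text: str) -> List[Dict[str, object]]:
    """Return one row (dict) per sentence that matches PATTERN."""
    rows = []
    for match in PATTERN.finditer(text):
        rows.append(
            {
                "name": match.group("name"),
                "age": int(match.group("age")),  # numeric field becomes an int
                "city": match.group("city"),
            }
        )
    return rows


def to_csv(rows: List[Dict[str, object]]) -> str:
    """Serialize the extracted rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "age", "city"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Sample free text (invented); the middle sentence carries no extractable record.
text = (
    "Maria is 34 years old and lives in Sofia. She likes hiking. "
    "Ivan is 41 years old and lives in Plovdiv."
)
rows = extract_rows(text)
print(to_csv(rows))
```

Once the text is in row form, it can be searched, filtered, or aggregated with ordinary table tools. A fixed pattern like this one only covers sentences written in exactly that shape, so real free text would need a far more flexible extraction step.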
About the journal:
Intelligent information systems and intelligent database systems are a rapidly developing field of computer science. IJIIDS provides a medium for exchanging scientific research and technological achievements of the international community. It focuses on research into applications of advanced intelligent technologies for storing and processing data in a wide range of contexts. The issues addressed by IJIIDS involve solutions to real-life problems in which intelligent technologies are needed to achieve effective results. The emphasis of the reported work is on new and original research and technological developments rather than on applying existing technology to different data sets.