
Latest Publications in NaUKMA Research Papers. Computer Science

Parking Spot Occupancy Classification Using Deep Learning
Pub Date : 2023-02-24 DOI: 10.18523/2617-3808.2022.5.72-78
Taras Kreshchenko, Yury Yushchenko
In today’s world, where almost every family has a car, the parking problem plays an extremely important role. Parking is one of the most important factors in modern transport infrastructure, because it saves the time of both drivers and passengers and increases the comfort and safety of road trips. In Ukraine this problem is especially relevant, since the country is currently improving its parking infrastructure. The paper examines the problem of parking in large cities and proposes a system for recognizing the occupancy of parking spots using computer vision. Such a system would use a camera feed to track the occupancy of each parking space within a lot. Its benefits would include ease of scalability, saving drivers’ and passengers’ time, automation of parking payment, and detection of unpaid parking. In addition, it makes it easy to collect statistics on how busy various areas are throughout the day or week. The paper also describes the parking spot classification algorithm as well as a possible architecture for the system. Possible problems in training a computer vision model for the proposed system are considered. First, the available parking datasets lack images collected in snowy conditions or at night. The hypothesized solution is to use vehicle detection datasets, of which considerably more are publicly available. Another problem is that classification accuracy drops drastically when the training and test datasets contain different images. The hypothesized solution here is to apply incremental learning to improve the model as it is used in a real-life scenario.
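The abstract does not include an implementation; as a rough, hypothetical sketch of the kind of per-spot classifier it describes, the snippet below fine-tunes the head of a pretrained CNN to label cropped parking-spot images as free or occupied. PyTorch/torchvision and the folder layout `spots/free`, `spots/occupied` are assumptions, not details from the paper.

```python
# Hypothetical sketch of a binary parking-spot classifier (not the authors' code).
# Assumes PyTorch + torchvision and a folder of spot crops: spots/free, spots/occupied.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

preprocess = transforms.Compose([
    transforms.Resize((128, 128)),          # spot crops come in many sizes
    transforms.ToTensor(),
])

dataset = datasets.ImageFolder("spots", transform=preprocess)
loader = DataLoader(dataset, batch_size=32, shuffle=True)

model = models.resnet18(weights="IMAGENET1K_V1")   # reuse ImageNet features
model.fc = nn.Linear(model.fc.in_features, 2)      # two classes: free / occupied
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(3):                              # a few epochs, for illustration only
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
```

In a deployed system of this kind, each camera frame would be cropped to the known spot coordinates and every crop passed through the trained model.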
{"title":"Parking Spot Occupancy Classification Using Deep Learning","authors":"Taras Kreshchenko, Yury Yushchenko","doi":"10.18523/2617-3808.2022.5.72-78","DOIUrl":"https://doi.org/10.18523/2617-3808.2022.5.72-78","url":null,"abstract":"In today’s world, where a car is present in almost every family, the parking problem plays an extremely important role. Parking is one of the most important factors in modern transport infrastructure, because it allows to save the time of both drivers and passengers, to increase the level of comfort and safety of road trips. In Ukraine, this problem is especially relevant, since nowadays it is going through the process of improving its parking infrastructure.The paper examines the problem of parking in large cities, proposes a system for recognizing occupancy of parking spots using computer vision. Such system would use camera feed to track the occupancy of each parking space within a slot. Its benefits would include ease of scalability, saving time of drivers and passengers, automation of parking payment and detection of unpaid parkings. In addition, it makes it possible to easily collect statistics about the busyness of various areas throughout the day or week.The paper also describes the algorithm of classifying the parking spot, as well as a possible architecture that the system may have.Possible problems in training a computer vision model for building the proposed system are considered. Firstly, the available parking datasets are lacking images collected in snow conditions or during nighttime. The hypothesized solution is to use vehicle detection datasets, the number of which that are publicly available is considerably bigger. Another problem is that classification accuracy drops drastically when using different images in train and test dataset. The hypothesized solution here is to apply incremental learning to improve the model as it is being used in a real-life scenario.","PeriodicalId":433538,"journal":{"name":"NaUKMA Research Papers. Computer Science","volume":"146 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121522525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Usage of the Speech Disfluency Detection Method for the Machine Translation of the Transcriptions of Spoken Language
Pub Date : 2023-02-24 DOI: 10.18523/2617-3808.2022.5.54-61
A. Kramov, S. Pogorilyy
Neural machine translation falls into the category of natural language processing tasks. Despite the large number of research papers devoted to improving the quality of machine translation of documents, the translation of spoken language that contains elements of disfluent speech remains an open task, especially for low-resource languages such as Ukrainian. This paper considers the neural machine translation of transcriptions of spoken language that incorporate different elements of disfluent speech, for the case of translation from English to Ukrainian. Different methods and software libraries for detecting elements of disfluent speech in English texts have been analyzed. Due to the lack of open-access corpora of speech disfluency samples, a new synthetic labeled corpus has been created. The corpus contains both the original version of a document and its modified versions covering different types of speech disfluency: filler words (uh, ah, etc.) and phrases (you know, I mean), and reparandum-repair pairs (cases when a speaker corrects himself during the speech). The effectiveness of using the disfluency detection method to improve machine translation of spoken language has been verified experimentally for the English-Ukrainian language pair. It has been shown that the current state-of-the-art neural translation models cannot produce an appropriate translation of the elements of speech disfluency, especially in the reparandum-repair cases. The results obtained may indicate that the mentioned disfluency detection method can be used to preprocess transcriptions of spoken dialogues so that coherent translations can be produced with different neural machine translation models.
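As a deliberately simplistic illustration of the preprocessing idea (the paper uses a trained disfluency detector, not rules), the sketch below strips a few English filler words and phrases from a transcript before it would be passed to a translation model; the filler list and function name are illustrative only.

```python
import re

# Illustrative only (not the paper's method): strip simple English fillers from a
# transcript before passing it to a machine translation model. A trained disfluency
# detector would also handle reparandum-repair pairs, which rules like these miss.
FILLERS = [r"\buh\b", r"\bum\b", r"\bah\b", r"\byou know\b", r"\bi mean\b"]
FILLER_RE = re.compile("|".join(FILLERS), flags=re.IGNORECASE)

def strip_fillers(transcript: str) -> str:
    cleaned = FILLER_RE.sub(" ", transcript)
    return re.sub(r"\s{2,}", " ", cleaned).strip()   # collapse leftover spaces

print(strip_fillers("Well, uh, I mean we should, you know, leave at six"))
# -> "Well, , we should, , leave at six"  (punctuation cleanup omitted for brevity)
```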
{"title":"Usage of the Speech Disfluency Detection Method for the Machine Translation of the Transcriptions of Spoken Language","authors":"A. Kramov, S. Pogorilyy","doi":"10.18523/2617-3808.2022.5.54-61","DOIUrl":"https://doi.org/10.18523/2617-3808.2022.5.54-61","url":null,"abstract":"Neural machine translation falls into the category of natural language processing tasks. Despite the availability of a big number of research papers that are devoted to the improvement of the quality of the machine translation of documents, the problem of the translation of the spoken language that contains the elements of the disfluency speech is still an actual task, especially for low-resource languages like the Ukrainian language. In this paper, the problem of the neural machine translation of the transcription results of the spoken language that incorporate different elements of the disfluency speech has been considered in the case of the translation from the English language to the Ukrainian language. Different methods and software libraries for the detection of the elements of disfluency speech in English texts have been analyzed. Due to the lack of open-access corpora of the speech disfluency samples, a new synthetic labeled corpus has been created. The created corpus contains both the original version of a document and its modified version according to the different types of speech disfluency: filler words (uh, ah, etc.) and phrases (you know, I mean), reparandum-repair pairs (cases when a speaker corrects himself during the speech). The experimental verification of the effectiveness of the usage of the method of disfluency speech detection for the improvement of the machine translation of the spoken language has been performed for the pair of English and Ukrainian languages. It has been shown that the current state-of-the-art neural translation models cannot produce the appropriate translation of the elements of speech disfluency, especially, in the reparandum-repair cases. The results obtained may indicate that the mentioned method of disfluency speech detection can be used for the previous processing of the transcriptions of spoken dialogues for the creation of coherent translations by the usage of the different models of neural machine translation.","PeriodicalId":433538,"journal":{"name":"NaUKMA Research Papers. Computer Science","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131321301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Automatic Generation of Ontologies Based on Articles Written in Ukrainian Language
Pub Date : 2023-02-24 DOI: 10.18523/2617-3808.2022.5.12-15
O. Zhezherun, M. Ryepkin
The article presents a system capable of generating new ontologies or supplementing existing ones based on articles written in Ukrainian. Ontologies are described, and an algorithm suitable for automated concept extraction from natural language texts is presented. Ontology as a technology has become an increasingly important topic in contemporary research. Since the creation of the Semantic Web, ontologies have become a solution to many problems of natural language understanding by computers. If an ontology existed and was used to analyze documents, we would have systems that could answer very complex queries in natural language. Google’s success showed that simply loading HTML pages is much easier than marking everything up with semantic markup, which wastes human intellectual resources. To address this problem, a new direction in the ontological field, called ontological engineering, has appeared. This direction studies ways of automating the generation of knowledge that an ontology consolidates from text. Humanity generates more data every day than the day before. One of the main considerations today in choosing technologies for new projects is whether they can cope with this ever-growing flow of data. Because of this, some technologies, such as machine learning, come to the fore, while others recede to the periphery because they cannot adapt to modern needs in time, as happened with ontologies. The main reasons for the decline in the popularity of ontologies were the need to hire experts for their construction and the lack of methods for automated ontology construction. This article considers the problem of automated ontology generation using articles from the Ukrainian Wikipedia, with geometry taken as an example subject area. A system was built that collects data, analyzes it, and forms an ontology from it.
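The article’s own extraction pipeline works on Ukrainian Wikipedia text; purely to illustrate how extracted concept relations can be assembled into an ontology, the sketch below uses rdflib with hand-written geometry triples standing in for automatically extracted ones (the namespace and facts are invented).

```python
# Illustration only: hand-written (subject, relation, object) facts stand in for
# concepts an extraction pipeline would pull from Ukrainian Wikipedia articles.
from rdflib import Graph, Namespace, RDF, RDFS

GEO = Namespace("http://example.org/geometry#")   # hypothetical namespace
g = Graph()
g.bind("geo", GEO)

extracted = [
    (GEO.Square,    RDFS.subClassOf, GEO.Rectangle),
    (GEO.Rectangle, RDFS.subClassOf, GEO.Quadrilateral),
    (GEO.Rhombus,   RDFS.subClassOf, GEO.Quadrilateral),
]
for s, p, o in extracted:
    g.add((s, RDF.type, RDFS.Class))   # declare the concept
    g.add((s, p, o))                   # add the extracted relation

print(g.serialize(format="turtle"))    # the resulting mini-ontology
```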
{"title":"Automatic Generation of Ontologies Based on Articles Written in Ukrainian Language","authors":"O. Zhezherun, M. Ryepkin","doi":"10.18523/2617-3808.2022.5.12-15","DOIUrl":"https://doi.org/10.18523/2617-3808.2022.5.12-15","url":null,"abstract":"The article presents a system capable of generating new ontologies or supplementing existing ones based on articles in Ukrainian. Ontologies are described and an algorithm suitable for automated concept extraction from natural language texts is presented.Ontology as a technology has become an increasingly important topic in contemporary research. Since the creation of the Semantic Web, ontology has become a solution to many problems of understanding natural language by computers. If an ontology existed and was used to analyze documents, then we would have systems that could answer very complex queries in natural language. Google’s success showed that loading HTML pages is much easier than marking everything with semantic markup, wasting human intellectual resources. To find a solution to this problem, a new direction in the ontological field, called ontological engineering, has appeared. This direction began to study ways of automating the generation of knowledge, which would be consolidated by an ontology from the text.Humanity generates more data every day than yesterday. One of the main levers today in the choice of technologies for the implementation of new projects is whether it can cope with this flow of data, which will increase every day. Because of this, some technologies come to the fore, such as machine learning, while others recede to the periphery, due to the impossibility or lack of time to adapt to modern needs, as happened with ontologies. The main reason for the decrease in the popularity of ontologies was the need to hire experts for its construction and the lack of methods for automated construction of ontologies.This article considers the problem of automated ontology generation using articles from the Ukrainian Wikipedia, and geometry was taken as an example of the subject area. A system was built that collects data, analyzes it, and forms an ontology from it.","PeriodicalId":433538,"journal":{"name":"NaUKMA Research Papers. Computer Science","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133955159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Software System of Checking for Plagiarism of Ukrainian Texts
Pub Date : 2023-02-24 DOI: 10.18523/2617-3808.2022.5.16-25
A. Hlybovets, Mykola Bikchentaev
The purpose of this work is to describe the methodology of building a software system (application) for plagiarism checking of scientific publications in the Ukrainian language using two machine learning models, Word2Vec and BERT. We consider the detection of external plagiarism in Ukrainian texts. Plagiarism is usually defined as passing off someone else’s ideas as one’s own. As the Internet becomes more accessible every day, a huge amount of data becomes available to people. Nowadays it is quite easy to find a suitable study and plagiarize it instead of developing one’s own from scratch. Plagiarism undermines the efforts of the researcher whose work has been plagiarized and lets the plagiarist take undeserved credit; such a person can be harmful when appointed to an important position. Many fields, including research and education, are susceptible to plagiarism. Plagiarism can also take many forms: from straightforward copy-pasting to paraphrasing and sentence restructuring. This makes plagiarism a rather complex problem, where methods based on finding shared words between documents, such as longest common subsequence or n-grams, might not work. Therefore, we consider applying deep learning to the problem of plagiarism detection. In this article we discuss the concept of plagiarism and list its types. Two machine learning models are proposed for plagiarism detection: Word2Vec and BERT. We also provide an overview of both models and describe how they can be used for plagiarism detection. A web application for plagiarism detection in the Ukrainian language has been developed. It uses React, a JavaScript framework, on the frontend and Python on the backend; MongoDB is used to store application data. The application allows a user to input a text that is compared with the texts in the application database using cosine similarity or Euclidean distance as metrics. The comparison is performed on word embeddings calculated by a pre-trained BERT or Word2Vec model. A user can choose the model and the similarity metric in the application’s UI. The application can be further improved to not only output a similarity metric but also highlight similar sentences in the texts.
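To make the comparison step concrete, here is a minimal sketch of cosine and Euclidean comparison over document embeddings; the toy word-vector table merely stands in for the pre-trained Word2Vec or BERT model the application actually uses, so the numbers are purely illustrative.

```python
import numpy as np

# Toy word-vector table standing in for a pre-trained Word2Vec/BERT model.
WORD_VECS = {
    "plagiarism": np.array([0.9, 0.1, 0.0]),
    "copying":    np.array([0.8, 0.2, 0.1]),
    "weather":    np.array([0.0, 0.1, 0.9]),
}

def embed(text: str) -> np.ndarray:
    """Mean of known word vectors; a real system would use the full model."""
    vecs = [WORD_VECS[w] for w in text.lower().split() if w in WORD_VECS]
    return np.mean(vecs, axis=0) if vecs else np.zeros(3)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def euclidean(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.linalg.norm(a - b))

doc, suspect = "plagiarism copying", "copying weather"
print(cosine(embed(doc), embed(suspect)), euclidean(embed(doc), embed(suspect)))
```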
{"title":"Software System of Checking for Plagiarism of Ukrainian Texts","authors":"A. Hlybovets, Mykola Bikchentaev","doi":"10.18523/2617-3808.2022.5.16-25","DOIUrl":"https://doi.org/10.18523/2617-3808.2022.5.16-25","url":null,"abstract":"The purpose of this work is to describe the methodology of building a software system (application) for plagiarism checking of scientific publications in the Ukrainian language using two machine learning models, Word2Vec and BERT. We consider the detection of external plagiarism in Ukrainian texts.Plagiarism is usually defined as the passing off someone else’s ideas as your own. As the Internet becomes more and more accessible every day, a huge amount of data becomes available to people. Nowadays, it is quite easy to find a suitable study and plagiarize it instead of developing one’s own from scratch.Plagiarism undermines the efforts of the researcher whose work has been plagiarized and gives the plagiarist the opportunity to over-praise himself; such a person can be detrimental when appointed to an important position.Many fields of life are susceptible to plagiarism, including research and education. Plagiarism can also take many forms: from straight up copy-paste to paraphrasing and sentence restructuring. This makes plagiarism a rather complex problem, where methods, such as longest common subsequence or n-grams, based on finding shared words between documents, might not work. Therefore, we might consider applying deep learning to the problem of plagiarism detection.In this article we discussed the concept of plagiarism and listed its types. Two machine learning models have been proposed for plagiarism detection: Word2Vec and BERT. We also provided an overview of both models and described how they could be used in the problem of plagiarism detection.A web application for plagiarism detection in the Ukrainian language has been developed. This application features React, a JavaScript framework, on the frontend and Python on the backend. To store application data, MongoDB is used.This application allows a user to input a text that will be compared with the texts from the application database using cosine similarity or Euclidean distance as metrics. Comparison is performed using word embeddings, calculated by pre-trained BERT or Word2Vec model. A user can choose the model and similarity metrics using the application’s UI.The application can be further improved to not only output similarity metric but also highlight the similar sentences in the texts.","PeriodicalId":433538,"journal":{"name":"NaUKMA Research Papers. Computer Science","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131920006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Analysis and Synthesis of Technology for Textual Information Classification
Pub Date : 2023-02-24 DOI: 10.18523/2617-3808.2022.5.49-53
Vladyslav A. Kuznetsov, I. Krak, Volodymyr Liashko, V. Kasianiuk
The task of developing effective text information classification systems requires thoughtful analysis and synthesis of the variable components of the technology. These components strongly affect practical efficiency and the requirements on the data. For this purpose, a typical technology was discussed, comparing the regular “learning from features” approach with the more advanced “deep learning” approach, which learns directly from data. To implement the technology, the first approach was tested; it included the means (methods, algorithms) for analyzing the features of the source text by applying a dimensionality transformation and building model solutions that allow correct classification of data by a set of features. As a result, all steps of the technology are described, which made it possible to determine how to present data in terms of the hidden features in the data, to present them in a standard visual form, and to evaluate the solution and its practical efficiency based on this set of features. In an in-depth study, the informational core of the document was examined using regression and T-stochastic grouping of features for dimensionality reduction. Separate results estimate the practical efficiency of the algorithms in terms of time and relative performance for each step of the proposed technology. This estimation makes it possible to select the best intelligent data processing algorithm for a given dataset and application. To determine the algorithm best suited for separation in the reduced dimension, an experiment was carried out that allowed selecting the best-performing family of data classification algorithms, in particular boosting methods. As a result of the analysis of the technology, its necessary steps were discussed and classification on real text data was conducted, which made it possible to identify the most important stages of the technology for text classification.
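As a hypothetical sketch of the “learning from features” route described above (feature extraction, a dimensionality transformation, then a boosting classifier), the following scikit-learn pipeline is shown on a toy corpus; the paper does not specify these exact components.

```python
# Sketch of the "learning from features" route: TF-IDF features, a dimensionality
# transformation, then a boosting classifier (scikit-learn assumed; data is toy).
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.ensemble import GradientBoostingClassifier

texts = ["invoice for server hosting", "football match report",
         "quarterly budget summary", "league standings update"]
labels = ["finance", "sport", "finance", "sport"]

clf = Pipeline([
    ("tfidf", TfidfVectorizer()),                  # features of the source text
    ("svd", TruncatedSVD(n_components=2)),         # reduced feature space
    ("boost", GradientBoostingClassifier(n_estimators=50)),
])
clf.fit(texts, labels)
print(clf.predict(["annual budget for the football league"]))
```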
{"title":"Analysis and Synthesis of Technology for Textual Information Classification","authors":"Vladyslav A. Kuznetsov, I. Krak, Volodymyr Lіashko, V. Kasianiuk","doi":"10.18523/2617-3808.2022.5.49-53","DOIUrl":"https://doi.org/10.18523/2617-3808.2022.5.49-53","url":null,"abstract":"The task of developing effective text information classification systems requires the thoughtful analysis and synthesis of variable components of technology. These components strongly affect the practical efficiency and the requirements to the data. For this purpose, a typical technology was discussed, comparing the regular “learning from features” approach versus the more advanced “deep learning” approach, that studies from data. In order to implement the technology, the first approach was tested, which included the means (methods, algorithms) for analysis of the features of the source text, by applying the dimensionality transformation, and building model solutions that allow the correct classification of data by a set of features. As a result, all the steps of the technology are described, which allowed to determine the way of presenting data in terms of hidden features in data, their presentation in a standard visual form and evaluate the solution, as well as its practical efficiency, based on this set of features. In a depth study, the informational core of the document was studied, using the regression and T-stochastic grouping of features for dimensionality reduction.The separate results contain estimation of practical efficiency of the algorithms in terms of time and relative performance for each step of the proposed technology. This estimation gives a possibility to obtain the best algorithm of intelligent data processing that is useful for a given dataset and application. In order to estimate the best suited algorithm for separation in reduced dimension an experiment was carried out which allowed the selection of the best range of data classification algorithms, in particular boosting methods. As a result of the analysis of the technology, the necessary steps of this technology were discussed and the classification on real text data was conducted, which allowed to identify the most important stages of the technology for text classification.","PeriodicalId":433538,"journal":{"name":"NaUKMA Research Papers. Computer Science","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130903571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Bicycle Protection System Using GPS/GSM Modules and Radio Protocol
Pub Date : 2023-02-24 DOI: 10.18523/2617-3808.2022.5.41-44
S. Gorokhovskyi, Anton Alieksieiev
Bicycle security systems have not developed as much as home security, and it is difficult to find competitive examples when researching the market. Many security systems on the market have weaknesses that can be bypassed or are not convenient to use. The technologies used to protect bicycles are rather uniform, predictable, and unreliable. Most such systems lack convenient means of monitoring, such as a mobile application. Improving these systems and introducing new technologies is therefore highly relevant in the field of bicycle protection, given the unpopularity of existing systems, their unreliability, and the lack of control from a phone. Most bicycle users are inclined to rely on proven methods such as bicycle locks, but this choice is mistaken. A GPS-based system is not so easy to defeat: it has more than one level of protection and quickly warns the user about a threat. It also has deterrents and means of attracting the attention of others. In addition, the use of GSM technology enables control through a mobile application, which simplifies working with the system. Using GPS is the best way to monitor the position of the bicycle in space and to track its movement in unpredictable circumstances. GPS opens a number of possibilities and increases the functionality of the system, from monitoring the state of the protected object to collecting statistics. The GSM module is almost never used in bicycle security systems. This stems from the usual concept of bike guarding: why use the ability to transmit data to any corner of the world if the user never moves more than 100 meters from the guarded object? But this concept is wrong. GSM is one of the fastest solutions among its analogs, but transmission speed is not the only criterion for information transmission in wireless systems. Since the bicycle is a moving object and the security system must be wireless, an important criterion for the functioning of such a system is its operating time. This article deals with the problem of protecting a moving object using GSM and GPS modules. The main features of existing systems in this area, with their advantages and disadvantages, are shown. The advantages of using a radio protocol for bicycle protection are given. A model of the system that meets the needs of the user has been developed.
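To illustrate the core GPS check such a system performs, here is a small haversine-based geofence sketch that flags when the bike moves beyond a radius around the point where it was locked; the 100-meter threshold echoes the guarding concept mentioned above, while the coordinates are invented.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two GPS fixes."""
    r = 6_371_000  # Earth radius, m
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

GUARD_RADIUS_M = 100            # illustrative threshold from the guarding concept
locked_at = (50.4646, 30.5190)  # made-up coordinates of the parked bike

def check_fix(lat, lon):
    """Return True if the current GPS fix means the bike left the guarded zone."""
    return haversine_m(*locked_at, lat, lon) > GUARD_RADIUS_M

print(check_fix(50.4652, 30.5195))  # roughly 75 m away  -> False
print(check_fix(50.4700, 30.5300))  # roughly 1 km away  -> True, alarm via GSM
```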
{"title":"Bicycle Protection System Using GPS/GSM Modules аnd Radio Protocol","authors":"S. Gorokhovskyi, Аnton Аlieksieiev","doi":"10.18523/2617-3808.2022.5.41-44","DOIUrl":"https://doi.org/10.18523/2617-3808.2022.5.41-44","url":null,"abstract":"Bicycle security systems have not developed as much as home security, and it is difficult to find competitive examples when researching the market. Many security systems on the market have weaknesses that can be bypassed or are not convenient to use. The technologies used to protect bicycles are rather uniform, predictable and not reliable. Most of such systems do not have convenient means of monitoring, such as, for example, a mobile application. Improvement of these systems, introduction of new technologies is very relevant in the field of bicycle protection. This is due to the unpopularity of these systems, their unreliability and lack of control over the phone. The majority of bicycle users are inclined to use proven methods – bicycle locks. But this decision is wrong.The system with GPS is so easy not to be deceived – it has more than one level of protection, and quickly warns the user about a threat. It has deterrents and means of attracting the attention of others.In addition, the use of GSM technology facilitates the possibility of control through a mobile application, which simplifies work with the system.Using GPS is the best way to monitor the position of the bicycle in space, and to track movement in unpredictable circumstances. GPS opens a number of possibilities and increases the functionality of the system. From monitoring the situation of the protection object to collecting statistics].The GSM module is almost never used in bicycle security systems. This is due to the concept of bike guarding, which says why use the ability to transmit data to any corner of the world if the user does not move more than 100 meters from the guarded object. But this concept is wrong. GSM is one of the fastest solutions among analogs. But transmission speed is not the only criterion for information transmission in wireless systems.Since the bicycle is a moving object, and the security system must be wireless, an important criterion for the functioning of such a system is the operating time.This article deals with the problem of protecting a moving object, using GSM and GPS modules. The main features of existing systems in this area, their advantages and disadvantages are shown. The advantages of using a radio protocol for bicycle protection are given. A model of the system that meets the needs of the user has been developed.","PeriodicalId":433538,"journal":{"name":"NaUKMA Research Papers. Computer Science","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117184969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Methods of Learning the Structure of the Bayesian Network
Pub Date : 2021-12-10 DOI: 10.18523/2617-3808.2021.4.56-59
A. Salii
Sometimes in practice it is necessary to calculate the probability of an uncertain cause, taking into account some observed evidence. For example, we would like to know the probability of a particular disease when we observe the patient’s symptoms. Such problems are often complex, with many interrelated variables: there may be many symptoms and even more potential causes. In practice, it is usually possible to obtain only the inverse conditional probability, the probability of the evidence given the cause, that is, the probability of observing the symptoms if the patient has the disease. Intelligent systems must reason about their environment. For example, a robot needs to know about the possible outcomes of its actions, and a medical expert system needs to know which causes lead to which consequences. Intelligent systems began to use probabilistic methods to deal with the uncertainty of the real world. Instead of building a special system of probabilistic reasoning for each new program, we would like a common framework that allows probabilistic reasoning in any new program without rebuilding everything from scratch. This justifies the relevance of the developed genetic algorithm. Bayesian networks, which first appeared in the work of Judea Pearl and his colleagues in the late 1980s, offer just such an independent basis for plausible reasoning. This article presents a genetic algorithm for learning the structure of a Bayesian network that searches the space of graphs using mutation and crossover operators. The algorithm can be used as a quick way to learn the structure of a Bayesian network with as few constraints as possible.
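The motivating example in the abstract (recovering the probability of a disease from the inverse conditional probability of its symptom) reduces to Bayes’ theorem; the short sketch below works it through with invented numbers and is not part of the paper’s genetic algorithm.

```python
# Bayes' theorem on the abstract's motivating example; all numbers are invented.
def posterior(p_symptom_given_disease, p_disease, p_symptom_given_healthy):
    """P(disease | symptom) from the 'inverse' probabilities that are easy to obtain."""
    p_healthy = 1.0 - p_disease
    p_symptom = (p_symptom_given_disease * p_disease
                 + p_symptom_given_healthy * p_healthy)   # law of total probability
    return p_symptom_given_disease * p_disease / p_symptom

# 90% of patients with the disease show the symptom, 1% prevalence,
# 5% of healthy people show the same symptom.
print(posterior(0.9, 0.01, 0.05))   # ~0.154
```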
{"title":"Methods of Learning the Structure of the Bayesian Network","authors":"A. Salii","doi":"10.18523/2617-3808.2021.4.56-59","DOIUrl":"https://doi.org/10.18523/2617-3808.2021.4.56-59","url":null,"abstract":"Sometimes in practice it is necessary to calculate the probability of an uncertain cause, taking into account some observed evidence. For example, we would like to know the probability of a particular disease when we observe the patient’s symptoms. Such problems are often complex with many interrelated variables. There may be many symptoms and even more potential causes. In practice, it is usually possible to obtain only the inverse conditional probability, the probability of evidence giving the cause, the probability of observing the symptoms if the patient has the disease.Intelligent systems must think about their environment. For example, a robot needs to know about the possible outcomes of its actions, and the system of medical experts needs to know what causes what consequences. Intelligent systems began to use probabilistic methods to deal with the uncertainty of the real world. Instead of building a special system of probabilistic reasoning for each new program, we would like a common framework that would allow probabilistic reasoning in any new program without restoring everything from scratch. This justifies the relevance of the developed genetic algorithm. Bayesian networks, which first appeared in the work of Judas Pearl and his colleagues in the late 1980s, offer just such an independent basis for plausible reasoning.This article presents the genetic algorithm for learning the structure of the Bayesian network that searches the space of the graph, uses mutation and crossover operators. The algorithm can be used as a quick way to learn the structure of a Bayesian network with as few constraints as possible.learn the structure of a Bayesian network with as few constraints as possible.","PeriodicalId":433538,"journal":{"name":"NaUKMA Research Papers. Computer Science","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115211021","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Towards Creating a Static Design Pattern for Double Dispatching Model Signatures
Pub Date : 2021-12-10 DOI: 10.18523/2617-3808.2021.4.64-71
Volodymyr Boublik
The paper investigates the possibility of developing a non-virtual hierarchy for a special case of class signature that may admit different interpretations. The approach is similar to double dispatching in the C++ programming language. As an alternative to polymorphism, a non-polymorphic hierarchy based on generic programming templates has been suggested. This hierarchy relies on inverse parametrization of templates, enabling the construction of a general scheme for the design pattern. The pattern defines a class architecture suitable for a static implementation of a double-dispatched multimethod for a special case of signature-defined interfaces. In fact, any abstract base class (interface) with purely virtual operations must acquire a polymorphic implementation. Besides polymorphism itself, the dependence of a virtual function on two objects (“this” and another parameter) requires the use of double dispatch, turning a class member function into a double-dispatched multimethod. A preliminary consideration deals with issues of double dispatching in the C++ programming language when inheritance with polymorphic class member functions is used. This requires the special effort of adding to both base and derived classes a couple of virtual functions to support dispatching. In any case, this approach, besides using virtual functions, has the disadvantage of violating one of the SOLID principles, namely the dependency inversion principle: base classes should not depend on their derivatives, which negatively affects the quality of the software. Polymorphism is usually understood as the dynamic tuning of a program to the data type of the object that the program encounters during its execution. That is, by its nature, polymorphism is a purely dynamic characteristic. However, in C++ literature and in practice one can come across the term “static polymorphism”. At the same time, research into the possibilities of generic programming (templates) allows transferring some dynamic problems to the static level. In particular, a variant of static polymorphism without virtual functions can be considered. A variant of non-virtual double dispatching has been proposed, generalized in the form of a new design pattern, “Signature multimethod”. The use of the newly created pattern is illustrated with an example of implementing classes of complex numbers. The absence of violations of SOLID principles is shown, and the possibility of supplementing the hierarchy with new derived classes without interfering with the structure of the base class is demonstrated. The approach suggested in this work has been used in courses on object-oriented programming at the Faculty of Informatics of Kyiv-Mohyla Academy.
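The paper’s contribution is a static, template-based C++ pattern; purely to illustrate the double-dispatch problem it addresses (an operation whose behaviour depends on the runtime types of both operands), here is a minimal dynamic double-dispatch sketch using the complex-number example, written in Python for consistency with the other sketches on this page. It shows the problem, not the “Signature multimethod” solution.

```python
# Minimal dynamic double dispatch (visitor-style): the result of add() depends on
# the runtime types of BOTH operands. This only illustrates the problem the paper
# solves statically with C++ templates; it is not the "Signature multimethod" pattern.
import cmath

class Complex:
    def add(self, other): return other._add_to(self)        # first dispatch (on other)
    def _add_to(self, other): raise NotImplementedError     # second dispatch (on self)

class Cartesian(Complex):
    def __init__(self, re, im): self.re, self.im = re, im
    def _add_to(self, other):
        o = other.as_cartesian()
        return Cartesian(self.re + o.re, self.im + o.im)
    def as_cartesian(self): return self

class Polar(Complex):
    def __init__(self, r, phi): self.r, self.phi = r, phi
    def _add_to(self, other):
        return other.as_cartesian()._add_to(self.as_cartesian())
    def as_cartesian(self):
        z = cmath.rect(self.r, self.phi)
        return Cartesian(z.real, z.imag)

s = Cartesian(1, 2).add(Polar(1, 0))   # mixes both representations
print(s.re, s.im)                      # -> 2.0 2.0
```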
{"title":"Towards Creating a Static Design Pattern for Double Dispatching Model Signatures","authors":"Volodymyr Boublik","doi":"10.18523/2617-3808.2021.4.64-71","DOIUrl":"https://doi.org/10.18523/2617-3808.2021.4.64-71","url":null,"abstract":"The paper investigates a possibility of developing a non-virtual hierarchy for a special case of class signature, which may possess different interpretations. The approach is similar to double dispatching in the C ++ programming language. As an alternative to polymorphism, a non-polymorphic hierarchy has been suggested based on generic programming templates. This hierarchy is based on inverse parametrization for templates enabling constructing a general scheme for the design pattern. The pattern defined a class architecture suitable for static implementation of double dispatched multimethod for a special case of signature- defined interfaces.In fact, any abstract base class (interface) with purely virtual operations must acquire a polymorphic implementation. Besides, the polymorphism itself, the dependence of a virtual function on two objects – “this” and another parameter – requires the use of double dispatch, turning a class member function into a double dispatched multimethod.A preliminary consideration deals with issues of double dispatching in the C++ programming language. Inheritance with polymorphic class member functions is used. This requires special efforts of adding to both bases and derived classes a couple of virtual functions to support dispatching. In any case, this approach, besides using virtual functions, has a disadvantage of violating one of the SOLID principles, namely the principle of dependency inversion: base classes should not depend on derivatives, which negatively affects the quality of the software.Polymorphism is usually understood as the dynamic tuning of a program to the data type of the object that the program will encounter during its execution. That is, by its nature, polymorphism is a purely dynamic characteristic. However, in C++ literature and in practice, you can come across the term “static polymorphism”.At the same time, research of possibilities of generalized programming (templates) allows transferring some dynamic problems to the static level. In particular, a variant of static polymorphism application without virtual functions can be considered.A variant of non-virtual double scheduling has been proposed, generalized in the form of a created design pattern “Signature multimethod”. The use of the newly created pattern is illustrated with an example of implementing classes of complex numbers. The absence of violations of SOLID principles is shown, and the possibility of supplementing the hierarchy with new derived classes without the need to interfere with the structure of the base class is demonstrated.The approach suggested in this work has been used in courses in object-oriented programming at the Faculty of Informatics of Kyiv-Mohyla Academy.","PeriodicalId":433538,"journal":{"name":"NaUKMA Research Papers. 
Computer Science","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121749403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Implementation of a Graphic Interface Development Tool for Prolog
Pub Date : 2021-12-10 DOI: 10.18523/2617-3808.2021.4.108-112
N. Ivaniuk, A. Kucher, Yury Yuschenko
The work examines the current obstacles to wider use of logic programming in the development of commercial multi-platform software applications, and tools for convenient development of a modern graphical interface to logic programs. Libraries with similar concepts of use have been analyzed and described. The purpose of the proposed concept, which is implemented as an open source library, is described, and the advantages of the proposed tools over similar existing tools are indicated. The main feature and advantage of the proposed concept is that the Prolog business logic and the interface are connected by means of JavaScript child processes. The proposed concept of an interface to Prolog takes full advantage of the possibilities provided by async/await. A framework library has been created for the use of logic programming in graphical interface development without loss of application performance. The paper describes the proposed concept and the developed framework (library). Ways to further extend the purpose of the implemented library were identified, and directions for further simplifying, for programmers, the integration of a graphical interface with logic programs have been defined. A significant advantage of the proposed tool is its easy-to-use functions for wrapping requests to Prolog and checking their correctness. The main goal of the library is to create an environment in which Prolog developers can create any type of software that is user-friendly, fast, and cross-platform, using modern and flexible tools. This concept also tries to solve disadvantages and architectural problems found in other libraries. The safety of the library functionality has been analyzed. The concept of potential horizontal application scalability is described. Conclusions and the future of the library are presented, mentioning the use of TypeScript for type safety and the avoidance of run-time errors. Overall, the library extends the use of Prolog beyond logic programming and takes a leap forward in its progress.
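The described library is JavaScript-based; for consistency with the other sketches on this page, the same child-process idea is shown below in Python, spawning SWI-Prolog (swipl) as a child process and reading its output. SWI-Prolog being installed on PATH, and the demo goal, are assumptions rather than details of the library.

```python
# Same child-process idea as the described library, sketched in Python instead of
# JavaScript: the GUI process spawns SWI-Prolog (`swipl`) and reads its answer.
import subprocess

def ask_prolog(goal: str) -> str:
    """Run one Prolog goal in a child process and return whatever it printed."""
    result = subprocess.run(
        ["swipl", "-q", "-g", goal, "-t", "halt"],   # quiet, run goal, then halt
        capture_output=True, text=True, timeout=10,
    )
    return result.stdout.strip()

print(ask_prolog("X is 6*7, format('~d~n', [X])"))   # -> 42
```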
{"title":"Implementation of a Graphic Interface Development Tool for Prolog","authors":"N. Ivaniuk, A. Kucher, Yury Yuschenko","doi":"10.18523/2617-3808.2021.4.108-112","DOIUrl":"https://doi.org/10.18523/2617-3808.2021.4.108-112","url":null,"abstract":"\u0000 \u0000 \u0000The work examines the current problems of the spread of use of logical programming in the development of commercial multi-platform software applications, tools for convenient development of a modern graphical interface to the logical programs. Libraries with similar concepts of use have been analyzed and described. The purpose of the proposed concept, which is implemented as an open source library, is described, and the advantages of the proposed tools over similar existing tools are indicated. The main feature and advantage of the proposed concept is the implementation of Prolog business logic and interface by means of JavaScript usage of child processes. The proposed concept of interface to Prolog takes full advantage of the possibilities provided by async await. A framework library has been created for the use of Logic Programming in graphical interface development without losses in the application performance. The paper describes the proposed concept and the developed framework (library). The ways to further improve the possibilities for expanding the purpose of the implemented library were identified. The directions of further simplification for programmers of integration of the graphic interface to logical programs have been defined. A significant advantage of the proposed tool is the easy-to-use functions to wrap and control the correctness of requests to the Prolog. The main goal of the library is to create an environment for the Prolog developers where they can create any type of software, which is meant to be user friendly, fast, and cross platform using modern and flexible. This concept also tries to solve disadvantages and architectural problems that were found in other libraries. The safety of library functionality has been analyzed. The concept of potential horizontal application scalability is described. Conclusions and future of libraries were introduced, in which the usage of TypeScript for type-safety and avoidance of run-time errors is mentioned. Overall, the library extends the use of Prolog beyond logical programming and takes a leap forward in its progress. \u0000 \u0000 \u0000","PeriodicalId":433538,"journal":{"name":"NaUKMA Research Papers. Computer Science","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130831146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Parallel SVD Algorithm for a Three-Diagonal Matrix on a Video Card Using the Nvidia CUDA Architecture
Pub Date : 2021-12-10 DOI: 10.18523/2617-3808.2021.4.16-22
Mykola Semylitko, G. Malaschonok
The SVD (Singular Value Decomposition) algorithm is used in recommendation systems, machine learning, image processing, and various other algorithms for working with matrices, which can be very large (Big Data); given the peculiarities of this algorithm, it can be executed on the large number of computing threads that only video cards provide. CUDA is a parallel computing platform and application programming interface model created by Nvidia. It allows software developers and software engineers to use a CUDA-enabled graphics processing unit for general-purpose processing, an approach termed GPGPU (general-purpose computing on graphics processing units). The GPU provides much higher instruction throughput and memory bandwidth than the CPU within a similar price and power envelope. Many applications leverage these higher capabilities to run faster on the GPU than on the CPU. Other computing devices, like FPGAs, are also very energy efficient, but they offer much less programming flexibility than GPUs. The developed modification uses the CUDA architecture, which is intended for a large number of simultaneous calculations and therefore allows very large matrices to be processed quickly. The parallel SVD algorithm for a three-diagonal (tridiagonal) matrix based on Givens rotations provides high computational accuracy. The algorithm also includes a number of memory and multiplication optimizations that can significantly reduce the computation time by discarding empty iterations. This article proposes an approach that reduces the computation time and, consequently, resources and costs. The developed algorithm can be used through a simple and convenient API in C++ and Java, and will be further improved by using dynamic parallelism or parallelization of multiplication operations. The obtained results can also be used by other developers for comparison, as all conditions of the research are described in detail and the code is freely available.
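As a minimal CPU reference for the decomposition the GPU kernels compute, the sketch below builds a three-diagonal matrix and factors it with NumPy; on a CUDA device, cupy.linalg.svd would be the drop-in GPU analogue (CuPy being available is an assumption, and this is not the authors’ implementation).

```python
# CPU reference for the decomposition the GPU kernels compute: build a
# three-diagonal (tridiagonal) matrix and factor it as A = U * diag(s) * Vt.
import numpy as np

rng = np.random.default_rng(0)
n = 6
main = rng.standard_normal(n)
off = rng.standard_normal(n - 1)
A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)   # symmetric tridiagonal

U, s, Vt = np.linalg.svd(A)
print(np.round(s, 4))                                    # singular values
print(np.allclose(U @ np.diag(s) @ Vt, A))               # -> True

# On a CUDA device, cupy.linalg.svd would be the drop-in replacement (assuming
# CuPy is installed), moving exactly this computation onto the video card.
```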
{"title":"Parallel SVD Algorithm for a Three-Diagonal Matrix on a Video Card Using the Nvidia CUDA Architecture","authors":"Mykola Semylitko, G. Malaschonok","doi":"10.18523/2617-3808.2021.4.16-22","DOIUrl":"https://doi.org/10.18523/2617-3808.2021.4.16-22","url":null,"abstract":"SVD (Singular Value Decomposition) algorithm is used in recommendation systems, machine learning, image processing, and in various algorithms for working with matrices which can be very large and Big Data, so, given the peculiarities of this algorithm, it can be performed on a large number of computing threads that have only video cards.CUDA is a parallel computing platform and application programming interface model created by Nvidia. It allows software developers and software engineers to use a CUDA-enabled graphics processing unit for general purpose processing – an approach termed GPGPU (general-purpose computing on graphics processing units). The GPU provides much higher instruction throughput and memory bandwidth than the CPU within a similar price and power envelope. Many applications leverage these higher capabilities to run faster on the GPU than on the CPU. Other computing devices, like FPGAs, are also very energy efficient, but they offer much less programming flexibility than GPUs.The developed modification uses the CUDA architecture, which is intended for a large number of simultaneous calculations, which allows to quickly process matrices of very large sizes. The algorithm of parallel SVD for a three-diagonal matrix based on the Givents rotation provides a high accuracy of calculations. Also the algorithm has a number of optimizations to work with memory and multiplication algorithms that can significantly reduce the computation time discarding empty iterations.This article proposes an approach that will reduce the computation time and, consequently, resources and costs. The developed algorithm can be used with the help of a simple and convenient API in C ++ and Java, as well as will be improved by using dynamic parallelism or parallelization of multiplication operations. Also the obtained results can be used by other developers for comparison, as all conditions of the research are described in detail, and the code is in free access.","PeriodicalId":433538,"journal":{"name":"NaUKMA Research Papers. Computer Science","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115422956","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1