CubeSat formation architecture for small space debris surveillance and orbit determination
Pub Date: 2021-09-13 | DOI: 10.31799/1684-8853-2021-4-37-46
A. Afanasev, S. Biktimirov
Introduction: Satellites observing space debris cannot track it throughout the whole orbit due to natural limitations of their optical sensors, such as field of view, Earth occultation, or solar illumination. Moreover, the time of continuous observation is usually very short. Therefore, we seek the most effective configuration of optical sensors for short-arc tracking of a target piece of debris, using a scalable Extended Information Filter. Purpose: Finding the best scenario for short-arc tracking of a space debris orbit using multipoint optical sensors. Results: We have found optimal configurations for groups of satellites with optical sensors moving along a sun-synchronous orbit. Debris orbit determination using an Extended Information Filter and measurements from multipoint sensors was simulated, and mean squared errors of the target's position were calculated. Based on the simulation results for various configurations, inter-satellite distances and measurement times, the most reliable scenario (four satellites in a tetrahedral configuration) was found and recommended for practical use in short-arc debris tracking.
{"title":"CubeSat formation architecture for small space debris surveillance and orbit determination","authors":"A. Afanasev, S. Biktimirov","doi":"10.31799/1684-8853-2021-4-37-46","DOIUrl":"https://doi.org/10.31799/1684-8853-2021-4-37-46","url":null,"abstract":"Introduction: Satellites which face space debris cannot track it throughout the whole orbit due to natural limitations of their optical sensors, sush as field of view, Earth occultation, or solar illumination. Besides, the time of continuous observations is usually very short. Therefore, we are trying to offer the most effective configuration of optical sensors in order to provide short-arc tracking of a target piece of debris, using a scalable Extended Information Filter. Purpose: The best scenario for short-arc tracking of a space debris orbit using multipoint optical sensors. Results: We have found optimal configurations for groups of satellites with optical sensors which move along a sun-synchronous orbit. Debris orbit determination using an Extended Information Filter and measurements from multipoint sensors was simulated, and mean squared errors of the target's position were calculated. Based on the simulation results for variouos configurations, inter-satellite distances and measurement time, the most reliable scenario (four satellites in tetrahedral configuration) was found and recommended for practical use in short-arc debris tracking.","PeriodicalId":36977,"journal":{"name":"Informatsionno-Upravliaiushchie Sistemy","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46844830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Evaluation of the union bound for the decoding error probability using characteristic functions
Pub Date: 2021-09-13 | DOI: 10.31799/1684-8853-2021-4-71-85
A. Trofimov, F. Taubin
Introduction: Since the exact value of the decoding error probability usually cannot be calculated, an upper bounding technique is used. The standard approach to obtaining an upper bound on the maximum likelihood decoding error probability is based on the union bound and the Chernoff bound, as well as its modifications. For many situations, this approach is not accurate enough. Purpose: Development of a method for exact calculation of the union bound on the decoding error probability, for a wide class of codes and memoryless channels. Methods: Characteristic functions of the logarithm of the likelihood ratio for an arbitrary pair of codewords, trellis representation of codes, and numerical integration. Results: The resulting exact union bound on the decoding error probability is based on a combination of characteristic functions and the product of trellis diagrams for the code, which yields the final expression in an integral form convenient for numerical integration. An important feature of the proposed procedure is that it allows one to calculate the union bound exactly using an approach based on transfer (generating) functions. With this approach, the edge labels in the product of trellis diagrams for the code are replaced by their corresponding characteristic functions. The final expression allows one, using standard methods of numerical integration, to calculate the union bound on the decoding error probability with the required accuracy. Practical relevance: The results presented in this article make it possible to significantly improve the accuracy of the bound on the decoding error probability, and thereby increase the efficiency of technical solutions in the design of specific coding schemes for a wide class of communication channels.
{"title":"Evaluation of the union bound for the decoding error probability using characteristic functions","authors":"A. Trofimov, F. Taubin","doi":"10.31799/1684-8853-2021-4-71-85","DOIUrl":"https://doi.org/10.31799/1684-8853-2021-4-71-85","url":null,"abstract":"Introduction: Since the exact value of a decoding error probability cannot usually be calculated, an upper bounding technique is used. The standard approach for obtaining the upper bound on the maximum likelihood decoding error probability is based on the use of the union bound and the Chernoff bound, as well as its modifications. For many situations, this approach is not accurate enough. Purpose: Development of a method for exact calculation of the union bound for a decoding error probability, for a wide class of codes and memoryless channels. Methods: Use of characteristic functions of logarithm of the likelihood ratio for an arbitrary pair of codewords, trellis representation of codes and numerical integration. Results: The resulting exact union bound on the decoding error probability is based on a combination of the use of characteristic functions and the product of trellis diagrams for the code, which allows to obtain the final expression in an integral form convenient for numerical integration. An important feature of the proposed procedure is that it allows one to accurately calculate the union bound using an approach based on the use of transfer (generating) functions. With this approach, the edge labels in the product of trellis diagrams for the code are replaced by their corresponding characteristic functions. The final expression allows, using the standard methods of numerical integration, to calculate the values of the union bound on the decoding error probability with the required accuracy. Practical relevance: The results presented in this article make it possible to significantly improve the accuracy of the bound of the error decoding probability, and thereby increase the efficiency of technical solutions in the design of specific coding schemes for a wide class of communication channels.","PeriodicalId":36977,"journal":{"name":"Informatsionno-Upravliaiushchie Sistemy","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45768883","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scenario model of intelligent decision support based on user digital life models
Pub Date: 2021-09-13 | DOI: 10.31799/1684-8853-2021-4-47-60
A. Smirnov, T. Levashova
Introduction: In the decision support domain, the practice of using information from user digital traces has not been widespread so far. Earlier, the authors of this paper developed a conceptual framework for intelligent decision support based on user digital life models, aimed at recommending decisions using information from user digital traces. The present research aims at developing a scenario model that implements this framework. Purpose: The development of a scenario model of intelligent decision support based on user digital life models, and an approach to grouping users with similar preferences and decision-making behaviours. Results: A scenario model of intelligent decision support based on user digital life models has been developed. The model is intended to recommend decisions to the user based on knowledge about the user's decision-maker type, the decision support problem, and the problem domain. The scenario model makes it possible to process incompletely formulated problems by taking into account the preferences of users whose preferences and decision-making behaviour are similar to those of the active user. An approach to grouping users with similar preferences and decision-making behaviours has been proposed. It groups such users based on information about user behavioural segments that exist in various domains, behavioural segmentation rules, and user actions represented in their digital life models. Practical relevance: The research results are beneficial for the development of advanced recommendation systems that track digital traces.
{"title":"Scenario model of intelligent decision support based user digital life models","authors":"A. Smirnov, T. Levashova","doi":"10.31799/1684-8853-2021-4-47-60","DOIUrl":"https://doi.org/10.31799/1684-8853-2021-4-47-60","url":null,"abstract":"Introduction. In the decision support domain, the practice of using information from user digital traces has not been widespread so far. Earlier, the authors of this paper developed a conceptual framework of intelligent decision support based on user digital life models that was aimed at recommending decisions using information from the user digital traces. The present research is aiming at the development of a scenario model that implements this framework. Purpose: the development of a scenario model of intelligent decision support based on user digital life models and an approach to grouping users with similar preferences and decision-making behaviours. Results: A scenario model of intelligent decision support based on user digital life models has been developed. The model is intended to recommend to the user decisions based on the knowledge about the user decision-maker type, decision support problem, and problem domain. The scenario model enables to process incompletely formulated problems due to taking into account the preferences of users who have preferences and decision-making behaviour similar to the active user. An approach to grouping users with similar preferences and decision-making behaviours has been proposed. The approach enables to group users with similar preferences and decision-making behaviours based on the information about user behavioural segments that exist in various domains, behavioural segmentation rules, and user actions represented in their digital life models. Practical relevance: the research results are beneficial for the development of advanced recommendation systems expected to tracking digital traces.","PeriodicalId":36977,"journal":{"name":"Informatsionno-Upravliaiushchie Sistemy","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44510686","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Training set formation in machine learning problems (review)
Pub Date: 2021-09-13 | DOI: 10.31799/1684-8853-2021-4-61-70
A. Parasich, V. Parasich, I. Parasich
Introduction: Proper training set formation is a key factor in machine learning. In real training sets, problems and errors commonly occur and have a critical impact on the training result. A training set needs to be formed in every machine learning problem; therefore, knowledge of the possible difficulties will be helpful. Purpose: An overview of possible problems in training set formation, in order to facilitate their detection and elimination when working with real training sets, and an analysis of the impact of these problems on the training results. Results: The article gives an overview of possible errors in training set formation, such as lack of data, imbalance, false patterns, sampling from a limited set of sources, change in the general population over time, and others. We discuss the influence of these errors on the training result, test set formation, and the measurement of training algorithm quality. Pseudo-labeling, data augmentation, and hard sample mining are considered the most effective ways to expand a training set. We offer practical recommendations for forming a training or test set. Examples from the practice of Kaggle competitions are given. For the problem of cross-dataset generalization in neural network training, we propose an algorithm called Cross-Dataset Machine, which is simple to implement and yields a gain in cross-dataset generalization. Practical relevance: The materials of the article can be used as a practical guide in solving machine learning problems.
{"title":"Training set formation in machine learning problems (review)","authors":"A. Parasich, V. Parasich, I. Parasich","doi":"10.31799/1684-8853-2021-4-61-70","DOIUrl":"https://doi.org/10.31799/1684-8853-2021-4-61-70","url":null,"abstract":"Introduction: Proper training set formation is a key factor in machine learning. In real training sets, problems and errors commonly occur, having a critical impact on the training result. Training set need to be formed in all machine learning problems; therefore, knowledge of possible difficulties will be helpful. Purpose: Overview of possible problems in the formation of a training set, in order to facilitate their detection and elimination when working with real training sets. Analyzing the impact of these problems on the results of the training. Results: The article makes on overview of possible errors in training set formation, such as lack of data, imbalance, false patterns, sampling from a limited set of sources, change in the general population over time, and others. We discuss the influence of these errors on the result of the training, test set formation, and training algorithm quality measurement. The pseudo-labeling, data augmentation, and hard samples mining are considered the most effective ways to expand a training set. We offer practical recommendations for the formation of a training or test set. Examples from the practice of Kaggle competitions are given. For the problem of cross-dataset generalization in neural network training, we propose an algorithm called Cross-Dataset Machine, which is simple to implement and allows you to get a gain in cross-dataset generalization. Practical relevance: The materials of the article can be used as a practical guide in solving machine learning problems.","PeriodicalId":36977,"journal":{"name":"Informatsionno-Upravliaiushchie Sistemy","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48391327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Finite field and group algorithms for orthogonal sequence search
Pub Date: 2021-09-13 | DOI: 10.31799/1684-8853-2021-4-2-17
N. A. Balonin, A. Sergeev, Olga Sinitshina
Introduction: Hadamard matrices consisting of elements 1 and –1 are an ideal object for a visual application of finite-dimensional mathematics operating with a finite number of addresses for the –1 elements. The notation systems of abstract algebra methods, in contrast to conventional matrix algebra, have been changing intensively without becoming widely spread, making it necessary to revise and systematize the accumulated experience. Purpose: To describe the algorithms of finite fields and groups in a uniform notation, in order to facilitate the perception of the extensive knowledge necessary for finding orthogonal and suborthogonal sequences. Results: Formulas have been proposed for calculating relatively little-known algorithms (or their versions) developed by Scarpis, Singer, Szekeres, Goethals — Seidel, and Noboru Ito, as well as polynomial equations used to prove the theorems about the existence of finite-dimensional solutions. This fills a significant gap both in the domestic literature (most of these issues are published here for the first time) and abroad. Practical relevance: Orthogonal sequences, and methods for finding them effectively via the theory of finite fields and groups, are of direct practical importance for noise-immune coding, compression, and masking of video data.
{"title":"Finite field and group algorithms for orthogonal sequence search","authors":"N. A. Balonin, A. Sergeev, Olga Sinitshina","doi":"10.31799/1684-8853-2021-4-2-17","DOIUrl":"https://doi.org/10.31799/1684-8853-2021-4-2-17","url":null,"abstract":"Introduction: Hadamard matrices consisting of elements 1 and –1 are an ideal object for a visual application of finite dimensional mathematics operating with a finite number of addresses for –1 elements. The notation systems of abstract algebra methods, in contrast to the conventional matrix algebra, have been changing intensively, without being widely spread, leading to the necessity to revise and systematize the accumulated experience. Purpose: To describe the algorithms of finite fields and groups in a uniform notation in order to facilitate the perception of the extensive knowledge necessary for finding orthogonal and suborthogonal sequences. Results: Formulas have been proposed for calculating relatively unknown algorithms (or their versions) developed by Scarpis, Singer, Szekeres, Goethal — Seidel, and Noboru Ito, as well as polynomial equations used to prove the theorems about the existence of finite-dimensional solutions. This replenished the significant lack of information both in the domestic literature (most of these issues are published here for the first time) and abroad. Practical relevance: Orthogonal sequences and methods for their effective finding via the theory of finite fields and groups are of direct practical importance for noise-immune coding, compression and masking of video data.","PeriodicalId":36977,"journal":{"name":"Informatsionno-Upravliaiushchie Sistemy","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43660461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dataset segmentation considering the information about impact factors
Pub Date: 2021-06-29 | DOI: 10.31799/1684-8853-2021-3-29-38
I. Lebedev
Introduction: The application of machine learning methods involves the collection and processing of data which comes from recording elements in the offline mode. Most models are trained on historical data and then used for forecasting, classification, search for influencing factors or impacts, and state analysis. In the long run, the data value ranges can change, affecting the quality of classification algorithms and leading to a situation where the models have to be constantly retrained or readjusted to the input data. Purpose: Development of a technique to improve the quality of machine learning algorithms in a dynamically changing and non-stationary environment where the data distribution can change over time. Methods: Splitting (segmentation) of a data set based on information about factors affecting the ranges of the target variables. Results: A data segmentation technique has been proposed, based on taking into account the factors which affect the change in the data value ranges. Impact detection makes it possible to form samples based on the current and anticipated situations. Using the PowerSupply dataset as an example, the data set is split into subsets considering the effects of factors on the value ranges. The external factors and impacts are formalized with production rules. The processing of the factors using a membership function (indicator function) is shown. The data sample is divided into a finite number of non-intersecting measurable subsets. Experimental values of the neural network loss function are shown for the proposed technique on the selected dataset. Quality indicators (Accuracy, AUC, F-measure) of the classification for various classifiers are presented. Practical relevance: The results can be used in the development of classification models for machine learning methods. The proposed technique can improve classification quality under dynamically changing operating conditions.
{"title":"Dataset segmentation considering the information about impact factors","authors":"I. Lebedev","doi":"10.31799/1684-8853-2021-3-29-38","DOIUrl":"https://doi.org/10.31799/1684-8853-2021-3-29-38","url":null,"abstract":"Introduction: The application of machine learning methods involves the collection and processing of data which comes from the recording elements in the offline mode. Most models are trained on historical data and then used in forecasting, classification, search for influencing factors or impacts, and state analysis. In the long run, the data value ranges can change, affecting the quality of the classification algorithms and leading to the situation when the models should be constantly trained or readjusted taking into account the input data. Purpose: Development of a technique to improve the quality of machine learning algorithms in a dynamically changing and non-stationary environment where the data distribution can change over time. Methods: Splitting (segmentation) of multiple data based on the information about factors affecting the ranges of target variables. Results: A data segmentation technique has been proposed, based on taking into account the factors which affect the change in the data value ranges. Impact detection makes it possible to form samples based on the current and alleged situations. Using PowerSupply dataset as an example, the mass of data is split into subsets considering the effects of factors on the value ranges. The external factors and impacts are formalized based on production rules. The processing of the factors using the membership function (indicator function) is shown. The data sample is divided into a finite number of non-intersecting measurable subsets. Experimental values of the neural network loss function are shown for the proposed technique on the selected dataset. Qualitative indicators (Accuracy, AUC, F-measure) of the classification for various classifiers are presented. Practical relevance: The results can be used in the development of classification models of machine learning methods. The proposed technique can improve the classification quality in dynamically changing conditions of the functioning.","PeriodicalId":36977,"journal":{"name":"Informatsionno-Upravliaiushchie Sistemy","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46294284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Suppression of correlated interference by adaptive notch filters under pulse repetition period modulation
Pub Date: 2021-06-29 | DOI: 10.31799/1684-8853-2021-3-53-60
S. Ziatdinov, L. Osipov
Introduction: We discuss the problem of correlated noise suppression by adaptive complex notch filters of various orders. In order to eliminate the dependence of the transmission coefficient of the useful signal on its frequency, the pulse repetition period is modulated. Purpose: Studying the influence of pulse repetition period modulation on the correlated noise suppression coefficient. Methods: The notch filter parameters were optimized by the criterion of minimum average variance of correlated noise at the output of the filters during the repetition period modulation. Results: Expressions are obtained for the variance of correlated noise at the output of complex adaptive filters of various orders when the repetition period is modulated. Relationships are given for finding the optimal values of the tuning frequency and the coefficients of the notch filters which minimize the correlated noise level at their output. Expressions are obtained for the coefficients of correlated noise suppression by notch filters under pulse repetition period modulation. Graphs are presented showing how the correlated noise suppression coefficient depends on the relative deviation of the probing signal repetition period for various widths of the correlated noise spectral density, at optimal or non-optimal values of the tuning frequency and coefficients of the notch filters. It is shown that the use of probing pulse repetition period modulation leads to a decrease in the correlated noise suppression coefficient. On the other hand, the adaptation of the weighting coefficients for the adopted models of notch filters and correlated interference provides an increase in the suppression coefficient. Practical relevance: When developing or studying correlated noise suppression systems, the obtained results make it possible, taking into account the permissible losses of the suppression coefficient, to reasonably choose the deviation of the input pulse repetition period in order to eliminate the effect of “blind” frequencies.
{"title":"Suppression of correlated interference by adaptive notch filters under pulse repetition period modulation","authors":"S. Ziatdinov, L. Osipov","doi":"10.31799/1684-8853-2021-3-53-60","DOIUrl":"https://doi.org/10.31799/1684-8853-2021-3-53-60","url":null,"abstract":" Introduction: We discuss the problem of correlated noise suppression by adaptive complex notch filters of various orders. In order to eliminate the dependence of the transmission coefficient of the useful signal on its frequency, the pulse repetition period is modulated. Purpose: Studying the influence of pulse repetition period modulation on the correlated noise suppression coefficient. Methods: The notch filter parameters were optimized with the criterion of minimum average dispersion of correlated noise at the output of the filters during the repetition period modulation. Results: Expressions are obtained for the variance of correlated noise at the output of complex adaptive filters of various orders when the repetition period is modulated. Relationships are given for finding the optimal values of the tuning frequency and coefficients of the notch filters which minimize the correlated noise level at their output. Expressions are obtained for the coefficients of correlated noise suppression by notch filters in the context of pulse repetition period modulation. The graphs are presented showing how the correlated noise suppression coefficient depends on the relative value of the probing signal repetition period deviation for various values of the correlated noise spectral density width at optimal or non-optimal values of the tuning frequency and coefficients of the notch filters. It is shown that the use of probing pulse repetition period modulation leads to a decrease in the correlated noise suppression coefficient. On the other hand, the adaptation of the weighting coefficients for the adopted models of notch filters and correlated interference provides an increase in the suppression coefficient. Practical relevance: When developing or studying correlated noise suppression systems, the obtained results make it possible, taking into account the permissible losses of the suppression coefficient, to reasonably choose the input pulse repetition period deviation value in order to eliminate the effect of “blind” frequencies.","PeriodicalId":36977,"journal":{"name":"Informatsionno-Upravliaiushchie Sistemy","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46937385","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Information-theoretic problems of DNA-based storage systems
Pub Date: 2021-06-29 | DOI: 10.31799/1684-8853-2021-3-39-52
S. Kruglik, G. Kucherov, Kamilla Nazirkhanova, Mikhail Filitov
Introduction: Currently, we witness an explosive growth in the amount of information produced by humanity. This raises new fundamental problems of its efficient storage and processing. Commonly used magnetic, optical, and semiconductor information storage devices have several drawbacks related to low information density and limited durability. One of the promising novel approaches to solving these problems is DNA-based data storage. Purpose: An overview of modern DNA-based storage systems and related information-theoretic problems. Results: The current state of the art of DNA-based storage systems is reviewed. The types of errors occurring in them, as well as the corresponding error-correcting codes, are analyzed. The disadvantages of these codes are shown, and possible pathways for improvement are mentioned. Proposed information-theoretic models of DNA-based storage systems are analyzed, and their limitations are highlighted. In conclusion, the main obstacles to the practical implementation of DNA-based storage systems are formulated, which can potentially be overcome using the information-theoretic methods considered in this overview.
{"title":"Information-theoretic problems of DNA-based storage systems","authors":"S. Kruglik, G. Kucherov, Kamilla Nazirkhanova, Mikhail Filitov","doi":"10.31799/1684-8853-2021-3-39-52","DOIUrl":"https://doi.org/10.31799/1684-8853-2021-3-39-52","url":null,"abstract":"Introduction: Currently, we witness an explosive growth in the amount of information produced by humanity. This raises new fundamental problems of its efficient storage and processing. Commonly used magnetic, optical, and semiconductor information storage devices have several drawbacks related to small information density and limited durability. One of the promising novel approaches to solving these problems is DNA-based data storage. Purpose: An overview of modern DNA-based storage systems and related information-theoretic problems. Results: The current state of the art of DNA-based storage systems is reviewed. Types of errors occurring in them as well as corresponding error-correcting codes are analized. The disadvantages of these codes are shown, and possible pathways for improvement are mentioned. Proposed information-theoretic models of DNA-based storage systems are analyzed, and their limitation highlighted. In conclusion, main obstacles to practical implementation of DNA-based storage systems are formulated, which can be potentially overcome using information-theoretic methods considered in this overview.","PeriodicalId":36977,"journal":{"name":"Informatsionno-Upravliaiushchie Sistemy","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49136482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Coding and robustness of signal processing in streaming recurrent neural networks
Pub Date: 2021-06-29 | DOI: 10.31799/1684-8853-2021-3-9-18
V. Osipov, Viktor Nikiforov
Introduction: When substantiating promising architectures of streaming recurrent neural networks, it becomes necessary to assess their stability in processing various input signals. For this, stability diagrams are constructed, containing the results of simulation for each node of the diagram. Such an estimation can be time-consuming and computationally intensive, especially when analyzing large neural networks. Purpose: Searching for methods to quickly construct such diagrams and assess the stability of streaming recurrent neural networks. Results: Analysis of the features of the stability diagrams under study showed that the nodes of the diagrams are grouped into continuous zones with the same ratio characteristics of input signal processing defects. With this in mind, the article proposes a method for constructing these diagrams based on bypassing the boundaries of their zones. With this approach, there is no need to perform simulation for the interior nodes of each zone; simulation is performed only for the nodes adjacent to zone boundaries. Due to this, the number of nodes for which simulation sessions need to be performed is reduced by an order of magnitude. The influence of the input signal coding type on the stability of a streaming recurrent neural network has been investigated. It is shown that representing input signals as sequences of single pulses with intersecting elements can provide greater stability compared to pulses without any intersection.
{"title":"Coding and robustness of signal processing in streaming recurrent neural networks","authors":"V. Osipov, Viktor Nikiforov","doi":"10.31799/1684-8853-2021-3-9-18","DOIUrl":"https://doi.org/10.31799/1684-8853-2021-3-9-18","url":null,"abstract":"Introduction: When substantiating promising architectures of streaming recurrent neural networks, it becomes necessary to assess their stability in processing various input signals. For this, stability diagrams are constructed containing the results of simulation for each of the nodes of these diagrams. Such an estimation can be time-consuming and computationally intensive, especially when analyzing large neural networks. Purpose: Search for methods of quick construction of such diagrams and assessing the stability of streaming recurrent neural networks. Results: Analysis of the features of the stability diagrams under study showed that the nodes of the diagrams are grouped into continuous zones with the same ratio characteristics of the input signal processing defects. With this in mind, the article proposes a method for constructing these diagrams based on bypassing the boundaries of their zones. With this approach, you do not have to perform simulation for the interior nodes of each zone. The simulation should be performed only for the nodes adjacent to zone boundaries. Due to this, the number of nodes for which you need to perform simulation sessions is reduced by an order of magnitude. The influence of the input signal coding types on the streaming recurrent neural network stability has been investigated. It is shown that the representation of input signals in the form of sequences of single pulses with intersecting elements can provide greater stability as compared to pulses without any intersection.","PeriodicalId":36977,"journal":{"name":"Informatsionno-Upravliaiushchie Sistemy","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49346393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Interpretation of a trained neural network based on genetic algorithms
Pub Date: 2020-12-15 | DOI: 10.31799/1684-8853-2020-6-12-20
V. Pimenov, I. Pimenov
Introduction: Artificial intelligence development strategy involves the use of deep machine learning algorithms in order to solve various problems. Neural network models trained on specific data sets are difficult to interpret, due to the “black box” approach in which knowledge is formed as a set of interneuronal connection weights. Purpose: Development of a discrete knowledge model which explicitly represents the information processing patterns encoded by connections between neurons. Methods: Adaptive quantization of the feature space using a genetic algorithm, and construction of a discrete model as a multidimensional OLAP cube with binary measures. Results: A genetic algorithm extracts a discrete knowledge carrier from a trained neural network. An individual's chromosome encodes a combination of values of all quantization levels for the measurable object properties. The head gene group defines the feature space structure, while the other genes are responsible for setting up the quantization of the multidimensional space, each gene being responsible for one quantization threshold of a given variable. A discrete model in the form of a multidimensional OLAP cube with binary measures explicitly represents the relationships between combinations of object feature values and classes. Practical relevance: For neural network prediction models based on a training sample, genetic algorithms make it possible to evaluate the effective volume of the feature space for combinations of input feature values not represented in the training sample, whose size is usually limited. The proposed discrete model builds a unique image of each class based on rectangular maps which use a mesh structure of gradations. The maps reflect the most significant integral indicators of the classes, determining the location and size of a class in the multidimensional space. Based on a convolution of the constructed class images, a complete system of production decision rules is recorded for the preset feature gradations.
{"title":"Interpretation of a trained neural network based on genetic algorithms","authors":"V. Pimenov, I. Pimenov","doi":"10.31799/1684-8853-2020-6-12-20","DOIUrl":"https://doi.org/10.31799/1684-8853-2020-6-12-20","url":null,"abstract":"Introduction: Artificial intelligence development strategy involves the use of deep machine learning algorithms in order to solve various problems. Neural network models trained on specific data sets are difficult to interpret, which is due to the “black box” approach when knowledge is formed as a set of interneuronal connection weights. Purpose: Development of a discrete knowledge model which explicitly represents information processing patterns encoded by connections between neurons. Methods: Adaptive quantization of a feature space using a genetic algorithm, and construction of a discrete model for a multidimensional OLAP cube with binary measures. Results: A genetic algorithm extracts a discrete knowledge carrier from a trained neural network. An individual's chromosome encodes a combination of values of all quantization levels for the measurable object properties. The head gene group defines the feature space structure, while the other genes are responsible for setting up the quantization of a multidimensional space, where each gene is responsible for one quantization threshold for a given variable. A discrete model of a multidimensional OLAP cube with binary measures explicitly represents the relationships between combinations of object feature values and classes. Practical relevance: For neural network prediction models based on a training sample, genetic algorithms make it possible to find the effective value of the feature space volume for the combinations of input feature values not represented in the training sample whose volume is usually limited. The proposed discrete model builds unique images of each class based on rectangular maps which use a mesh structure of gradations. The maps reflect the most significant integral indicators of classes that determine the location and size of a class in a multidimensional space. Based on a convolution of the constructed class images, a complete system of production decision rules is recorded for the preset feature gradations.","PeriodicalId":36977,"journal":{"name":"Informatsionno-Upravliaiushchie Sistemy","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45258020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}