Pub Date: 2026-03-02 | DOI: 10.1134/S1064562425700292
S. N. Koltcov, V. V. Ignatenko, A. Yu. Surkov, V. O. Zakharov
This study investigates the capability of small reasoning-oriented language models to construct analytical solutions to differential equations. Computational experiments are conducted on DeepSeek-R1-Distill-Qwen-1.5B, Qwen2.5-1.5B, and Open-Reasoner-Zero-1.5B. To extract the final answers from the models' reasoning traces, postprocessing is applied using two additional language models, Qwen2.5:latest and Llama3.2:latest. The extracted solutions are then compared with reference solutions using the BLEU metric. Our results demonstrate that, on average, Open-Reasoner-Zero-1.5B performs best, reaching the highest BLEU score (0.978) on second-order homogeneous equations.
Title: "Solving Differential Equations with Pretrained Out-of-the-Box Models: The Potential of Small-Scale LLMs" (Doklady Mathematics 112(1), 273–278)
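As a concrete illustration of the evaluation step, here is a minimal sentence-level BLEU in pure Python (modified n-gram precisions, geometric mean, brevity penalty). This is a generic sketch of the metric, not the authors' evaluation code; the tokenization and smoothing they used are not specified in the abstract.

```python
import math
from collections import Counter

def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Sentence-level BLEU: geometric mean of modified n-gram
    precisions, times a brevity penalty for short candidates."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        # clip each candidate n-gram count by its count in the reference
        overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
        precisions.append(overlap / max(sum(cand_ngrams.values()), 1))
    if min(precisions) == 0.0:
        return 0.0  # no smoothing: an empty n-gram overlap zeroes the score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(log_mean)

# identical solution strings score 1.0
print(bleu("y = C1 e^x + C2 e^-x", "y = C1 e^x + C2 e^-x"))  # → 1.0
```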
Pub Date: 2026-03-02 | DOI: 10.1134/S1064562425700255
A. K. Gorshenin, A. M. Dostovalova
The article presents MMRFiGN, a novel ensemble graph neural network model informed by multicomponent Markov random fields and designed to improve object segmentation quality in high-resolution images on imbalanced and volatile datasets. Its key component is a specially designed two-branch block of graph convolutions that simultaneously processes local and global image features over multiscale image partitions, using a multicomponent Markov model to reconstruct spatial relationships between features. A theorem on the faster decrease of the loss function for the multicomponent graph architecture is proven, implying faster training compared to graph and convolutional models of comparable size. MMRFiGN was tested on segmenting images collected by unmanned aerial vehicles over heterogeneous urban landscapes (the open UAVid and UDD datasets: Ultra HD 4K resolution, with a pronounced imbalance of object classes). MMRFiGN outperforms modern convolutional architectures (DeepLabV3, ENet) as well as transformers (SegFormer and the 2025 SOTA model LWGANet) in recognizing both large objects (buildings, roads) and small objects of various scales (cars): against the former, the F1-score gain reaches 25.04% (on average, up to 12.08%); against the latter, 14.87% (on average, up to 11.52%). MMRFiGN also outperforms alternative ensemble implementations based on graph architectures with attention by up to 20.97%. At the same time, MMRFiGN has fewer parameters than the baseline networks, allowing a reduction by a factor of up to 1.78.
Title: "MMRFiGN: An Ensemble Graph Segmentation Model for Imbalanced High-Resolution Images Informed by Multicomponent Markov Random Fields" (Doklady Mathematics 112(1), 308–318)
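Since the reported gains are in the F1-score, a minimal per-class F1 over flattened label maps shows why the metric suits imbalanced segmentation: each class, however rare, contributes its own score. This is a generic sketch, not the paper's evaluation code.

```python
def per_class_f1(pred, target, num_classes):
    """Per-class F1 = 2*TP / (2*TP + FP + FN) over flattened label maps.
    Rare classes (e.g. cars) weigh as much as dominant ones (buildings,
    roads), which is what makes F1 informative under class imbalance."""
    scores = []
    for c in range(num_classes):
        tp = sum(p == c and t == c for p, t in zip(pred, target))
        fp = sum(p == c and t != c for p, t in zip(pred, target))
        fn = sum(p != c and t == c for p, t in zip(pred, target))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores

# toy 5-pixel example with three classes
print(per_class_f1([0, 0, 1, 1, 2], [0, 0, 1, 2, 2], 3))
```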
Pub Date: 2026-03-02 | DOI: 10.1134/S1064562425700322
M. M. Tikhomirov, D. I. Chernyshev
Multilingual large language models (LLMs) often exhibit degraded performance in languages other than English due to the imbalance in their training data. Directly adapting these models to a new language, such as Russian, risks catastrophic forgetting of their original capabilities and demands significant computational resources. The article introduces Ruadapt, a comprehensive and computationally efficient methodology for language adaptation of LLMs featuring tokenizer replacement. A full adaptation of a single Qwen3-8B model version with our methodology requires fewer than 2000 GPU-hours, while subsequent adaptations of other versions are up to ten times less resource-intensive thanks to the modular nature of the procedure's steps. An optimal configuration achieves up to an 80% speed-up in generation, with full preservation of long-context capabilities and only minor degradation in instruction following. We conduct a detailed empirical study of each adaptation step to identify optimal hyperparameters and to assess the impact of each key stage on final quality. The resulting guidelines are implemented in the current generation of Ruadapt models, such as RuadaptQwen3-32B-Hybrid. We are open-sourcing our models, code, and datasets to provide the research community with a validated and cost-effective strategy for developing high-quality, language-specific models.
Title: "Ruadapt: Cost-Effective Large Language Model Lingual Adaptation" (Doklady Mathematics 112(1), 246–254)
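Tokenizer replacement requires re-initializing the embedding table for the new vocabulary. A common warm-start, sketched below, is to average the old-tokenizer sub-piece embeddings of each new token; the abstract does not state that Ruadapt uses exactly this scheme, and `old_encode` and `old_emb` are illustrative stand-ins for the original tokenizer and embedding table.

```python
def init_new_embedding(token_str, old_encode, old_emb, dim):
    """Warm-start the embedding of a new-vocabulary token as the mean of
    the old-tokenizer sub-piece embeddings. `old_encode` maps a string to
    old-vocabulary ids; `old_emb` maps an id to its vector. Both are
    hypothetical interfaces used only for illustration."""
    piece_ids = old_encode(token_str)
    if not piece_ids:          # token unseen by the old tokenizer
        return [0.0] * dim
    vecs = [old_emb[i] for i in piece_ids]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

# toy example: the new token splits into old pieces 0 and 1
old_emb = {0: [1.0, 2.0], 1: [3.0, 4.0]}
print(init_new_embedding("привет", lambda s: [0, 1], old_emb, 2))  # → [2.0, 3.0]
```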
Pub Date: 2026-03-02 | DOI: 10.1134/S1064562425700231
R. Yu. Epifanov, Ya. V. Fedotova, D. R. Popov, R. I. Mulliyazhanov
We propose a universal neural network architecture for single-stage, multi-class generation of polygonal models of anatomical structures from three-dimensional medical images. The key component of the architecture is a trainable affine module that dynamically positions and scales the initial meshes of anatomical structures. This eliminates the need for manual template preparation and reduces the number of self-intersections in the resulting meshes. The effectiveness of the proposed approach is confirmed on the CHAOS and MMWHS datasets. On CHAOS, an average Dice score of 0.958 is achieved with an ASSD of 1.399 mm, and self-intersections are observed in only 2 of 20 generated surfaces. On MMWHS, the average Dice score across heart structures is approximately 0.9, and the proportion of self-intersecting edges is comparable to or lower than that of the best available methods. Overall, the results demonstrate accuracy on par with modern standards while producing meshes with significantly cleaner topology. Ablation analysis also confirms the importance of the affine module for generating topologically correct polygonal models.
Title: "Multi-Class Surface Generation of Complex Anatomical Structures Using Neural Networks" (Doklady Mathematics 112(1), 332–341)
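The trainable affine module can be pictured as predicting, per anatomical structure, a scale and translation that place a template mesh inside the volume before any deformation. A minimal sketch (parameter names are illustrative, not the authors' API):

```python
def place_template(vertices, scale, translation):
    """Apply a per-structure affine placement (anisotropic scale plus
    translation) to template mesh vertices; in the described architecture
    these parameters are predicted by a trainable module rather than
    fixed by hand."""
    return [[s * x + t for x, s, t in zip(v, scale, translation)]
            for v in vertices]

# a unit-cube corner scaled by 2 and shifted along x
print(place_template([[1.0, 1.0, 1.0]], [2.0, 2.0, 2.0], [0.5, 0.0, 0.0]))
```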
Pub Date: 2026-03-02 | DOI: 10.1134/S1064562425700334
A. M. Trunova, D. A. Yudin
Forecasting the future state of a scene is a key computer vision task needed to build systems capable of proactive perception and decision-making in changing environments. This work addresses the problem of forecasting future scene graphs, where, given a video and a sequence of past graphs, one must predict objects and their relations in subsequent frames. Unlike existing approaches limited to static perception, the proposed method, GraphCast, takes into account semantic vision-language features of objects and their temporal dynamics. We introduce a model architecture based on object-centric encoding with a foundation transformer model, interaction modeling via a biaffine relation classification head, and a specialized object presence classifier. In addition, a temporal convolution module is used to extract features and improve robustness to noise. Experiments on the STAR and Action Genome datasets demonstrate that the proposed architecture outperforms existing baselines.
Title: "Scene Graph Forecasting Using Neural Network-Based Methods" (Doklady Mathematics 112(1), 239–245)
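A biaffine relation head scores an object pair (i, j) as s = h_i^T U h_j + W · [h_i; h_j] + b. The single-relation-class sketch below illustrates the form; the actual GraphCast head (feature dimensions, number of relation classes) is not specified in the abstract.

```python
def biaffine_score(h_i, h_j, U, W, b):
    """Biaffine score for one relation class between objects i and j:
    s = h_i^T U h_j + W · [h_i; h_j] + b.  A full head keeps one
    (U, W, b) triple per relation class; dimensions here are illustrative."""
    bilinear = sum(h_i[a] * U[a][c] * h_j[c]
                   for a in range(len(h_i)) for c in range(len(h_j)))
    linear = sum(w * x for w, x in zip(W, h_i + h_j))
    return bilinear + linear + b

# tiny 2-dimensional example
print(biaffine_score([1.0, 0.0], [0.0, 1.0],
                     [[0.0, 2.0], [0.0, 0.0]], [0.0] * 4, 0.5))  # → 2.5
```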
Pub Date: 2026-03-02 | DOI: 10.1134/S1064562425700243
A. I. Burykina, D. R. Ledneva, D. P. Kuznetsov
We present JDCEmb, a new framework for training universal vector representations for task-oriented dialogue. Text encoders play a crucial role in such systems, and their quality largely determines overall effectiveness. Modern approaches to training dialogue encoders often rely on contrastive methods, which improve the distinguishability of representations but are sensitive to the selection of positive and negative pairs; this can lead to the loss of important semantic information. Knowledge distillation-based methods, on the other hand, transfer more context but struggle to distinguish similar utterances and perform poorly with subtle semantic differences.
JDCEmb combines the strengths of both approaches using a teacher–student architecture in which the student model is trained contrastively while simultaneously being aligned with the teacher model's vector representations. This combination makes it possible to maintain semantic richness while enhancing the distinctiveness of vector representations, which is crucial for dialogue systems. Experimental results on key dialogue tasks demonstrate the effectiveness of the approach: JDCEmb consistently reaches or surpasses the state of the art, outperforming strong current baselines.
Title: "JDCEmb: Joint Distillation and Contrastive Learning for Embeddings in Task-Oriented Dialogue Systems" (Doklady Mathematics 112(1), 319–331)
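The joint objective can be sketched as a weighted sum of a distillation term (pull the student embedding toward the teacher's) and a contrastive term (keep a positive utterance closer than a negative one). The margin formulation and the `alpha` and `margin` values below are illustrative assumptions, not the paper's exact loss.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def joint_loss(student, teacher, positive, negative, alpha=0.5, margin=0.2):
    """Toy joint objective: a distillation term aligns the student
    embedding with the teacher's, and a margin contrastive term separates
    the positive utterance from the negative one. `alpha` and `margin`
    are illustrative, not taken from the paper."""
    distill = 1.0 - cosine(student, teacher)
    contrast = max(0.0, margin - cosine(student, positive) + cosine(student, negative))
    return alpha * distill + (1.0 - alpha) * contrast

# a perfectly aligned student: both terms vanish
print(joint_loss([1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]))  # → 0.0
```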
Pub Date: 2026-03-02 | DOI: 10.1134/S1064562425700280
S. R. Kirpichenko, A. V. Konstantinov, L. V. Utkin
The paper presents FoCAT (Foundation Causal Adaptive Transformer), a novel foundation model for estimating the conditional treatment effect. The model addresses several key challenges inherent in causal inference, including a limited sample size in the treatment group, the impossibility of simultaneously observing patient outcomes with and without intervention, and difficulties in testing models on real data. FoCAT employs a hypernetwork architecture. Unlike existing approaches that predict separate outcome functions for the control and treatment groups, FoCAT estimates the conditional treatment effect directly. The model allows the informativeness of the context to be controlled through specialized classification tokens. Numerical experiments on synthetic and real-world datasets demonstrate the superiority of FoCAT in estimating the treatment effect. The code implementing FoCAT is publicly available.
Title: "FoCAT: Foundation Model for Estimating the Conditional Average Treatment Effect" (Doklady Mathematics 112(1), 279–287)
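The estimand itself, tau(x) = E[Y | T = 1, x] - E[Y | T = 0, x], can be illustrated with a naive stratified difference-in-means. This is a sketch of the quantity FoCAT predicts directly, not of the model.

```python
def cate_estimate(rows, x_key):
    """Naive stratified estimate of the conditional treatment effect:
    tau(x) = mean(Y | T=1, x) - mean(Y | T=0, x) within each covariate
    stratum. Illustrates the estimand only; FoCAT predicts it with a
    hypernetwork rather than by stratification."""
    strata = {}
    for r in rows:
        strata.setdefault(r[x_key], {0: [], 1: []})[r["t"]].append(r["y"])
    return {x: sum(g[1]) / len(g[1]) - sum(g[0]) / len(g[0])
            for x, g in strata.items() if g[0] and g[1]}

# toy data: covariate x, treatment flag t, outcome y
rows = [
    {"x": 0, "t": 0, "y": 1.0}, {"x": 0, "t": 0, "y": 1.0}, {"x": 0, "t": 1, "y": 3.0},
    {"x": 1, "t": 0, "y": 2.0}, {"x": 1, "t": 1, "y": 2.0},
]
print(cate_estimate(rows, "x"))  # → {0: 2.0, 1: 0.0}
```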
Pub Date: 2026-03-02 | DOI: 10.1134/S1064562425700309
A. E. Marusov, A. A. Zaytsev
Obtaining informative object representations involves training a model, called an encoder, that constructs compressed, informative representations of the signals it receives as input. One approach to this problem is self-supervised learning (SSL). An advantage of SSL methods lies in using only unlabeled data, which is significantly more abundant than labeled data. Among SSL methods, contrastive approaches are particularly prominent; these bring representations of semantically similar objects (positive pairs) closer together and push representations of different signals (negative pairs) apart. Many modern contrastive SSL methods for dependent data (where elements within a sample are semantically related) employ a loss function originally designed for independent data. In this work, we propose a theoretically justified approach for selecting a loss function suited to continuous dependent data, i.e., data in which neighboring elements within a sample can be considered a positive pair. The analysis introduces various ways to model similarity between objects and the corresponding loss functions, explicitly accounting for correlations between objects. To empirically assess the effectiveness of the proposed loss functions, we focus on temperature and drought forecasting tasks, which involve continuous dependent data. The results demonstrate that our model, combined with the proposed loss functions, outperforms approaches based on the assumption of semantic independence, i.e., that all elements of a sample are semantically unrelated. These findings highlight the importance of accounting for such dependencies when developing high-quality encoders.
Title: "Theoretically Justified Contrastive Self-Supervised Methods for Continuous Dependent Data" (Doklady Mathematics 112(1), 263–272)
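The notion of neighbors as positive pairs can be made concrete with a toy sampler: indices within a small window of each other form positives, and all more distant index pairs supply negatives. The `window` parameter is an illustrative assumption; the paper's actual pair construction may differ.

```python
def neighbor_pairs(series, window=1):
    """Treat elements within `window` steps of each other as positive
    pairs and all more distant index pairs as negatives, reflecting the
    assumption that neighboring elements of a continuous dependent
    sequence are semantically similar."""
    positives, negatives = [], []
    for i in range(len(series)):
        for j in range(i + 1, len(series)):
            (positives if j - i <= window else negatives).append((i, j))
    return positives, negatives

# e.g. four consecutive daily temperatures
pos, neg = neighbor_pairs([14.2, 14.5, 15.1, 15.0])
print(pos)  # → [(0, 1), (1, 2), (2, 3)]
```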
Pub Date: 2026-03-02 | DOI: 10.1134/S1064562425700279
D. A. Grigoriev, D. I. Chernyshev
In light of the growing interest in using large language models (LLMs) as tools for generating scientific texts, evaluating their ability to produce encyclopedic content is becoming increasingly relevant. However, for Russian-language materials this issue has not been sufficiently studied, and existing benchmarks do not cover key aspects of analytical work with sources. This article presents RuWikiBench, an open benchmark based on Ruwiki for evaluating the ability of LLMs to reproduce Wikipedia-style articles, built around three tasks: selection of relevant sources, article structuring, and section generation. Testing popular open-source LLMs shows that even under ideal conditions the best models do not always follow the expert logic of composing encyclopedic content: even with a perfect source retrieval system, the models cannot reproduce the reference table of contents, and section-generation quality shows almost no dependence on parameter count.
Title: "RuWikiBench: Evaluating Large Language Models Through Replication of Encyclopedia Articles" (Doklady Mathematics 112(1), 299–307)
Pub Date: 2026-03-02 | DOI: 10.1134/S1064562425700310
A. A. Onorpienko
We study the algorithmic complexity of the cooperative card game Hanabi. A feature of Hanabi is that players can see other players’ cards, but not their own, and exchange information through hints. Even in the model with one player who has full information about the deck, Hanabi remains NP-hard. We found the minimal parameters of the game that preserve NP-hardness. If these parameters are further reduced, the game turns out to be solvable in polynomial time.
Title: "NP-Completeness of Hanabi Game with Minimal Parameters" (Doklady Mathematics 112(1), 255–262)