
Latest publications in Doklady Mathematics

Optimal Data Splitting in Distributed Optimization for Machine Learning
IF 0.6 | CAS Zone 4, Mathematics | Q3 Mathematics | Pub Date: 2024-03-25 | DOI: 10.1134/s1064562423701600
D. Medyakov, G. Molodtsov, A. Beznosikov, A. Gasnikov

Abstract

The distributed optimization problem has become increasingly relevant recently. It has many advantages, such as processing a large amount of data in less time than non-distributed methods. However, most distributed approaches suffer from a significant bottleneck: the cost of communication. Therefore, a large amount of research has recently been directed at solving this problem. One such approach uses local data similarity. In particular, there exists an algorithm that provably exploits the similarity property optimally. But this result, like results from other works, addresses the communication bottleneck by focusing only on the fact that communication is significantly more expensive than local computation; it does not take into account the varying capacities of network devices and the differing relationship between communication time and local computation cost. We consider this setup, and the objective of this study is to achieve an optimal ratio of data distributed between the server and local machines for arbitrary costs of communication and local computation. The running times of the network under uniform and optimal distributions are compared. The superior theoretical performance of our solutions is experimentally validated.

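The server/worker trade-off described in the abstract can be illustrated with a toy makespan-balancing model (our own illustration, not the paper's algorithm; the per-sample times `t_server`, `t_worker` and the communication overhead `t_comm` are hypothetical parameters): the server keeps a fraction x of the n samples, each of m identical workers processes an equal share of the rest, and the optimal x equalizes the server's compute time with the workers' compute-plus-communication time.

```python
def optimal_server_share(n, m, t_server, t_worker, t_comm):
    """Fraction x of the n samples the server should keep so that
    server time  x*n*t_server  equals worker time
    (1-x)*n/m*t_worker + t_comm  (toy balancing model)."""
    x = (n * t_worker / m + t_comm) / (n * t_server + n * t_worker / m)
    return min(max(x, 0.0), 1.0)   # clamp to a valid fraction

def makespan(x, n, m, t_server, t_worker, t_comm):
    """Total wall-clock time of one pass under split fraction x."""
    return max(x * n * t_server, (1 - x) * n / m * t_worker + t_comm)
```

With a nonzero communication cost the balanced split keeps more data on the server than the uniform split x = 1/(m+1), and its makespan is never worse in this toy model.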
Citations: 0
1-Dimensional Topological Invariants to Estimate Loss Surface Non-Convexity
IF 0.6 | CAS Zone 4, Mathematics | Q3 Mathematics | Pub Date: 2024-03-25 | DOI: 10.1134/s1064562423701569
D. S. Voronkova, S. A. Barannikov, E. V. Burnaev

Abstract

We utilize the framework of topological data analysis to examine the geometry of the loss landscape. Using topology and Morse theory, we propose to analyse 1-dimensional topological invariants as a measure of loss-function non-convexity up to arbitrary re-parametrization. The proposed approach uses optimization of 2-dimensional simplices in the network weight space and allows both qualitative and quantitative evaluation of the loss landscape, yielding insights into the behavior and optimization of neural networks. We provide a geometrical interpretation of the topological invariants and describe an algorithm for their computation. We expect that the proposed approach can complement existing tools for loss-landscape analysis and shed light on unresolved issues in the field of deep learning.

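A drastically simplified 1-D analogue of such a non-convexity measure (our illustration only; the paper's invariants are computed over 2-simplices in weight space, not along a line) is to sample the loss along a segment and count how many basins it has beyond the one a convex function would have:

```python
def nonconvexity_score(values):
    """Toy non-convexity measure for a function sampled on a 1-D grid:
    the number of strict local minima beyond the first.  A convex
    sample scores 0; each extra basin adds 1."""
    minima = 0
    for i in range(1, len(values) - 1):
        if values[i] < values[i - 1] and values[i] < values[i + 1]:
            minima += 1
    # the endpoints of the segment can also be local minima
    if len(values) >= 2 and values[0] < values[1]:
        minima += 1
    if len(values) >= 2 and values[-1] < values[-2]:
        minima += 1
    return max(minima - 1, 0)
```

A convex parabola scores 0, while a double-well profile scores 1; unlike the paper's invariants, this toy score is not invariant under re-parametrization of the weights.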
Citations: 0
Safe Pretraining of Deep Language Models in a Synthetic Pseudo-Language
IF 0.6 | CAS Zone 4, Mathematics | Q3 Mathematics | Pub Date: 2024-03-25 | DOI: 10.1134/s1064562423701636
T. E. Gorbacheva, I. Y. Bondarenko

Abstract

This paper compares the pretraining of a transformer on natural-language texts and on sentences of a synthetic pseudo-language. The artificial texts are generated automatically according to rules written in a context-free grammar. Fine-tuning on tasks of the RussianSuperGLUE project showed, with statistical reliability, that the two models achieve the same scores. Thus, the use of artificial texts facilitates AI safety, because the composition of the dataset can be completely controlled. In addition, at the pretraining stage of a RoBERTa-like model, it is enough to learn to recognize only the syntactic and morphological patterns of the language, which can be created successfully in a fairly simple way, such as with a context-free grammar.

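Generating pseudo-language sentences from a context-free grammar can be sketched as follows (a minimal toy grammar of our own; the paper's actual grammar and token inventory are not given here, so the nonterminals and `tok_*` terminals are purely illustrative):

```python
import random

# Toy context-free grammar: each nonterminal maps to a list of
# productions; symbols absent from the dict are terminal tokens.
GRAMMAR = {
    "S":    [["NP", "VP"]],
    "NP":   [["det", "noun"], ["det", "adj", "noun"]],
    "VP":   [["verb", "NP"], ["verb"]],
    "det":  [["tok_det"]],
    "adj":  [["tok_adj1"], ["tok_adj2"]],
    "noun": [["tok_noun1"], ["tok_noun2"]],
    "verb": [["tok_verb1"], ["tok_verb2"]],
}

def generate(symbol="S", rng=random):
    """Sample one sentence (a list of terminal tokens) by recursively
    expanding productions chosen uniformly at random."""
    if symbol not in GRAMMAR:          # terminal: emit the token itself
        return [symbol]
    out = []
    for sym in rng.choice(GRAMMAR[symbol]):
        out.extend(generate(sym, rng))
    return out
```

Sampling `generate()` repeatedly yields a corpus whose composition is fully controlled by the grammar, which is exactly the safety property the abstract emphasizes.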
Citations: 0
Optimal Analysis of Method with Batching for Monotone Stochastic Finite-Sum Variational Inequalities
IF 0.6 | CAS Zone 4, Mathematics | Q3 Mathematics | Pub Date: 2024-03-25 | DOI: 10.1134/s1064562423701582
A. Pichugin, M. Pechin, A. Beznosikov, A. Savchenko, A. Gasnikov

Abstract

Variational inequalities are a universal optimization paradigm that is interesting in itself and also incorporates classical minimization and saddle-point problems. Modern applications encourage the consideration of stochastic formulations of optimization problems. In this paper, we present an analysis of a method that gives optimal convergence estimates for monotone stochastic finite-sum variational inequalities. In contrast to previous works, our method supports batching without losing oracle-complexity optimality. The effectiveness of the algorithm, especially in the case of small but non-single batches, is confirmed experimentally.

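A baseline for this problem class is the stochastic extragradient method with mini-batched operator estimates. The sketch below (our illustration of that classical scheme, not the paper's analysed method; step size, batch size, and noise level are arbitrary choices) solves the simplest monotone variational inequality, the bilinear saddle problem min_x max_y xy with operator F(x, y) = (y, -x):

```python
import random

def extragradient_vi(steps=3000, batch=16, lr=0.1, noise=0.1, seed=0):
    """Stochastic extragradient on F(x, y) = (y, -x), whose solution is
    (0, 0).  Each operator call averages `batch` noisy samples."""
    rng = random.Random(seed)

    def F(x, y):
        fx = sum(y + rng.gauss(0.0, noise) for _ in range(batch)) / batch
        fy = sum(-x + rng.gauss(0.0, noise) for _ in range(batch)) / batch
        return fx, fy

    x, y = 1.0, 1.0
    for _ in range(steps):
        gx, gy = F(x, y)
        xh, yh = x - lr * gx, y - lr * gy      # extrapolation point
        gx, gy = F(xh, yh)
        x, y = x - lr * gx, y - lr * gy        # actual update
    return x, y
```

Plain gradient descent-ascent diverges on this rotation-like operator, while the extrapolation step makes the iteration contract toward (0, 0); batching shrinks the residual noise floor by a factor of sqrt(batch).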
Citations: 0
Barcodes as Summary of Loss Function Topology
IF 0.6 | CAS Zone 4, Mathematics | Q3 Mathematics | Pub Date: 2024-03-25 | DOI: 10.1134/s1064562423701570
S. A. Barannikov, A. A. Korotin, D. A. Oganesyan, D. I. Emtsev, E. V. Burnaev

Abstract

We propose to study neural networks’ loss surfaces by methods of topological data analysis. We suggest applying barcodes of Morse complexes to explore the topology of loss surfaces. An algorithm for computing the barcodes of the loss function’s local minima is described. We have conducted experiments calculating barcodes of local minima for benchmark functions and for loss surfaces of small neural networks. Our experiments confirm two principal observations about neural networks’ loss surfaces. First, the barcodes of local minima are located in a small lower part of the range of values of the loss function. Second, increasing the neural network’s depth and width lowers the barcodes of local minima. This has natural implications for the neural network’s learning and for its generalization properties.

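The core computation, a barcode of local minima, can be shown in a much-simplified setting (our sketch, not the paper's algorithm for Morse complexes in weight space): for a function sampled on a 1-D grid, the 0-dimensional sublevel-set barcode is computed by sweeping the values from low to high and merging basins with a union-find structure. Each bar is born at a local minimum and dies when its basin merges into a deeper one.

```python
def sublevel_barcode(values):
    """0-dim sublevel-set barcode of a function sampled on a 1-D grid.
    Returns sorted (birth, death) pairs; the global minimum's bar is
    essential and never dies (death = inf)."""
    n = len(values)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    order = sorted(range(n), key=lambda i: values[i])
    active = [False] * n
    birth = {}
    bars = []
    for i in order:
        active[i] = True
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and active[j]:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                if birth[ri] > birth[rj]:   # make ri the deeper basin
                    ri, rj = rj, ri
                if values[i] > birth[rj]:   # skip zero-length bars
                    bars.append((birth[rj], values[i]))
                parent[rj] = ri             # the younger basin dies
    bars.append((min(values), float("inf")))
    return sorted(bars)
```

On a double-well profile with two minima at height 0 separated by a barrier at height 1, the barcode contains exactly one finite bar (0, 1), the barrier height the paper's invariants generalize.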
Citations: 0
No Two Users Are Alike: Generating Audiences with Neural Clustering for Temporal Point Processes
IF 0.6 | CAS Zone 4, Mathematics | Q3 Mathematics | Pub Date: 2024-03-25 | DOI: 10.1134/s1064562423701661
V. Zhuzhel, V. Grabar, N. Kaploukhaya, R. Rivera-Castro, L. Mironova, A. Zaytsev, E. Burnaev

Abstract

Identifying the right user to target is a common problem for Internet platforms. Although numerous systems address this task, they are heavily tailored to specific environments and settings, and it is challenging for practitioners to apply their findings to other problems. The reason is that most systems are designed for settings with millions of highly active users and with personal information, as is the case in social networks or other services with high virality. There is a gap in the literature for systems that handle medium-sized data and where the only data available are the event sequences of a user. This motivates us to present Look-A-Liker (LAL), an unsupervised deep clustering system. It uses temporal point processes to identify similar users for targeting tasks. We use data from the leading Internet marketplace for the gastronomic sector for experiments. LAL generalizes beyond proprietary data: using only event sequences of users, it obtains state-of-the-art results compared to novel methods such as Transformer architectures and multimodal learning. Our approach improves the ROC AUC score on real-world datasets by up to 20%, from 0.803 to 0.959. Although LAL focuses on hundreds of thousands of sequences, we show how it quickly scales to millions of user sequences. We provide a fully reproducible implementation with code and datasets at https://github.com/adasegroup/sequence_clusterers.

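The overall pipeline, embed each user's event sequence and cluster the embeddings, can be caricatured with hand-crafted temporal features and plain k-means (our toy stand-in; LAL learns neural point-process embeddings instead of these two statistics, and `featurize`/`kmeans` are hypothetical helpers):

```python
def featurize(seq):
    """Represent an increasing sequence of event times by two crude
    statistics: event count and mean inter-event gap."""
    gaps = [b - a for a, b in zip(seq, seq[1:])]
    return (float(len(seq)), sum(gaps) / len(gaps))

def kmeans(points, k=2, iters=20):
    """Plain Lloyd iterations with a deterministic spread
    initialization (evenly spaced points of the sorted input)."""
    pts = sorted(points)
    centers = [pts[i * (len(pts) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in pts:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centers[c])))
            groups[j].append(p)
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) if g
                   else centers[j] for j, g in enumerate(groups)]
    return centers, groups
```

Even these two features separate "frequent" from "rare" users; the point of the paper is that learned point-process representations do this without manual feature design.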
Citations: 0
Investigation of Neural Network Algorithms for Human Movement Prediction Based on LSTM and Transformers
IF 0.6 | CAS Zone 4, Mathematics | Q3 Mathematics | Pub Date: 2024-03-25 | DOI: 10.1134/s1064562423701624
S. V. Zhiganov, Y. S. Ivanov, D. M. Grabar

Abstract

The problem of predicting the position of a person in future frames of a video stream is solved, and in-depth experimental studies of traditional and SOTA blocks applied to this task are carried out. An original architecture, KeyFNet, and its modifications based on transform blocks are presented; it is able to predict coordinates in the video stream 30, 60, 90, and 120 frames ahead with high accuracy. The novelty lies in a combined algorithm based on multiple FNet blocks with the fast Fourier transform as an attention mechanism concatenating the coordinates of key points. Experiments on Human3.6M and on our own real data confirmed the effectiveness of the proposed FNet-based approach compared to the traditional LSTM-based approach. The proposed algorithm matches the accuracy of advanced models but outperforms them in speed and uses fewer computational resources, and thus can be applied in collaborative robotic solutions.

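The FNet idea referenced above, replacing learned attention with a parameter-free Fourier transform over the token grid, can be sketched in a few lines (our toy sketch of the published FNet mechanism, not the KeyFNet architecture; a naive O(n^2) DFT stands in for the fast FFT a real block would use):

```python
import cmath

def dft(vec):
    """Naive discrete Fourier transform of a 1-D sequence."""
    n = len(vec)
    return [sum(vec[t] * cmath.exp(-2j * cmath.pi * f * t / n)
                for t in range(n))
            for f in range(n)]

def fourier_mixing(tokens):
    """FNet-style token mixing: Fourier-transform along the feature
    axis, then along the sequence axis, and keep the real part.
    `tokens` is a list of token embedding rows."""
    rows = [dft(row) for row in tokens]             # mix features
    cols = [dft(list(col)) for col in zip(*rows)]   # mix positions
    return [[z.real for z in row] for row in zip(*cols)]
```

Every output element depends on every input element, which is the global mixing that attention normally provides, obtained here with no trainable parameters.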
Citations: 0
ESGify: Automated Classification of Environmental, Social, and Corporate Governance Risks
IF 0.6 | CAS Zone 4, Mathematics | Q3 Mathematics | Pub Date: 2024-03-25 | DOI: 10.1134/s1064562423701673
A. Kazakov, S. Denisova, I. Barsola, E. Kalugina, I. Molchanova, I. Egorov, A. Kosterina, E. Tereshchenko, L. Shutikhina, I. Doroshchenko, N. Sotiriadi, S. Budennyy

Abstract

The growing recognition of environmental, social, and governance (ESG) factors in financial decision-making has spurred the need for effective and comprehensive ESG risk assessment tools. In this study, we introduce an open-source natural language processing (NLP) model, “ESGify”, based on the MPNet-base architecture and aimed at classifying texts within the frame of ESG risks. We also present a hierarchical and detailed methodology for ESG risk classification, leveraging the expertise of ESG professionals and global best practices. Anchored by a manually annotated multilabel dataset of 2000 news articles and domain adaptation with texts of sustainability reports, ESGify is developed to automate ESG risk classification following the established methodology. We compare augmentation techniques based on back translation and large language models (LLMs) to improve model quality, achieving a weighted F1 score of 0.5 on the dataset with 47 classes. This result outperforms ChatGPT 3.5 with a simple prompt. The model weights and documentation are hosted on GitHub at https://github.com/sb-ai-lab/ESGify under the Apache 2.0 license.

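The reported metric, support-weighted F1 over a multilabel task, is worth pinning down. A minimal self-contained implementation (our sketch of the standard definition, equivalent to scikit-learn's `average='weighted'`; label sets are represented as Python sets of class indices):

```python
def f1_weighted(y_true, y_pred, n_classes):
    """Support-weighted F1 for multilabel predictions.
    y_true / y_pred are lists of sets of label indices."""
    total_support = 0
    weighted_sum = 0.0
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if c in t and c in p)
        fp = sum(1 for t, p in zip(y_true, y_pred) if c not in t and c in p)
        fn = sum(1 for t, p in zip(y_true, y_pred) if c in t and c not in p)
        support = tp + fn
        if support == 0:          # class absent from the ground truth
            continue
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / support
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        weighted_sum += support * f1
        total_support += support
    return weighted_sum / total_support if total_support else 0.0
```

Weighting by support means the frequent ESG risk classes dominate the score, which matters when 47 classes are as imbalanced as real news-annotation data tends to be.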
Citations: 0
On Barenblatt–Zeldovich Intermediate Asymptotics
IF 0.6 | CAS Zone 4, Mathematics | Q3 Mathematics | Pub Date: 2024-03-14 | DOI: 10.1134/s1064562423701351
V. A. Kostin, D. V. Kostin, A. V. Kostin

Abstract

The concept of intermediate asymptotics for the solution of an evolution equation with initial data and a related solution obtained without initial conditions was introduced by G.N. Barenblatt and Ya.B. Zeldovich in the context of extending the concept of strict determinism in statistical physics and quantum mechanics. Here, according to V.P. Maslov, to axiomatize the mathematical theory, we need to know the conditions satisfied by the initial data of the problem. We show that the correct solvability of a problem without initial conditions for fractional differential equations in a Banach space is a necessary, but not sufficient, condition for intermediate asymptotics. Examples of intermediate asymptotics are given.

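A textbook instance of intermediate asymptotics (our illustration, not one of the paper's examples) is the heat equation, where the solution forgets the details of its initial profile at intermediate times:

```latex
% For u_t = u_{xx} with integrable initial data of mass M concentrated
% on a length scale l, the solution behaves, at times large compared
% with l^2 but before any boundary or external scale intervenes, like
% the self-similar source solution:
u(x,t) \;\sim\; \frac{M}{\sqrt{4\pi t}}\,
\exp\!\left(-\frac{x^{2}}{4t}\right),
\qquad \frac{l^{2}}{t}\to 0 .
```

This limiting self-similar regime is exactly the kind of "solution without initial conditions" whose relation to the initial-value problem the abstract discusses.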
Citations: 0
Existence of a Maximum of Time-Averaged Harvesting in the KPP Model on Sphere with Permanent and Impulse Harvesting
IF 0.6 | CAS Zone 4, Mathematics | Q3 Mathematics | Pub Date: 2024-03-14 | DOI: 10.1134/s1064562423701387
E. V. Vinnikov, A. A. Davydov, D. V. Tunitsky

Abstract

A distributed renewable resource of any nature is considered on a two-dimensional sphere. Its dynamics is described by a model of the Kolmogorov–Petrovsky–Piskunov–Fisher type, and the exploitation of this resource is carried out by constant or periodic impulse harvesting. It is shown that, after choosing an admissible exploitation strategy, the dynamics of the resource tend to limiting dynamics corresponding to this strategy and there is an admissible harvesting strategy that maximizes the time-averaged harvesting of the resource.

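The model class named in the abstract can be sketched in formulas (our notation; the diffusivity D, growth rate r, harvesting effort E, and impulse shares k_i are hypothetical placeholders, not the paper's exact setup):

```latex
% KPP--Fisher dynamics of a resource density u on the sphere S^2 with
% permanent harvesting at effort E:
u_t = D\,\Delta_{S^2} u + r\,u\,(1-u) - E\,u ,
% or with impulse harvesting removing a share k_i at times t_i:
u(t_i^{+},x) = (1-k_i)\,u(t_i^{-},x),
% the objective being the time-averaged harvest
J = \limsup_{T\to\infty}\frac{1}{T}
    \int_{0}^{T}\!\!\int_{S^2} E\,u \;d\sigma\,dt .
```

The abstract's result is that the dynamics under any admissible strategy settle into a corresponding limiting regime, and some admissible strategy maximizes J.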
Citations: 0