
Latest Publications in Optical Memory and Neural Networks

Leveraging Graph Representations to Enhance Critical Path Delay Prediction in Digital Complex Functional Blocks Using Neural Networks
IF 0.8 Q4 OPTICS Pub Date: 2025-12-19 DOI: 10.3103/S1060992X25601691
M. Dashiev, N. Zheludkov, I. Karandashev

Accurate critical path delay estimation plays a vital role in reducing unnecessary routing iterations and identifying potentially unsuccessful design runs early in the flow. This study proposes an architecture that integrates graph representations derived from the netlists and design constraints of digital complex functional blocks, leveraging a multi-head cross-attention mechanism. The architecture significantly improves the accuracy of critical path delay estimation compared to the standard tools provided by the OpenROAD EDA suite. The mean absolute percentage error (MAPE) of OpenROAD's standard timing tool, OpenSTA, is 12.60%, whereas our algorithm achieves a substantially lower error of 7.57%. Various architectures were compared, and the impact of incorporating netlist-derived information was investigated.
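The MAPE metric used for the comparison above can be reproduced on toy data; this is a generic sketch of the metric, and the delay values below are invented for illustration, not taken from the paper:

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

# Hypothetical critical path delays (e.g. in nanoseconds): actual vs. predicted.
true_delay = [2.0, 4.0, 5.0]
pred_delay = [2.2, 3.8, 5.5]
print(round(mape(true_delay, pred_delay), 2))  # -> 8.33
```

Errors of 10%, 5%, and 10% average to 8.33%; the paper reports this statistic over predicted critical path delays.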

Citations: 0
Deep Mapping Algorithm for More Effective Neural Network Training
IF 0.8 Q4 OPTICS Pub Date: 2025-12-19 DOI: 10.3103/S1060992X25700195
H. Shen, V. S. Smolin

The problem of approximating nonlinear vector transformations using neural network algorithms is considered. In addition to the approximation problem itself, one of the reasons why optimization reaches local rather than global minima of the loss function is identified: the "switching off," or "death," of a significant number of neurons during training. A multidimensional neural mapping algorithm is proposed, implemented in software, and numerically investigated; it drastically reduces the influence of this factor on approximation accuracy. The theory and the results of numerical experiments on approximation using neural mapping are presented.
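The neuron "death" referred to above can be illustrated with a toy example: a ReLU layer whose strongly negative biases leave many units permanently inactive on a whole batch. This sketch is illustrative only and is not the authors' neural mapping algorithm; the layer sizes and bias values are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single hidden layer: ReLU(x W + b) with 32 neurons.
X = rng.normal(size=(256, 8))   # batch of 256 random 8-dimensional inputs
W = rng.normal(size=(8, 32))    # weights for 32 hidden neurons
b = np.full(32, -20.0)          # strongly negative biases push neurons toward "death"

pre = X @ W + b
act = np.maximum(pre, 0.0)      # ReLU activations

# A neuron is "dead" on this batch if it never fires: zero output for every input.
dead = int(np.sum(np.all(act == 0.0, axis=0)))
print(f"{dead} of 32 neurons are inactive on the whole batch")
```

Once such a unit stops firing for every input, its gradient through the ReLU is zero and ordinary gradient descent cannot revive it, which is one mechanism for getting stuck in poor local minima.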

Citations: 0
RCDINO: Enhancing Radar–Camera 3D Object Detection with DINOv2 Semantic Features
IF 0.8 Q4 OPTICS Pub Date: 2025-12-19 DOI: 10.3103/S1060992X25601708
O. Matykina, D. Yudin

Three-dimensional object detection is essential for autonomous driving and robotics, relying on effective fusion of multimodal data from cameras and radar. This work proposes RCDINO, a multimodal transformer-based model that enhances visual backbone features by fusing them with semantically rich representations from the pretrained DINOv2 foundation model. This approach enriches visual representations and improves the model’s detection performance while preserving compatibility with the baseline architecture. Experiments on the nuScenes dataset demonstrate that RCDINO achieves state-of-the-art performance among radar–camera models, with 56.4 NDS and 48.1 mAP. Our implementation is available at https://github.com/OlgaMatykina/RCDINO.

Citations: 0
Wire-Structured Object 3D Point Cloud Filtering Using a Transformer Model
IF 0.8 Q4 OPTICS Pub Date: 2025-12-19 DOI: 10.3103/S1060992X25601812
V. Kniaz, V. Knyaz, T. Skrypitsyna, P. Moshkantsev, A. Bordodymov

The rapid reconstruction of partially destroyed cultural heritage objects is crucial in architectural history. Many significant structures have suffered damage from erosion, earthquakes, or human activity, often leaving only the armature intact. Simplified 3D reconstruction techniques using digital cameras and laser rangefinders are essential for these monuments, frequently located in abandoned areas. However, interior surfaces visible through exterior openings complicate reconstruction by introducing outliers in the 3D point cloud. This paper introduces the WireNetV3 model for precise 3D segmentation of wire structures in color images. The model distinguishes between front and interior surfaces, filtering outliers during feature matching. Building on SegFormer 3D and WireNetV2, our approach integrates transformers with task-specific features and introduces a novel loss function, WireSDF, for distance calculation from wire axes. Evaluations on datasets featuring the Shukhov Tower and a church dome demonstrate that WireNetV3 surpasses existing methods in Intersection-over-Union metrics and 3D model accuracy.
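The notion of distance from wire axes that the WireSDF loss builds on can be illustrated with standard point-to-segment geometry. This sketch is not the authors' loss function, only the underlying distance computation; the wire endpoints and query points are invented:

```python
import numpy as np

def dist_to_segment(points, a, b):
    """Euclidean distance from each 3D point to the segment a-b (one wire axis)."""
    ab = b - a
    # Parameter of the closest point on the infinite line, clipped to the segment.
    t = np.clip(((points - a) @ ab) / (ab @ ab), 0.0, 1.0)
    closest = a + t[:, None] * ab
    return np.linalg.norm(points - closest, axis=1)

# A vertical wire axis of length 2 along z.
a, b = np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, 2.0])
pts = np.array([[1.0, 0.0, 1.0],    # 1 unit off the axis, beside the wire
                [0.0, 0.0, 3.0]])   # 1 unit past the far endpoint
print(dist_to_segment(pts, a, b))   # -> [1. 1.]
```

Points on the armature have near-zero distance to some wire axis, while outliers from interior surfaces do not, which is what makes such a distance usable for filtering.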

Citations: 0
Interaction between Learning and Evolution at the Formation of Functional Systems
IF 0.8 Q4 OPTICS Pub Date: 2025-12-19 DOI: 10.3103/S1060992X25601666
V. G. Red’ko, M. S. Burtsev

In the present work, a model of the interaction between learning and evolution at the formation of functional systems is constructed and studied. The behavior of a population of learning agents is analyzed. The agent's control system consists of a set of functional systems, each of which includes a set of elements. The presence or absence of an element in a given functional system is encoded by the binary symbols 1 or 0. Each agent has a genotype and a phenotype, which are encoded by chains of binary symbols representing the concatenated chains of functional systems. A functional system is completely formed when all of its elements are present. The more completely formed functional systems an agent has, the higher its fitness. The evolution of a population of agents proceeds in generations. During each generation, the genotypes of agents do not change, while the phenotypes are optimized via learning, namely, via the formation of new functional systems. The phenotype of an agent at the beginning of a generation is equal to its genotype. At the end of the generation, the number of completely formed functional systems in the agent's phenotype is determined; the larger this number, the higher the agent's fitness. Agents are selected into the new generation with probabilities proportional to their fitness. Each descendant receives the genotype of its parent (with small mutations). Thus, agents are selected according to their phenotypes, which are optimized by learning, while their genotypes are inherited. The model was studied by computer simulation; the effects of the interaction between learning and evolution in the formation of functional systems were analyzed.
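One variant of such a generation loop can be sketched as follows. This is not the authors' code: the system sizes, learning budget, and mutation rate are assumptions, but the structure follows the abstract, with phenotypes starting from genotypes, learning completing some missing elements, fitness-proportional selection, and offspring inheriting mutated genotypes:

```python
import random

random.seed(1)

N_SYS, SYS_LEN = 4, 3          # 4 functional systems of 3 elements each (hypothetical)
POP, LEARN_STEPS, P_MUT = 20, 5, 0.02

def fitness(chain):
    """Number of completely formed functional systems (all elements equal to 1)."""
    return sum(all(chain[i:i + SYS_LEN]) for i in range(0, N_SYS * SYS_LEN, SYS_LEN))

def learn(genotype):
    """Phenotype starts as the genotype; learning sets a few missing elements to 1."""
    p = list(genotype)
    zeros = [i for i, v in enumerate(p) if v == 0]
    for i in random.sample(zeros, min(LEARN_STEPS, len(zeros))):
        p[i] = 1
    return p

def next_generation(population):
    """Select parents proportionally to learned (phenotype) fitness;
    offspring inherit the parent genotype with small mutations."""
    phenotypes = [learn(g) for g in population]
    weights = [fitness(p) + 1e-9 for p in phenotypes]  # tiny floor keeps selection defined
    parents = random.choices(population, weights=weights, k=len(population))
    return [[v ^ (random.random() < P_MUT) for v in g] for g in parents]

pop = [[random.randint(0, 1) for _ in range(N_SYS * SYS_LEN)] for _ in range(POP)]
for _ in range(30):
    pop = next_generation(pop)
print("mean genotype fitness:", sum(fitness(g) for g in pop) / POP)
```

Selection acts only on learned phenotypes while genotypes are inherited, so any rise of genotype fitness over generations is an instance of the learning-guides-evolution (Baldwin-type) interaction the paper studies.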

Citations: 0
Spatial Traces: Enhancing VLA Models with Spatial-Temporal Understanding
IF 0.8 Q4 OPTICS Pub Date: 2025-12-19 DOI: 10.3103/S1060992X25601654
M. A. Patratskiy, A. K. Kovalev, A. I. Panov

Vision-Language-Action models have demonstrated remarkable capabilities in predicting agent movements within virtual environments and real-world scenarios based on visual observations and textual instructions. Although recent research has focused on enhancing spatial and temporal understanding independently, this paper presents a novel approach that integrates both aspects through visual prompting. We introduce a method that projects visual traces of key points from observations onto depth maps, enabling models to capture both spatial and temporal information simultaneously. Experiments in SimplerEnv show that the mean number of successfully solved tasks increased by 4% compared to SpatialVLA and by 19% compared to TraceVLA. Furthermore, we show that this enhancement can be achieved with minimal training data, making it particularly valuable for real-world applications where data collection is challenging. The project page is available at https://ampiromax.github.io/ST-VLA.

Citations: 0
Decoding EEG Data with Deep Learning for Intelligence Quotient Assessment
IF 0.8 Q4 OPTICS Pub Date: 2025-09-17 DOI: 10.3103/S1060992X24601921
Prithwijit Mukherjee, Anisha Halder Roy

Intelligence quotient (IQ) serves as a statistical gauge for evaluating an individual's cognitive prowess. Measuring IQ is a formidable undertaking, mainly due to the intricacy of the human brain's composition. Presently, the assessment of human intelligence relies solely on conventional paper-based psychometric tests. However, these approaches suffer from inherent discrepancies arising from the diversity of test formats and from language barriers. The primary objective of this study is to introduce an innovative, deep-learning-driven methodology for IQ measurement using electroencephalogram (EEG) signals. In this investigation, EEG signals are captured from participants during an IQ assessment session. Based on their test results, participants' IQ levels are then categorized into six distinct tiers: extremely low, borderline, low average, high average, superior, and very superior. An attention-mechanism-based Convolutional Neural Network–modified tanh Long Short-Term Memory (CNN-MTLSTM) model is meticulously devised to classify individuals into these IQ categories from EEG signals. A layer named the "input enhancement layer" is proposed and incorporated into CNN-MTLSTM to improve its prediction accuracy. Notably, the CNN automates the extraction of important information from the computed EEG features, and the newly proposed MTLSTM model serves as the classifier. The paper's contributions encompass the novel MTLSTM architecture and the use of an attention mechanism to enhance the classification accuracy of the CNN-MTLSTM model. The innovative CNN-MTLSTM model, incorporating an attention mechanism within the MTLSTM network, attains a remarkable average accuracy of 97.41% in assessing a person's IQ level.

Citations: 0
Open-Vocabulary Indoor Object Grounding with 3D Hierarchical Scene Graph
IF 0.8 Q4 OPTICS Pub Date: 2025-09-17 DOI: 10.3103/S1060992X25600673
S. Linok, G. Naumov

We propose the OVIGo-3DHSG method: Open-Vocabulary Indoor Grounding of objects using a 3D Hierarchical Scene Graph. OVIGo-3DHSG represents an extensive indoor environment as a hierarchical scene graph derived from sequences of RGB-D frames, utilizing a set of open-vocabulary foundation models and sensor data processing. The hierarchical representation explicitly models spatial relations across floors, rooms, locations, and objects. To effectively address complex queries involving spatial references to other objects, we integrate the hierarchical scene graph with a Large Language Model for multistep reasoning. This integration leverages inter-layer (e.g., room-to-object) and intra-layer (e.g., object-to-object) connections, enhancing spatial contextual understanding. We investigate the semantic and geometric accuracy of the hierarchical representation on Habitat Matterport 3D Semantic multi-floor scenes. Our approach demonstrates efficient scene comprehension and robust object grounding compared to existing methods. Overall, OVIGo-3DHSG demonstrates strong potential for applications requiring spatial reasoning and understanding of indoor environments. Related materials can be found at https://github.com/linukc/OVIGo-3DHSG.
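As a toy illustration of the floor-room-object hierarchy (not the OVIGo-3DHSG implementation; the scene contents below are invented), grounding an object query amounts to walking the inter-layer edges from floors through rooms to objects:

```python
# Hypothetical hierarchical scene: floor -> room -> objects.
scene = {
    "floor_1": {
        "kitchen": ["sink", "mug", "table"],
        "hallway": ["shoe rack"],
    },
    "floor_2": {
        "bedroom": ["bed", "lamp"],
    },
}

def ground(scene, obj):
    """Return every (floor, room) pair whose object layer contains `obj`."""
    return [(floor, room)
            for floor, rooms in scene.items()
            for room, objects in rooms.items()
            if obj in objects]

print(ground(scene, "lamp"))  # -> [('floor_2', 'bedroom')]
```

In the paper this lookup is far richer: layers are built from RGB-D data with open-vocabulary models, and an LLM reasons over the same inter-layer and intra-layer connections to resolve queries that reference other objects.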

Citations: 0
AS-ODB: Multivariate Attention Supervised Learning Based Optimized DBN Approach for Cloud Workload Prediction
IF 0.8 Q4 OPTICS Pub Date: 2025-09-17 DOI: 10.3103/S1060992X25700122
G. M. Kiran, A. Aparna Rajesh, D. Basavesha

On-demand cloud computing makes it feasible to access a centralized shared pool of computing resources. Accurate estimation of cloud workload is necessary for optimal performance and effective use of those resources, and because cloud workloads are dynamic and unpredictable, this is a challenging problem. Deep learning, when trained appropriately, can provide a reliable foundation for workload prediction in data centres. In the proposed model, efficient workload prediction is carried out using a novel deep learning approach; efficient management of its hyperparameters can significantly improve the neural network's performance. Using the data centre's workload traces at many consecutive time steps, the suggested approach is shown to estimate Central Processing Unit (CPU) utilization. The method collects raw data retrieved from storage, including the number and type of requests, virtual machine (VM) costs, and resource usage. Patterns and oscillations in the workload trace are discovered by preprocessing the data, which increases the model's predictive efficacy. During preprocessing, the KCR approach, min–max normalization, and data cleaning are used to select the important properties from the raw data samples, eliminate noise, and normalize them. A sliding window then converts the multivariate data into time series suitable for supervised deep learning. Finally, a deep belief network based on green anaconda optimization (GrA-DBN) is utilized to attain precise workload forecasting. Experimental results comparing the suggested methodology with existing models show that it provides a better trade-off between accuracy and training time, achieving an execution time of 28.5 s and an accuracy rate of 93.60%. According to the simulation results, the GrA-DBN workload prediction method outperforms the other algorithms.
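The min-max normalization and sliding-window supervised framing described above can be sketched generically; the window length and the two-feature toy trace below are assumptions for illustration, not the paper's data:

```python
import numpy as np

def sliding_window(series, window):
    """Frame a multivariate series as supervised pairs:
    X[i] = rows i..i+window-1, y[i] = row i+window (next-step target)."""
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X, y

# Toy trace: 10 time steps, 2 features (e.g. CPU and memory utilization).
trace = np.arange(20, dtype=float).reshape(10, 2)

# Min-max normalization scales each feature column into [0, 1].
trace = (trace - trace.min(0)) / (trace.max(0) - trace.min(0))

X, y = sliding_window(trace, window=3)
print(X.shape, y.shape)  # -> (7, 3, 2) (7, 2)
```

Each training sample pairs the last three normalized observations with the next one, which is the shape of input a recurrent or DBN-style predictor consumes.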

可实现的随需应变云计算使得访问集中的共享计算资源池成为可能。准确估计云工作负载对于优化性能和有效使用云计算资源是必要的。因为云工作负载是动态的和不可预测的,所以这是一个有问题的问题。在这种情况下,经过适当的训练,深度学习可以为数据中心的工作负载预测提供可靠的基础。在该模型中,利用新颖的深度学习实现了高效的工作负荷预测。对这些超参数的有效管理可以显著提高神经网络模型的性能。使用数据中心在许多连续时间步长的工作负载跟踪,所建议的方法能够估计中央处理单元(Central Processing Unit, CPU)的利用率。收集从存储检索到的原始数据,包括请求的数量和类型、虚拟机(vm)成本和资源使用情况。通过对数据进行预处理,发现工作负载跟踪中的模式和振荡,从而提高该模型的预测效率。在数据预处理过程中,使用KCR方法、最小最大归一化和数据清洗从原始数据样本中选择重要属性,消除噪声并进行归一化。之后,使用滑动窗口进行深度学习处理,通过监督学习将多变量数据转换为时间序列。其次,利用基于绿蟒蛇优化的深度信念网络(GrA-DBN)实现精确的工作量预测。实验结果表明,该方法能较好地平衡训练时间和准确率。该方法具有较高的性能,执行时间为28.5 s,准确率为93.60%。仿真结果表明,GrA-DBN工作负载预测方法的性能优于其他算法。
AS-ODB: Multivariate Attention Supervised Learning Based Optimized DBN Approach for Cloud Workload Prediction
G. M. Kiran, A. Aparna Rajesh, D. Basavesha
Optical Memory and Neural Networks, vol. 34, no. 3, pp. 389–401 (2025). DOI: 10.3103/S1060992X25700122
Citations: 0
M3DMap: Object-Aware Multimodal 3D Mapping for Dynamic Environments
IF 0.8 Q4 OPTICS Pub Date : 2025-09-17 DOI: 10.3103/S1060992X25700092
D. A. Yudin

3D mapping in dynamic environments poses a challenge for modern researchers in robotics and autonomous transportation. There are no universal representations for dynamic 3D scenes that incorporate multimodal data such as images, point clouds, and text. This article takes a step toward solving this problem. It proposes a taxonomy of methods for constructing multimodal 3D maps, classifying contemporary approaches based on scene types and representations, learning methods, and practical applications. Using this taxonomy, a brief structured analysis of recent methods is provided. The article also describes an original modular method called M3DMap, designed for object-aware construction of multimodal 3D maps for both static and dynamic scenes. It consists of several interconnected components: a neural multimodal object segmentation and tracking module; an odometry estimation module, including trainable algorithms; a module for 3D map construction and updating with various implementations depending on the desired scene representation; and a multimodal data retrieval module. The article highlights original implementations of these modules and their advantages in solving various practical tasks, from 3D object grounding to mobile manipulation. Additionally, it presents theoretical propositions demonstrating the positive effect of using multimodal data and modern foundational models in 3D mapping methods. Details of the taxonomy and method implementation are available at https://yuddim.github.io/M3DMap.
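The modular decomposition described above — segmentation/tracking, odometry estimation, map construction and updating, and multimodal retrieval — can be sketched as a minimal pipeline skeleton. All class and method names here are hypothetical illustrations of the described structure, not the authors' API; the module bodies are placeholders where the paper's trainable components would go.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Observation:
    """One multimodal frame: an image, a point cloud, an optional text label."""
    image: object
    point_cloud: object
    text: Optional[str] = None

@dataclass
class MultimodalMap3D:
    """Object-aware 3D map: tracked objects with positions and multimodal features."""
    objects: dict = field(default_factory=dict)

    def update(self, track_id, position, features):
        self.objects[track_id] = {"position": position, "features": features}

    def retrieve(self, query):
        # Toy retrieval: return ids of objects whose text features mention the query.
        return [tid for tid, o in self.objects.items()
                if query in o["features"].get("text", "")]

class Pipeline:
    """Skeleton of the four interconnected modules."""
    def __init__(self):
        self.map = MultimodalMap3D()
        self.pose = (0.0, 0.0, 0.0)

    def step(self, obs: Observation):
        self.pose = self.estimate_odometry(obs)              # odometry module
        for tid, pos, feats in self.segment_and_track(obs):  # segmentation/tracking
            self.map.update(tid, pos, feats)                 # map construction/update
        return self.map

    def estimate_odometry(self, obs):
        return self.pose  # placeholder for a trainable odometry estimator

    def segment_and_track(self, obs):
        # Placeholder for a neural multimodal segmentation and tracking model.
        return [(0, self.pose, {"text": obs.text or ""})]

pipe = Pipeline()
pipe.step(Observation(image=None, point_cloud=None, text="red chair"))
print(pipe.map.retrieve("chair"))  # → [0]
```

The point of the sketch is the interface: each module can be swapped for a different implementation (e.g. a different scene representation in the map module) without touching the others.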

Optical Memory and Neural Networks, vol. 34, no. 3, pp. 285–312 (2025). DOI: 10.3103/S1060992X25700092
Citations: 0