Pub Date: 2024-09-19 | DOI: 10.1007/s00521-024-10373-1
Romulo Augusto Aires Soares, Alexandre Cesar Muniz de Oliveira, Paulo Rogerio de Almeida Ribeiro, Areolino de Almeida Neto
This paper presents a new strategy that uses multiple neural networks in conjunction with the DEtection TRansformer (DETR) network to detect firearms in surveillance images. The methodology promotes collaboration and self-coordination of networks in the fully connected layers of DETR through the technique of multiple self-coordinating artificial neural networks (MANN), which does not require a coordinator: the networks are trained one after the other, and their outputs are integrated without any extra coordinating element. The results indicate that the proposed network is highly effective in firearm detection, achieving a precision of 84% while also performing classification.
Title: Firearm detection using DETR with multiple self-coordinated neural networks
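One plausible reading of the coordinator-free self-coordination described above is a residual-style ensemble: each member network is trained on the error left by the networks before it, and the final output is a plain sum. The sketch below illustrates that idea on a toy regression task with least-squares "networks"; it is an interpretation for illustration, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression target standing in for a detection head's output.
X = rng.normal(size=(200, 4))
y = X @ np.array([1.0, -2.0, 0.5, 3.0]) + 0.1 * rng.normal(size=200)

def fit_linear(X, target):
    """Least-squares 'network' used as a stand-in for one member net."""
    w, *_ = np.linalg.lstsq(X, target, rcond=None)
    return w

# Self-coordination sketch: train nets one after the other, each on the
# residual left by the nets trained before it; the final output is the
# plain sum of all member outputs -- no coordinator module.
weights = []
residual = y.copy()
for _ in range(3):
    w = fit_linear(X, residual)
    weights.append(w)
    residual = residual - X @ w

ensemble_pred = sum(X @ w for w in weights)
mse = float(np.mean((ensemble_pred - y) ** 2))
```

Because each net only sees what its predecessors left unexplained, adding a net can refine the ensemble without any element that arbitrates between members.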
Pub Date: 2024-09-19 | DOI: 10.1007/s00521-024-10379-9
Ivan David Briceño-Pinzón, Raquel Maria de Oliveira Pires, Geraldo Andrade Carvalho, Flávia Barbosa Silva Botelho, Júlia Lima Baute, Marcela Carlota Nery
The X-ray method, together with image analysis tools, has been used to evaluate the internal structures of seeds and correlate them with physical, physiological, and sanitary quality, providing significant and accurate results. The objective of this study was to analyze radiographic images of rice seeds infested by the rice weevil Sitophilus oryzae (Linnaeus, 1763) (Coleoptera: Curculionidae). Rice seed samples from three cultivars were infested with S. oryzae for 90 days. Seed samples collected at random were then analyzed by X-ray testing, and the radiographic images were processed with ImageJ® software to extract color and shape features. Scanning electron microscopy analyses were also performed. The results showed that X-ray testing was effective in detecting infestation: gray-level distribution histograms revealed differences between healthy seeds and those infested by adult insects or empty seeds, confirmed by significant differences in the area, relative density, and integrated density variables. The study demonstrated that the analysis of radiographic images can provide quantitative information on insect infestation of rice seeds, useful for evaluating seed quality and detecting the presence of pests.
Title: Potential analysis of radiographic images to determine infestation of rice seeds
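The ImageJ-style measurements mentioned above (area, mean gray value, integrated density) are straightforward to reproduce. A minimal sketch on synthetic radiographs, assuming a fixed gray threshold separates seed tissue from weevil cavities; the threshold and pixel values are illustrative, not the study's:

```python
import numpy as np

# Synthetic 8-bit radiographs: a 'healthy' seed is uniformly dense (bright),
# an infested one has a dark cavity eaten out by the weevil.
healthy = np.full((64, 64), 180, dtype=np.uint8)
infested = healthy.copy()
infested[16:48, 16:48] = 60  # cavity -> lower gray values

def descriptors(img, threshold=100):
    """Area / mean / integrated-density style measurements as in ImageJ."""
    mask = img > threshold                 # pixels counted as seed tissue
    area = int(mask.sum())
    mean_gray = float(img[mask].mean())
    integrated_density = area * mean_gray  # ImageJ's IntDen = Area * Mean
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    return area, integrated_density, hist

area_h, intden_h, hist_h = descriptors(healthy)
area_i, intden_i, hist_i = descriptors(infested)
```

Infestation shows up as a smaller measured area and lower integrated density, and as a second mode in the gray-level histogram, which is the kind of difference the study reports between healthy and infested seeds.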
Pub Date: 2024-09-19 | DOI: 10.1007/s00521-024-10411-y
Delmiro D. Sampaio-Neto, Telmo M. Silva Filho, Renata M. C. R. Souza
Most recommendation systems are built on numerical or categorical (i.e., traditional) data. Such data can be limiting when modeling complex concepts with internal variability or internal structure. To overcome these limitations, symbolic data are used, in which values can take different forms such as intervals, lists, or histograms. This work introduces a unified approach to constructing both content-based and collaborative-filtering recommendation systems using modal variables for users and items. In the content-based system, user and item profiles are created from modal representations of their features, and a list of items is matched against a user profile. In collaborative filtering, user profiles are built and users are grouped into neighborhoods; products rated by users in a neighborhood are recommended based on the similarity between each neighbor and the user receiving the recommendation. Experiments on a movie-domain dataset evaluate the effectiveness of the proposed approach. The results suggest that it generates ranked lists of higher quality than previous methods using symbolic data. Specifically, the lists created through the proposed method exhibit higher normalized discounted cumulative gain and, in qualitative terms, showcase more diverse content.
Title: Recommendation systems with user and item profiles based on symbolic modal data
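Normalized discounted cumulative gain, the ranking metric cited above, can be computed directly. A minimal sketch with hypothetical relevance judgments (the two rankings and their scores below are invented for illustration, not the paper's results):

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance scores."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(ranked, all_relevances):
    """NDCG = DCG of the produced ranking over DCG of the ideal ranking."""
    best = dcg(sorted(all_relevances, reverse=True))
    return dcg(ranked) / best if best > 0 else 0.0

# Hypothetical relevance judgments for two rankings of the same five movies:
# a good ranking puts highly relevant items first, a weak one buries them.
good_ranking = [3, 2, 3, 0, 1]
weak_ranking = [1, 0, 2, 3, 3]
score_good = ndcg(good_ranking, good_ranking)
score_weak = ndcg(weak_ranking, weak_ranking)
```

The logarithmic discount means misplacing a relevant item near the top costs more than misplacing one near the bottom, which is why NDCG is a natural quality measure for recommendation lists.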
This study investigates the use of electroencephalography (EEG) to characterize emotions and provides insights into the consistency between self-reported and machine learning outcomes. Thirty participants engaged in five virtual reality environments designed to elicit specific emotions while their brain activity was recorded. Participants self-assessed their ground-truth emotional state in terms of Arousal and Valence through a Self-Assessment Manikin. A Gradient Boosted Decision Tree was adopted as the classification algorithm to test the feasibility of EEG for characterizing emotional states. Distinctive patterns of neural activation corresponding to different levels of Valence and Arousal emerged, and a noteworthy correspondence between the self-assessments and the classifier outputs suggests that EEG-based affective indicators can be successfully applied to emotional characterization, and possibly even used as ground-truth measurements. These findings provide compelling evidence for the validity of EEG as a tool for emotion characterization and contribute to a better understanding of emotional activation.
Title: Effective affective EEG-based indicators in emotion-evoking VR environments: an evidence from machine learning
Authors: Ivonne Angelica Castiblanco Jimenez, Elena Carlotta Olivetti, Enrico Vezzetti, Sandro Moos, Alessia Celeghin, Federica Marcolin
Pub Date: 2024-09-19 | DOI: 10.1007/s00521-024-10240-z
Pub Date: 2024-09-19 | DOI: 10.1007/s00521-024-10415-8
Orcun Yildiz, Krishnan Raghavan, Henry Chan, Mathew J. Cherukara, Prasanna Balaprakash, Subramanian Sankaranarayanan, Tom Peterka
X-ray Bragg coherent diffraction imaging is a powerful technique for 3D materials characterization. However, obtaining X-ray diffraction data is difficult and computationally intensive, motivating the need for automated processing of coherent diffraction images, with the goal of minimizing the number of X-ray datasets needed. We automate a machine learning approach to identify crystalline line defects in samples from the raw coherent diffraction data, in a workflow coupling coherent diffraction data generation with training and inference of deep neural network defect classifiers. In particular, we adopt a continual learning approach, where we generate training data as needed based on the accuracy of the defect classifier instead of generating all training data a priori. Moreover, we develop a novel data generation mechanism to improve the efficiency of defect identification beyond the previously published continual learning approach. We call the improved method smart continual learning. The results show that our approach improves the accuracy of defect classifiers and reduces training data requirements by up to 98% compared with prior approaches.
Title: Automated defect identification in coherent diffraction imaging with smart continual learning
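The accuracy-driven data generation described above can be sketched as a loop that only invokes the (expensive) simulator while the classifier remains below a target accuracy, instead of generating the whole training set a priori. The simulator, classifier, and thresholds below are toy stand-ins, not the paper's pipeline:

```python
import numpy as np

rng = np.random.default_rng(3)

def simulate_and_label(n):
    """Toy stand-in for the expensive coherent-diffraction simulator:
    returns feature vectors and defect / no-defect labels."""
    X = rng.normal(size=(n, 2))
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)
    return X, y

def fit_logistic(X, y, epochs=200, lr=0.5):
    """Plain logistic regression standing in for the defect classifier."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ w, -30, 30)))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def accuracy(w, X, y):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return float((((Xb @ w) > 0) == (y > 0.5)).mean())

X_val, y_val = simulate_and_label(500)

# Continual-learning sketch: request new training batches on demand only
# while the current classifier is below the target accuracy.
X_train, y_train = simulate_and_label(50)
batches_used = 1
w = fit_logistic(X_train, y_train)
acc = accuracy(w, X_val, y_val)
while acc < 0.95 and batches_used < 20:
    X_new, y_new = simulate_and_label(50)   # generate one batch on demand
    X_train = np.vstack([X_train, X_new])
    y_train = np.concatenate([y_train, y_new])
    batches_used += 1
    w = fit_logistic(X_train, y_train)
    acc = accuracy(w, X_val, y_val)
```

Stopping as soon as the classifier is good enough is what produces the large savings in generated training data that the paper reports (up to 98% in their setting).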
Pub Date: 2024-09-19 | DOI: 10.1007/s00521-024-10422-9
Pedro A. Villa-García, Raúl Alonso-Calvo, Miguel García-Remesal
A novel methodology is introduced for extracting entities from noisy scanned documents by using end-to-end data and reformulating the entity extraction task as a text summarization problem. This approach offers two significant advantages over traditional entity extraction methods while maintaining comparable performance. First, it utilizes preexisting data to construct datasets, thereby eliminating the need for labor-intensive annotation procedures. Second, it employs multitask learning, enabling the training of a model via a single dataset. To evaluate our approach against state-of-the-art methods, we adapted three commonly used datasets, namely, Conference on Natural Language Learning (CoNLL++), few-shot named entity recognition (Few-NERD), and WikiNEuRal domain adaptation (WikiNEuRal + DA), to the format required by our methodology. We subsequently fine-tuned four sequence-to-sequence models: text-to-text transfer transformer (T5), fine-tuned language net T5 (FLAN-T5), bidirectional autoregressive transformer (BART), and pretraining with extracted gap sentences for abstractive summarization sequence-to-sequence models (PEGASUS). The results indicate that, in the absence of optical character recognition (OCR) noise, the BART model performs comparably to state-of-the-art methods. Furthermore, the performance degradation was limited to 3.49–5.23% when 39–62% of the sentences contained OCR noise. This performance is significantly superior to that of previous studies, which reported a 10–20% decrease in the F1 score with texts that had a 20% OCR error rate. Our experimental results demonstrate that a single model trained via our methodology can reliably extract entities from noisy OCRed texts, unlike existing state-of-the-art approaches, which require separate models for correcting OCR errors and extracting entities.
Title: End-to-end entity extraction from OCRed texts using summarization models
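Reformulating entity extraction as summarization requires serializing gold entities into target strings that a sequence-to-sequence model can be trained to emit. A minimal sketch of one such serialization from BIO tags; the exact target format used by the authors is not stated here, so this layout is an illustrative assumption:

```python
def entities_to_summary(tokens, tags):
    """Serialize BIO-tagged tokens into a 'summary' target string of the
    form 'LABEL: span; LABEL: span' (format is illustrative)."""
    spans, current = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                spans.append(current)
            current = (tag[2:], [token])
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1].append(token)
        else:                      # 'O' tag or inconsistent continuation
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return "; ".join(f"{label}: {' '.join(words)}" for label, words in spans)

# Hypothetical sentence with BIO annotations.
tokens = ["Ada", "Lovelace", "visited", "London", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]
summary = entities_to_summary(tokens, tags)  # "PER: Ada Lovelace; LOC: London"
```

Pairing raw (possibly OCR-noisy) text with such targets lets a single summarization model learn both error tolerance and extraction, which is the core of the end-to-end formulation.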
Pub Date: 2024-09-19 | DOI: 10.1007/s00521-024-10427-4
Richard Ningthoujam, Keisham Pritamdas, Loitongbam Surajkumar Singh
An object detection model based on transfer learning comprises feature extraction and detection layers. YOLOv2 is among the fastest detection algorithms and can use various pretrained classifier networks for feature extraction. However, reducing the number of network layers while increasing the mean average precision (mAP) is challenging. The Darknet-19-based YOLOv2 model achieved an mAP of 76.78% with fewer layers than other existing models. This work proposes a modification that adds layers to enhance feature extraction and further increase the model's mAP. Moreover, the initial weights of the new layers can be random or deterministic, and are fine-tuned during training. We introduce a block of layers initialized with deterministic weights derived from several edge-detection filters. Integrating this block into the Darknet-19-based object detection model improves the mAP to 85.94%, outperforming existing models in both mAP and number of layers.
Title: Edge detective weights initialization on Darknet-19 model for YOLOv2-based facemask detection
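Deterministic initialization from edge-detection filters, as proposed above, can be sketched by tiling classic kernels (Sobel, Prewitt, Laplacian) into a convolution weight tensor. The particular filter bank and cycling scheme below are illustrative assumptions, not the paper's exact recipe:

```python
import numpy as np

# Classic 3x3 edge-detection kernels used as deterministic initial weights.
sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float32)
sobel_y = sobel_x.T
laplacian = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=np.float32)
prewitt_x = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]], dtype=np.float32)

def edge_init(out_channels, in_channels,
              bank=(sobel_x, sobel_y, laplacian, prewitt_x)):
    """Build an (out, in, 3, 3) conv-weight tensor by cycling through the
    edge filters instead of drawing random values; the weights are then
    fine-tuned during training like any other parameters."""
    w = np.empty((out_channels, in_channels, 3, 3), dtype=np.float32)
    for o in range(out_channels):
        w[o, :] = bank[o % len(bank)]
    return w

weights = edge_init(8, 3)
```

Starting from oriented edge responses gives the first feature maps a sensible inductive bias, which is the intuition behind replacing random initialization with deterministic filter weights.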
Pub Date: 2024-09-19 | DOI: 10.1007/s00521-024-10362-4
Asim Naveed, Syed S. Naqvi, Tariq M. Khan, Shahzaib Iqbal, M. Yaqoob Wani, Haroon Ahmed Khan
Skin lesion segmentation is important in computer-aided diagnosis tools for the early diagnosis and treatment of skin cancer. However, achieving precise segmentation is challenging due to inherent variations in appearance, contrast, and texture, and to blurry lesion boundaries. This research presents a robust approach utilizing a dilated convolutional residual network that incorporates an attention-based spatial feature enhancement block (ASFEB) and employs a guided decoder strategy. In each dilated convolutional residual block, dilated convolution broadens the receptive field using varying dilation rates. To improve the spatial feature information of the encoder, the ASFEB is applied in the skip connections; it combines feature maps obtained from average- and max-pooling operations and weights the combined features using the output of global average pooling and convolution operations. Additionally, a guided decoder strategy is incorporated, in which each decoder block is optimized with an individual loss function to enhance feature learning in the proposed AD-Net. AD-Net requires fewer model parameters than peer methods; this reduction directly lowers the amount of labeled data needed for training and facilitates faster convergence. The effectiveness of AD-Net was evaluated on four public benchmark datasets, and a Wilcoxon signed-rank test was conducted to verify its efficiency. The outcomes suggest that our method surpasses other cutting-edge methods in performance, even without data augmentation strategies.
Title: AD-Net: Attention-based dilated convolutional residual network with guided decoder for robust skin lesion segmentation
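The claim that dilation broadens the receptive field follows from the effective kernel size k_eff = k + (k - 1)(d - 1) for kernel size k and dilation rate d. A small sketch computing the receptive field of stacked stride-1 dilated convolutions; the dilation rates used below are illustrative, not AD-Net's:

```python
def effective_kernel(k, dilation):
    """Effective extent of a k x k kernel with a given dilation rate."""
    return k + (k - 1) * (dilation - 1)

def receptive_field(layers):
    """Receptive field of stacked stride-1 convolutions, given (k, dilation)
    per layer: each layer adds (k_eff - 1) pixels of context."""
    rf = 1
    for k, d in layers:
        rf += effective_kernel(k, d) - 1
    return rf

# Three 3x3 convolutions with growing dilation rates, as in a dilated
# residual block: the receptive field grows to 15 pixels, versus 7 for
# the same stack with no dilation.
rf_dilated = receptive_field([(3, 1), (3, 2), (3, 4)])
rf_plain = receptive_field([(3, 1), (3, 1), (3, 1)])
```

This is why varying dilation rates capture lesion context at multiple scales without adding parameters or reducing spatial resolution.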
Pub Date: 2024-09-18 | DOI: 10.1007/s00521-024-10352-6
Sangram K. Jena, K. Subramani, Alvaro Velasquez
Kidney transplantation is vital for treating end-stage renal disease, which affects roughly one in a thousand Europeans. The search for a suitable deceased donor often entails prolonged and uncertain wait times, making living-donor transplants a viable alternative. However, approximately 40% of living donors are incompatible with their intended recipients. Many countries have therefore established kidney exchange programs, allowing patients with incompatible donors to participate in “swap” arrangements, exchanging donors with other patients in similar situations. Several variants of the vertex-disjoint cycle cover problem model this setting, each capturing a different aspect of kidney exchange. This paper discusses several specific vertex-disjoint cycle cover variants and focuses on finding exact solutions. We employ the dataless neural networks framework to construct a single differentiable function for each variant. Recent research highlights this framework's effectiveness in representing several combinatorial optimization problems. Inspired by these findings, we propose customized dataless neural networks for vertex-disjoint cycle cover variants. We derive a differentiable function for each variant and prove that the function attains its minimum value if an exact solution is found for the corresponding problem variant. We also provide proof of the correctness of our approach.
{"title":"Designing dataless neural networks for kidney exchange variants","authors":"Sangram K. Jena, K. Subramani, Alvaro Velasquez","doi":"10.1007/s00521-024-10352-6","DOIUrl":"https://doi.org/10.1007/s00521-024-10352-6","url":null,"abstract":"<p>Kidney transplantation is vital for treating end-stage renal disease, which affects roughly one in a thousand Europeans. The search for a suitable deceased donor often entails prolonged and uncertain wait times, making living donor transplants a viable alternative. However, approximately 40% of living donors are incompatible with their intended recipients. Many countries have therefore established kidney exchange programs, which allow patients with incompatible donors to enter “swap” arrangements, exchanging donors with other patients in similar situations. Several variants of the vertex-disjoint cycle cover problem model this setting, each capturing a different aspect of kidney exchange. This paper discusses several specific vertex-disjoint cycle cover variants and focuses on finding exact solutions. We employ the dataless neural networks framework to construct a single differentiable function for each variant. Recent research highlights the framework’s effectiveness in representing several combinatorial optimization problems. Inspired by these findings, we propose customized dataless neural networks for vertex-disjoint cycle cover variants. We derive a differentiable function for each variant and prove that the function attains its minimum value if an exact solution to the corresponding problem variant is found. We also prove the correctness of our approach.</p>","PeriodicalId":18925,"journal":{"name":"Neural Computing and Applications","volume":"3 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142251064","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
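As an illustration of the dataless idea described in the abstract above (a minimal sketch under assumed choices, not the authors' exact construction), a vertex-disjoint cycle cover admits a simple differentiable loss: relax each edge indicator into (0, 1) via a sigmoid, penalize every vertex whose chosen out-degree or in-degree differs from one, and add a term pushing indicators toward {0, 1}. The loss reaches zero exactly when the rounded indicators form a valid cycle cover. The graph, penalty weight `lam`, and numerical-gradient solver below are all illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch of a dataless differentiable loss for vertex-disjoint
# cycle cover (not the paper's exact function). Edge variables
# x_e = sigmoid(theta_e) live in (0, 1); the loss is zero exactly when every
# vertex has one chosen outgoing and one chosen incoming edge and all x_e
# are integral.

def cycle_cover_loss(theta, edges, n, lam=0.1):
    x = 1.0 / (1.0 + np.exp(-theta))              # relaxed edge indicators
    out_deg, in_deg = np.zeros(n), np.zeros(n)
    for k, (u, v) in enumerate(edges):
        out_deg[u] += x[k]
        in_deg[v] += x[k]
    degree_pen = np.sum((out_deg - 1) ** 2) + np.sum((in_deg - 1) ** 2)
    integral_pen = lam * np.sum(x * (1 - x))      # pushes x_e toward {0, 1}
    return degree_pen + integral_pen

def solve(edges, n, steps=2000, lr=0.5, seed=0):
    rng = np.random.default_rng(seed)
    theta = rng.normal(scale=0.1, size=len(edges))
    eps = 1e-6
    for _ in range(steps):
        # central-difference gradient keeps the sketch dependency-free
        grad = np.array([
            (cycle_cover_loss(theta + eps * np.eye(len(theta))[k], edges, n)
             - cycle_cover_loss(theta - eps * np.eye(len(theta))[k], edges, n))
            / (2 * eps)
            for k in range(len(theta))
        ])
        theta -= lr * grad
    return (1.0 / (1.0 + np.exp(-theta)) > 0.5).astype(int)

# A directed triangle: its only cycle cover uses all three edges.
edges = [(0, 1), (1, 2), (2, 0)]
print(solve(edges, 3))  # -> [1 1 1]
```

Gradient descent drives the loss toward zero, and rounding the relaxed indicators recovers the cover; on instances with no cycle cover, the loss stays bounded away from zero.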
Pub Date : 2024-09-18DOI: 10.1007/s00521-024-10139-9
Fabio Merizzi, Andrea Asperti, Stefano Colamonaco
The Copernicus Regional Reanalysis for Europe (CERRA) is a high-resolution regional reanalysis dataset for the European domain. In recent years, it has shown significant utility across various climate-related tasks, ranging from forecasting and climate change research to renewable energy prediction, resource management, air quality risk assessment, and the forecasting of rare events, among others. Unfortunately, the availability of CERRA lags two years behind the current date, owing to constraints in acquiring the requisite external data and the intensive computational demands inherent in its generation. As a solution, this paper introduces a novel method that uses diffusion models to approximate CERRA downscaling in a data-driven manner, without additional information. By leveraging the lower-resolution ERA5 dataset, which provides boundary conditions for CERRA, we approach this as a super-resolution task. Focusing on wind speed around Italy, our model, trained on existing CERRA data, shows promising results, closely mirroring the original CERRA. Validation with in situ observations further confirms the model’s accuracy in approximating ground measurements.
{"title":"Wind speed super-resolution and validation: from ERA5 to CERRA via diffusion models","authors":"Fabio Merizzi, Andrea Asperti, Stefano Colamonaco","doi":"10.1007/s00521-024-10139-9","DOIUrl":"https://doi.org/10.1007/s00521-024-10139-9","url":null,"abstract":"<p>The Copernicus Regional Reanalysis for Europe (CERRA) is a high-resolution regional reanalysis dataset for the European domain. In recent years, it has shown significant utility across various climate-related tasks, ranging from forecasting and climate change research to renewable energy prediction, resource management, air quality risk assessment, and the forecasting of rare events, among others. Unfortunately, the availability of CERRA lags two years behind the current date, owing to constraints in acquiring the requisite external data and the intensive computational demands inherent in its generation. As a solution, this paper introduces a novel method that uses diffusion models to approximate CERRA downscaling in a data-driven manner, without additional information. By leveraging the lower-resolution ERA5 dataset, which provides boundary conditions for CERRA, we approach this as a super-resolution task. Focusing on wind speed around Italy, our model, trained on existing CERRA data, shows promising results, closely mirroring the original CERRA. Validation with in situ observations further confirms the model’s accuracy in approximating ground measurements.</p>","PeriodicalId":18925,"journal":{"name":"Neural Computing and Applications","volume":"65 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142251072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
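To make the super-resolution framing in the abstract above concrete, the sketch below (a schematic assumption, not the paper's model or data pipeline) shows how a DDPM-style conditional training pair can be built: the high-resolution field is noised by the standard forward process, and a denoiser would be trained to predict that noise given the noisy field, the timestep, and the upsampled low-resolution field stacked as a conditioning channel. The toy patch sizes, the cosine noise schedule, and nearest-neighbor upsampling (in place of bilinear) are illustrative choices.

```python
import numpy as np

# Schematic sketch of conditional diffusion for low-res -> high-res
# super-resolution (illustrative, not the paper's architecture).

def cosine_alpha_bar(t, T):
    """Cumulative noise schedule alpha-bar_t (cosine schedule)."""
    f = lambda s: np.cos((s / T + 0.008) / 1.008 * np.pi / 2) ** 2
    return f(t) / f(0)

def upsample_nearest(lowres, factor):
    """Stand-in for interpolation that aligns the low-res grid with the high-res one."""
    return np.repeat(np.repeat(lowres, factor, axis=0), factor, axis=1)

def training_example(x0_hr, lowres, t, T, rng):
    """Build one (model input, training target) pair for denoiser training."""
    a_bar = cosine_alpha_bar(t, T)
    eps = rng.standard_normal(x0_hr.shape)            # noise = training target
    x_t = np.sqrt(a_bar) * x0_hr + np.sqrt(1 - a_bar) * eps
    cond = upsample_nearest(lowres, x0_hr.shape[0] // lowres.shape[0])
    model_input = np.stack([x_t, cond])               # channel-wise concat
    return model_input, eps

rng = np.random.default_rng(0)
hr = rng.standard_normal((16, 16))    # toy high-res wind-speed patch
lr = hr[::4, ::4]                     # toy low-res counterpart
inp, target = training_example(hr, lr, t=100, T=1000, rng=rng)
print(inp.shape, target.shape)        # -> (2, 16, 16) (16, 16)
```

At sampling time, the same conditioning channel is stacked with pure noise and the learned denoiser is applied iteratively from t = T down to 0, so the generated high-resolution field stays consistent with the low-resolution input.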