Intelligent Systems with Applications: Latest Articles

A sentiment analysis approach for understanding users’ perception of metaverse marketplace
Pub Date : 2024-03-19 DOI: 10.1016/j.iswa.2024.200362
Ahmed Al-Adaileh , Mousa Al-Kfairy , Mohammad Tubishat , Omar Alfandi

This research explores the user perceptions of the Metaverse Marketplace, analyzing a substantial dataset of over 860,000 Twitter posts through sentiment analysis and topic modeling techniques. The study aims to uncover the driving factors behind user engagement and sentiment in this novel digital trading space. Key findings highlight a predominantly positive user sentiment, with significant enthusiasm for the marketplace's revenue generation and entertainment potential, particularly within the gaming sector. Users express appreciation for the innovative opportunities the Metaverse Marketplace offers for artists, designers, and traders in handling and trading digital assets. This positive outlook is tempered by notable concerns regarding security and privacy within the Metaverse, pointing to a critical area for development and assurance. The study also reveals a substantial neutral sentiment, reflecting users’ cautious but interested stance, particularly regarding the marketplace's role in investment and passive income opportunities. This balanced view underscores the evolving nature of user perceptions in this emerging field. Theoretically, the research enriches the discourse on technology adoption, particularly in virtual environments, by highlighting perceived benefits and enjoyment as significant adoption drivers. These insights are invaluable for stakeholders in the Metaverse Marketplace, guiding the development of more secure, engaging, and user-friendly platforms. While providing a pioneering perspective on Metaverse user perceptions, the study acknowledges its limitation to Twitter data, suggesting the need for broader research methodologies for a more holistic understanding.
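
The three-way sentiment split described above (positive, neutral, negative) can be illustrated with a minimal sketch. The abstract does not say which sentiment model the study used, so the lexicon-based classifier below is only an assumed stand-in, and the word lists and the function names `classify` and `sentiment_distribution` are hypothetical.

```python
from collections import Counter

# Hypothetical toy lexicons; the study's actual sentiment model is not given in the abstract.
POSITIVE = {"love", "great", "profit", "fun", "exciting"}
NEGATIVE = {"scam", "hack", "risk", "unsafe", "stolen"}


def classify(post: str) -> str:
    """Label a post positive, negative, or neutral by counting lexicon hits."""
    words = [w.strip(".,!?").lower() for w in post.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def sentiment_distribution(posts):
    """Return the share of each sentiment label across a list of posts."""
    labels = ("positive", "neutral", "negative")
    if not posts:
        return {label: 0.0 for label in labels}
    counts = Counter(classify(p) for p in posts)
    return {label: counts[label] / len(posts) for label in labels}
```

Calling `sentiment_distribution` on a list of posts yields the kind of aggregate breakdown the abstract reports; a production pipeline would replace the lexicon with a trained model and add topic modeling on top.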

Intelligent Systems with Applications, Volume 22, Article 200362
Citations: 0
DeLiVoTr: Deep and light-weight voxel transformer for 3D object detection
Pub Date : 2024-03-19 DOI: 10.1016/j.iswa.2024.200361
Gopi Krishna Erabati, Helder Araujo

Image-based backbone (feature extraction) networks downsample feature maps not only to increase the receptive field but also to detect objects of various scales efficiently. Existing feature extraction networks for LiDAR-based 3D object detection follow the same downsampling practice to increase the receptive field. However, downsampling LiDAR feature maps in large-scale autonomous driving scenarios hinders the detection of small objects such as pedestrians. To solve this issue, we design an architecture that maintains both the scale of the feature maps and the receptive field in the feature extraction network, which aids efficient detection of small objects. We use an attention mechanism to build a sufficient receptive field and propose a Deep and Light-weight Voxel Transformer (DeLiVoTr) network with voxel intra- and inter-region transformer modules that extract local and global voxel features, respectively. We introduce a DeLiVoTr block that uses transformations with an expand-and-reduce strategy to vary the width and depth of the network efficiently. This makes it possible to learn wider and deeper voxel representations and to use not only a smaller dimension for the attention mechanism but also a light-weight feed-forward network, which reduces parameters and operations. In addition to model scaling, we employ layer-level scaling of the DeLiVoTr encoder layers to allocate parameters efficiently in each encoder layer, instead of using a fixed number of parameters as in existing approaches. Leveraging layer-level depth and width scaling, we formulate three variants of the DeLiVoTr network. We conduct extensive experiments and analysis on the large-scale Waymo and KITTI datasets. Our network surpasses state-of-the-art methods for detection of small objects (pedestrians) with an inference speed of 20.5 FPS.
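
To make the idea of layer-level depth and width scaling concrete, here is a small sketch of one plausible scheme: linearly interpolating each encoder layer's depth and width multiplier between a minimum and a maximum. The abstract does not give the actual scaling rule, so the linear interpolation, the function name `layer_level_scaling`, and all parameter names are assumptions made for illustration.

```python
def layer_level_scaling(num_layers, min_depth, max_depth, min_width, max_width):
    """Assign each encoder layer its own depth and width multiplier.

    Assumption: values grow linearly from the first layer to the last.
    The paper's exact rule may differ.
    """
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    configs = []
    for b in range(num_layers):
        # Fraction of the way through the stack (0.0 for the first layer, 1.0 for the last).
        t = b / (num_layers - 1) if num_layers > 1 else 0.0
        depth = round(min_depth + (max_depth - min_depth) * t)
        width_mult = min_width + (max_width - min_width) * t
        configs.append({"layer": b, "depth": depth, "width_mult": width_mult})
    return configs
```

Compared with giving every layer the same configuration, a schedule like this spends fewer parameters in some layers and more in others, which is the kind of non-uniform allocation the abstract contrasts with fixed per-layer parameter counts.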

Intelligent Systems with Applications, Volume 22, Article 200361
Citations: 0
Review of ambiguity problem in text summarization using hybrid ACA and SLR
Pub Date : 2024-03-19 DOI: 10.1016/j.iswa.2024.200360
Sutriawan Sutriawan , Supriadi Rustad , Guruh Fajar Shidik , Pujiono Pujiono , Muljono Muljono

Text summarization is the process of creating a summary that contains the important information from a text document. In recent years, the field has made significant progress, alongside challenges that continue to drive research. The growth of textual data has sparked great interest in text summarization research, which this survey reviews thoroughly. Improvements continue to be made with various approaches, such as abstractive and extractive summarization. The abstractive approach uses an intermediate representation of the input document to produce a summary whose wording may differ from the original text. The extractive approach selects key sentences from the source document and combines them to form a summary. Despite the various methodologies and approaches recommended, the summaries produced still contain ambiguities that can be interpreted with different meanings, resulting in errors in defining ambiguities, uncertainty in measuring the quality of summaries, difficulty in modeling linguistic context, difficulty in representing semantic meanings, and difficulty in specifying types of ambiguities. This survey offers a comprehensive exploration of text summarization research, covering challenges, classifications, approaches, preprocessing methods, features, techniques, and evaluation methods, with a view to future research needs. The results provide an overview of the state of the art in ambiguity resolution for text summarization, including trends in research topics and the approaches or techniques used to address ambiguity problems.
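
The extractive approach described above can be sketched in a few lines. This is a deliberately simple word-frequency scorer chosen for illustration, not a method taken from the survey; the function name `extractive_summary` and the scoring rule are assumptions.

```python
import re
from collections import Counter


def extractive_summary(text, k=1):
    """Pick the k highest-scoring sentences and return them in document order."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        # Average document-level frequency of the words in this sentence.
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:k])  # restore original sentence order
    return " ".join(sentences[i] for i in chosen)
```

Because the output is built only from sentences copied out of the source, any ambiguity in a selected sentence (for example, a pronoun whose referent was in a dropped sentence) carries over into the summary, which is one way the ambiguity problem discussed in the survey can arise.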

Intelligent Systems with Applications, Volume 22, Article 200360
Citations: 0
MEFF – A model ensemble feature fusion approach for tackling adversarial attacks in medical imaging
Pub Date : 2024-03-16 DOI: 10.1016/j.iswa.2024.200355
Laith Alzubaidi , Khamael AL–Dulaimi , Huda Abdul-Hussain Obeed , Ahmed Saihood , Mohammed A. Fadhel , Sabah Abdulazeez Jebur , Yubo Chen , A.S. Albahri , Jose Santamaría , Ashish Gupta , Yuantong Gu

Adversarial attacks pose a significant threat to deep learning (DL) models, particularly those applied to medical images, because they can mislead models into making inaccurate predictions by introducing subtle distortions to the input data that are often imperceptible to humans. Although adversarial training is a common technique for mitigating these attacks on medical images, it lacks the flexibility to address new attack methods and to improve feature representation effectively. This paper introduces a novel Model Ensemble Feature Fusion (MEFF) approach designed to combat adversarial attacks in medical image applications. The proposed model performs feature fusion by combining features extracted from different DL models and then trains machine learning classifiers on the fused features. It merges the extracted features by concatenation, forming a more comprehensive representation and enhancing the model's ability to classify accurately. Our experimental study comprehensively evaluates MEFF across several challenging scenarios, including 2D and 3D images, greyscale and colour images, binary classification, and multi-label classification. The reported results demonstrate the robustness of MEFF against different types of adversarial attacks across six distinct medical image applications. A key advantage of MEFF is its capability to incorporate a wide range of adversarial attacks without the need to train from scratch. Therefore, it contributes to developing a more diverse and robust defence strategy. More importantly, by leveraging feature fusion and ensemble modelling, MEFF enhances the resilience of DL models in the face of adversarial attacks, paving the way for improved robustness and reliability in medical image analysis.
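
The fusion step described above (concatenating features from several extractors, then training a classifier on the fused vector) can be sketched as follows. Everything here is a stand-in: `extractor_a` and `extractor_b` are hypothetical placeholders for features taken from different DL models, and the nearest-centroid classifier is an assumed substitute for whichever machine learning classifiers the paper trains.

```python
from statistics import mean


def extractor_a(image):
    """Hypothetical stand-in for features from one DL model."""
    return [mean(image)]


def extractor_b(image):
    """Hypothetical stand-in for features from a second DL model."""
    return [max(image), min(image)]


def fuse_features(image, extractors):
    """Concatenate the feature vectors produced by every extractor."""
    fused = []
    for extract in extractors:
        fused.extend(extract(image))
    return fused


class NearestCentroidClassifier:
    """Toy classifier trained on fused features (assumed, not the paper's choice)."""

    def fit(self, features, labels):
        grouped = {}
        for vec, label in zip(features, labels):
            grouped.setdefault(label, []).append(vec)
        # One centroid per class: the column-wise mean of its fused vectors.
        self.centroids = {
            label: [mean(col) for col in zip(*vecs)] for label, vecs in grouped.items()
        }
        return self

    def predict(self, vec):
        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(vec, centroid))

        return min(self.centroids, key=lambda label: sq_dist(self.centroids[label]))
```

In this arrangement the extractors stay frozen and only the lightweight classifier on top is refit, which is one way to read the abstract's point about handling new attacks without training from scratch.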

Intelligent Systems with Applications, Volume 22, Article 200355
Citations: 0
IoT data sharing technology based on blockchain and federated learning algorithms
Pub Date : 2024-03-16 DOI: 10.1016/j.iswa.2024.200359
Zhiqiang Feng

To share data on Internet of Things devices more securely, accurately, and efficiently, this study designs a layered sharing architecture based on blockchain and federated learning. The architecture achieves efficient and secure Internet of Things data sharing through client node clustering and blockchain consensus processes. In addition, to address the imbalanced distribution of data labels across system devices, a device-clustering federated learning algorithm based on label similarity is designed to improve the accuracy and stability of the model. The experimental results show that, under both independent and identically distributed (IID) and non-IID data, the proposed algorithm achieved 95% accuracy after 30 iterations with relatively low communication cost. In stability tests under non-IID data, accuracy increased with the number of label categories; with M = 12 label categories, accuracy reached 96.0%. In the medical sharing system of a hospital, the proposed system took about 42.9% less time to extract information than the original system while maintaining accuracy above 98%. The method can effectively solve the problem of uneven distribution of device data labels and improve the data transmission efficiency and accuracy of Internet of Things data sharing systems. Moreover, it can reduce the impact of malicious nodes on the global model, providing technical support for data transmission and security protection in other fields.
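
A rough sketch of the two ingredients named above, grouping devices by label similarity and aggregating client models, is given below. The abstract does not specify the similarity measure, the clustering procedure, or the aggregation rule, so the cosine similarity, the greedy threshold grouping, and the sample-weighted averaging (in the style of federated averaging) are all assumptions, as are the function names.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two label-count histograms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_devices(label_histograms, threshold=0.9):
    """Greedily group devices whose label histograms resemble a cluster's first member."""
    clusters = []
    for device_id, hist in label_histograms.items():
        for cluster in clusters:
            representative = label_histograms[cluster[0]]
            if cosine_similarity(hist, representative) >= threshold:
                cluster.append(device_id)
                break
        else:
            # No existing cluster was similar enough, so start a new one.
            clusters.append([device_id])
    return clusters


def federated_average(client_weights, sample_counts):
    """Average client parameter vectors, weighted by each client's sample count."""
    total = sum(sample_counts)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, sample_counts)) / total
        for i in range(dim)
    ]
```

A plausible use is to run `cluster_devices` once on per-device label counts and then apply `federated_average` within and across clusters each round; how the blockchain consensus step fits in is outside this sketch.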

Intelligent Systems with Applications, Volume 22, Article 200359
Citations: 0
Cluster-based oversampling with area extraction from representative points for class imbalance learning
Pub Date : 2024-03-16 DOI: 10.1016/j.iswa.2024.200357
Zakarya Farou , Yizhi Wang , Tomáš Horváth

Class imbalance learning is challenging in various domains where training datasets exhibit disproportionate samples in a specific class. Resampling methods have been used to adjust the class distribution, but they often have limitations for small disjunct minority subsets. This paper introduces AROSS, an adaptive cluster-based oversampling approach that addresses these limitations. AROSS utilizes an optimized agglomerative clustering algorithm with the Cophenetic Correlation Coefficient and the Bayesian Information Criterion to identify representative areas of the minority class. Safe and half-safe areas are obtained using an incremental k-Nearest Neighbor strategy, and oversampling is performed with a truncated hyperspherical Gaussian distribution. Experimental evaluations on 70 binary datasets demonstrate the effectiveness of AROSS in improving class imbalance learning performance, making it a promising solution for mitigating class imbalance challenges, especially for small disjunct minority subsets.
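
The final oversampling step, drawing synthetic minority points from a truncated hyperspherical Gaussian, can be sketched with rejection sampling. This is only an illustration of the distribution named in the abstract: the function name, the default `sigma = radius / 2`, and the use of rejection sampling are assumptions, and the clustering and safe-area extraction that would supply `center` and `radius` are not shown.

```python
import math
import random


def truncated_gaussian_oversample(center, radius, n_samples, sigma=None, seed=0):
    """Draw points from an isotropic Gaussian around `center`, keeping only
    those that fall inside the hypersphere of the given radius."""
    if radius <= 0:
        raise ValueError("radius must be positive")
    rng = random.Random(seed)
    sigma = radius / 2 if sigma is None else sigma  # assumed default spread
    samples = []
    while len(samples) < n_samples:
        candidate = [rng.gauss(c, sigma) for c in center]
        # Truncation: reject candidates outside the safe hypersphere.
        if math.dist(candidate, center) <= radius:
            samples.append(candidate)
    return samples
```

Truncating at the radius keeps synthetic samples inside the region considered safe for the minority class, so they are less likely to land among majority-class points.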

Intelligent Systems with Applications, Volume 22, Article 200357
Citations: 0
Speed meets accuracy: Advanced deep learning for efficient Orientia tsutsugamushi bacteria assessment in RNAi screening
Pub Date : 2024-03-16 DOI: 10.1016/j.iswa.2024.200356
Potjanee Kanchanapiboon , Chuenchat Songsaksuppachok , Porncheera Chusorn , Panrasee Ritthipravat

This study investigates the use of advanced computer vision techniques for assessing the severity of Orientia tsutsugamushi bacterial infectivity. It uses fluorescent scrub typhus images obtained from molecular screening, and addresses challenges posed by a complex and extensive image dataset, with limited computational resources. Our methodology integrates three key strategies within a deep learning framework: transitioning from instance segmentation (IS) models to an object detection model; reducing the model's backbone size; and employing lower-precision floating-point calculations. These approaches were systematically evaluated to strike an optimal balance between model accuracy and inference speed, crucial for effective bacterial infectivity assessment. A significant outcome is that the implementation of the Faster R-CNN architecture, with a shallow backbone and reduced precision, notably improves accuracy and reduces inference time in cell counting and infectivity assessment. This innovative approach successfully addresses the limitations of image processing techniques and IS models, effectively bridging the gap between sophisticated computational methods and modern molecular biology applications. The findings underscore the potential of this integrated approach to enhance the accuracy and efficiency of bacterial infectivity evaluations in molecular research.
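
The third strategy, lower-precision floating-point calculation, trades a small rounding error for half the storage per value. The sketch below shows that trade-off using IEEE 754 half precision through Python's standard `struct` module; it illustrates the general idea only and is not the paper's inference pipeline, and the helper names are hypothetical.

```python
import struct


def round_to_half(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def max_abs_error(values):
    """Largest absolute rounding error introduced by storing values in half precision."""
    return max(abs(v - round_to_half(v)) for v in values)


# Storage per value in bytes: half precision versus single precision.
HALF_BYTES = struct.calcsize("<e")
SINGLE_BYTES = struct.calcsize("<f")
```

Values such as 0.5 or 1.0 survive the round trip exactly, while a value such as 0.1 picks up a small error; whether that error matters for detection accuracy is the empirical question the study evaluates.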

Intelligent Systems with Applications, Volume 22, Article 200356. PDF: https://www.sciencedirect.com/science/article/pii/S2667305324000322/pdfft?md5=2d06cfac57033fbe4635f13bd56c5c03&pid=1-s2.0-S2667305324000322-main.pdf
Citations: 0
Exploration of advancements in handwritten document recognition techniques
Pub Date : 2024-03-15 DOI: 10.1016/j.iswa.2024.200358
Vanita Agrawal , Jayant Jagtap , M.V.V. Prasad Kantipudi

Handwritten document recognition and classification are among the many computer-related problems studied for digitizing handwritten data. A handwritten document comprises text, diagrams, mathematical expressions, numerals, and tables. The variety of writing styles and the intricacy of written language make handwritten material difficult to recognize. As a result, numerous handwritten document recognition systems have been developed, each with its own benefits and drawbacks. The paper reviews the evolution of handwritten document recognition both qualitatively and quantitatively. First, it presents a bibliometric survey of handwritten document recognition in the Scopus database, based on the number of articles, citations, countries, authors, etc. It then surveys the learning techniques used for handwritten documents: text recognition, digit recognition, mathematical expression recognition, table recognition, and diagram recognition. Finally, it presents directions for future research in handwritten document recognition.

Intelligent Systems with Applications, Volume 22, Article 200358. PDF: https://www.sciencedirect.com/science/article/pii/S2667305324000346/pdfft?md5=008f9ee0edb201f02c7d97e969505812&pid=1-s2.0-S2667305324000346-main.pdf
Citations: 0
Tree boosting methods for balanced and imbalanced classification and their robustness over time in risk assessment
Pub Date : 2024-03-12 DOI: 10.1016/j.iswa.2024.200354
Gissel Velarde , Michael Weichert, Anuj Deshmunkh, Sanjay Deshmane, Anindya Sudhir, Khushboo Sharma, Vaibhav Joshi

Most real-world classification problems involve imbalanced datasets, which challenge Artificial Intelligence (AI), i.e., machine learning algorithms, because the minority class, which is of extreme interest, often proves difficult to detect. This paper empirically evaluates the performance of tree boosting methods across different dataset sizes and class distributions, from perfectly balanced to highly imbalanced. For tabular data, tree-based methods such as XGBoost stand out in several benchmarks for their detection performance and speed. Therefore, XGBoost and Imbalance-XGBoost are evaluated. After introducing the motivation for addressing risk assessment with machine learning, the paper reviews evaluation metrics for detection systems or binary classifiers. It proposes a method for data preparation followed by tree boosting methods including hyper-parameter optimization. The method is evaluated on private datasets of 1 thousand (K), 10K and 100K samples with 50, 45, 25, and 5 percent positive samples. As expected, the method's recognition performance increases as more training data is provided, and the F1 score decreases as the data distribution becomes more imbalanced, but it remains significantly superior to the precision-recall baseline, which is determined by the ratio of positives to the sum of positives and negatives. Sampling to balance the training set does not provide consistent improvement and deteriorates detection. In contrast, classifier hyper-parameter optimization improves recognition, but should be applied carefully depending on data volume and distribution. Finally, the developed method is robust to data variation over time up to some point. Retraining can be used when performance starts deteriorating.
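The precision-recall baseline mentioned in the abstract is the fraction of positive samples in the data, and the F1 score is the harmonic mean of precision and recall. A minimal sketch of both quantities follows. The function names and the confusion-matrix counts are hypothetical and are not results from the paper. Only the sample size and the positive percentages are taken from the abstract.

```python
def baseline_precision(n_pos: int, n_neg: int) -> float:
    """Precision-recall baseline: positives / (positives + negatives)."""
    return n_pos / (n_pos + n_neg)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 as the harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Class distributions named in the abstract, for a 1K-sample dataset.
n_samples = 1000
baselines = {}
for positive_share in (0.50, 0.45, 0.25, 0.05):
    n_pos = round(n_samples * positive_share)
    baselines[positive_share] = baseline_precision(n_pos, n_samples - n_pos)
    print(f"{positive_share:.0%} positives -> baseline {baselines[positive_share]:.2f}")

# Hypothetical counts for illustration only, not taken from the paper.
example_f1 = f1_score(tp=40, fp=20, fn=10)
print(f"example F1: {example_f1:.4f}")
```

Because the baseline shrinks with the share of positives, an F1 score that looks modest in absolute terms can still sit well above the baseline on a highly imbalanced dataset.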

Intelligent Systems with Applications, Volume 22, Article 200354. PDF: https://www.sciencedirect.com/science/article/pii/S2667305324000309/pdfft?md5=be6e208c32a749998c8ea1ee56dcab8e&pid=1-s2.0-S2667305324000309-main.pdf
Citations: 0
In-depth investigation of speech emotion recognition studies from past to present –The importance of emotion recognition from speech signal for AI–
Pub Date : 2024-03-11 DOI: 10.1016/j.iswa.2024.200351
Yeşim ÜLGEN SÖNMEZ , Asaf VAROL

In the super smart society (Society 5.0), new and rapid methods are needed in speech recognition, emotion recognition, and speech emotion recognition to maximize human-machine or human-computer interaction and collaboration. The speech signal carries much information about the speaker, such as age, sex, ethnicity, health condition, emotion, and thoughts. The field of study that analyzes a person's mood from speech is called speech emotion recognition (SER). Classifying emotions from speech data is a complicated problem for artificial intelligence and its sub-discipline, machine learning, because the speech signal contains various frequencies and characteristics and is hard to analyze. Speech data are digitized with signal processing methods, and speech features are extracted. These features vary with emotions such as sadness, fear, anger, happiness, boredom, and confusion. Although different methods have been developed for determining audio properties and recognizing emotion, the success rate varies with language, culture, emotion, and data set. Speech emotion recognition needs new methods that can be applied to data sets of different sizes, that increase classification success, that yield the best features, and that are affordable. Success rates are affected by many factors, such as the methods used, the lack of speech emotion datasets, the homogeneity of the database, the difficulty of the language (linguistic differences), the noise in the audio data, and the length of the audio data. This study analyzes in detail the work on emotion recognition from speech signals from past to present. It examines classification studies based on a discrete emotion model that use speech data from the Berlin emotional database (EMO-DB), the Italian emotional speech database (EMOVO), the Surrey audio-visual expressed emotion database (SAVEE), and the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), which are mostly independent of speaker and content. The results of classical classifiers and deep learning methods are compared. Deep learning results are more successful, but classical classification is more important for determining the defining features of speech, song, or voice, and so it advances the feature extraction stage. This study can contribute to the literature and help researchers in the SER field.
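To make the feature extraction step concrete, the sketch below computes two common low-level descriptors, short-time energy and zero-crossing rate, over fixed-length frames. It is an illustration under stated assumptions and not a method from the surveyed studies. The synthetic sine tones stand in for real speech, and the sample rate, frame length, hop size, and function names are all hypothetical choices.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split a signal into overlapping frames of frame_len samples."""
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose sign differs."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def sine(freq_hz, sample_rate, n_samples):
    """Synthetic tone used here in place of recorded speech."""
    return [
        math.sin(2 * math.pi * freq_hz * k / sample_rate)
        for k in range(n_samples)
    ]


SAMPLE_RATE = 8000  # assumed sample rate in Hz
low = sine(100, SAMPLE_RATE, 1600)
high = sine(1000, SAMPLE_RATE, 1600)

low_frames = frame_signal(low, frame_len=400, hop=200)
high_frames = frame_signal(high, frame_len=400, hop=200)

for name, frames in (("100 Hz", low_frames), ("1000 Hz", high_frames)):
    energy = short_time_energy(frames[0])
    zcr = zero_crossing_rate(frames[0])
    print(f"{name}: energy={energy:.3f}, zcr={zcr:.3f}")
```

Both tones have the same energy but very different zero-crossing rates, which shows how per-frame features can separate signals that differ in frequency content. A classifier would take such per-frame values, or statistics over them, as input.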

Intelligent Systems with Applications, Volume 22, Article 200351. PDF: https://www.sciencedirect.com/science/article/pii/S2667305324000279/pdfft?md5=1617124db6cea95a53e38e62a54e8824&pid=1-s2.0-S2667305324000279-main.pdf
Citations: 0