
Big Data and Cognitive Computing: Latest Articles

Toward Morphologic Atlasing of the Human Whole Brain at the Nanoscale
IF 3.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-12-01 DOI: 10.3390/bdcc7040179
W. Nowinski
Although no nanoscale dataset of the entire human brain has yet been acquired, and no nanoscale human whole brain atlas has been constructed, tremendous progress in neuroimaging and high-performance computing makes both feasible in the near future. Constructing a human whole brain nanoscale atlas poses several challenges; here, we address two: modeling brain morphology at the nanoscale and designing a nanoscale brain atlas. A new nanoscale neuronal format is introduced to describe the data necessary and sufficient to model the entire human brain at the nanoscale, enabling calculations of the synaptome and connectome. The design of the nanoscale brain atlas covers design principles, content, architecture, navigation, functionality, and user interface. Three novel design principles are introduced to support navigation, exploration, and calculations: gross neuroanatomy-guided navigation of micro/nanoscale neuroanatomy; a movable and zoomable sampling volume of interest for navigation and exploration; and nanoscale data processing in a parallel-pipeline mode. This mode exploits the parallelism that results from decomposing gross neuroanatomy into structures and regions and nano neuroanatomy into neurons and synapses, enabling the distributed construction and continual enhancement of the nanoscale atlas. Numerous applications of this atlas can be contemplated, ranging from proofreading and continual multi-site extension to exploration, morphometric and network-related analyses, and knowledge discovery. To the best of my knowledge, this is the first proposed nanoscale model of neuronal morphology and the first attempt to design a human whole brain atlas at the nanoscale.
Citations: 0
Managing Cybersecurity Threats and Increasing Organizational Resilience
IF 3.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-11-22 DOI: 10.3390/bdcc7040177
Peter R. J. Trim, Yang-Im Lee
Cyber security is high up on the agenda of senior managers in private and public sector organizations and is likely to remain so for the foreseeable future. [...]
Citations: 0
A New Approach to Data Analysis Using Machine Learning for Cybersecurity
IF 3.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-11-21 DOI: 10.3390/bdcc7040176
Shivashankar Hiremath, Eeshan Shetty, A. J. Prakash, S. Sahoo, Kiran Kumar Patro, Kandala N. V. P. S. Rajesh, Paweł Pławiak
The internet has become an indispensable tool for organizations, permeating every facet of their operations. Virtually all companies leverage Internet services for diverse purposes, including the digital storage of data in databases and cloud platforms. Furthermore, the rising demand for software and applications has led to a widespread shift toward computer-based activities within the corporate landscape. However, this digital transformation has exposed the information technology (IT) infrastructures of these organizations to a heightened risk of cyber-attacks, endangering sensitive data. Consequently, organizations must identify and address vulnerabilities within their systems, with a primary focus on scrutinizing customer-facing websites and applications. This work aims to tackle this pressing issue by employing data analysis tools, such as Power BI, to assess vulnerabilities within a client’s application or website. Through a rigorous analysis of data, valuable insights and information will be provided, which are necessary to formulate effective remedial measures against potential attacks. Ultimately, the central goal of this research is to demonstrate that clients can establish a secure environment, shielding their digital assets from potential attackers.
Citations: 0
Empowering Propaganda Detection in Resource-Restraint Languages: A Transformer-Based Framework for Classifying Hindi News Articles
IF 3.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-11-15 DOI: 10.3390/bdcc7040175
Deptii D. Chaudhari, Ambika Vishal Pawar
Misinformation, fake news, and various propaganda techniques are increasingly used in digital media. It becomes challenging to uncover propaganda as it works with the systematic goal of influencing other individuals for the determined ends. While significant research has been reported on propaganda identification and classification in resource-rich languages such as English, much less effort has been made in resource-deprived languages like Hindi. The spread of propaganda in the Hindi news media has induced our attempt to devise an approach for the propaganda categorization of Hindi news articles. The unavailability of the necessary language tools makes propaganda classification in Hindi more challenging. This study proposes the effective use of deep learning and transformer-based approaches for Hindi computational propaganda classification. To address the lack of pretrained word embeddings in Hindi, Hindi Word2vec embeddings were created using the H-Prop-News corpus for feature extraction. Subsequently, three deep learning models, i.e., CNN (convolutional neural network), LSTM (long short-term memory), Bi-LSTM (bidirectional long short-term memory); and four transformer-based models, i.e., multi-lingual BERT, Distil-BERT, Hindi-BERT, and Hindi-TPU-Electra, were experimented with. The experimental outcomes indicate that the multi-lingual BERT and Hindi-BERT models provide the best performance, with the highest F1 score of 84% on the test data. These results strongly support the efficacy of the proposed solution and indicate its appropriateness for propaganda classification.
Citations: 0
Optimization of Cryptocurrency Algorithmic Trading Strategies Using the Decomposition Approach
IF 3.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-11-14 DOI: 10.3390/bdcc7040174
Sherin M. Omran, Wessam H. El-Behaidy, A. Youssif
A cryptocurrency is a non-centralized form of money that facilitates financial transactions using cryptographic processes. It can be thought of as a virtual currency or a payment mechanism for sending and receiving money online. Cryptocurrencies have gained wide market acceptance and rapid development during the past few years. Due to the volatile nature of the crypto-market, cryptocurrency trading involves a high level of risk. In this paper, a new normalized decomposition-based, multi-objective particle swarm optimization (N-MOPSO/D) algorithm is presented for cryptocurrency algorithmic trading. The aim of this algorithm is to help traders find the best Litecoin trading strategies that improve their outcomes. The proposed algorithm is used to manage the trade-offs among three objectives: the return on investment, the Sortino ratio, and the number of trades. A hybrid weight assignment mechanism has also been proposed. It was compared against the trading rules with their standard parameters, MOPSO/D, using normalized weighted Tchebycheff scalarization, and MOEA/D. The proposed algorithm could outperform the counterpart algorithms for benchmark and real-world problems. Results showed that the proposed algorithm is very promising and stable under different market conditions. It could maintain the best returns and risk during both training and testing with a moderate number of trades.
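As a rough illustration of two ingredients named in the abstract above, the following sketch computes a Sortino ratio and a normalized weighted Tchebycheff scalarization. It is not the authors' N-MOPSO/D implementation. The function names, the zero target return, the use of all observations in the downside-deviation denominator, and the convention that every objective is minimized (so a maximized objective such as return on investment would be negated first) are all assumptions made for this example.

```python
import math
from typing import Sequence


def sortino_ratio(returns: Sequence[float], target: float = 0.0) -> float:
    """Mean excess return over the target, divided by downside deviation.

    Assumption: downside deviation is the root mean square of below-target
    excess returns, averaged over all observations.
    """
    if not returns:
        raise ValueError("returns must be non-empty")
    excess = [r - target for r in returns]
    mean_excess = sum(excess) / len(excess)
    downside = math.sqrt(sum(min(e, 0.0) ** 2 for e in excess) / len(excess))
    if downside == 0.0:
        # No below-target returns: the ratio is unbounded for positive means.
        return math.inf if mean_excess > 0 else 0.0
    return mean_excess / downside


def normalized_tchebycheff(
    objectives: Sequence[float],
    weights: Sequence[float],
    ideal: Sequence[float],
    nadir: Sequence[float],
) -> float:
    """Weighted Tchebycheff value on objectives rescaled to [0, 1].

    Assumption: all objectives are minimized; `ideal` and `nadir` hold the
    best and worst value seen so far for each objective.
    """
    terms = []
    for f, w, z_min, z_max in zip(objectives, weights, ideal, nadir):
        span = z_max - z_min
        normalized = (f - z_min) / span if span > 0 else 0.0
        terms.append(w * abs(normalized))
    # The scalarized cost is the largest weighted distance from the ideal point.
    return max(terms)
```

Under these assumptions, a decomposition-based optimizer would assign each subproblem its own weight vector and prefer the candidate strategy with the smaller scalarized value.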
Citations: 0
The Semantic Adjacency Criterion in Time Intervals Mining
Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-11-09 DOI: 10.3390/bdcc7040173
Alexander Shknevsky, Yuval Shahar, Robert Moskovitch
We propose a new pruning constraint when mining frequent temporal patterns to be used as classification and prediction features, the Semantic Adjacency Criterion [SAC], which filters out temporal patterns that contain potentially semantically contradictory components, exploiting each medical domain’s knowledge. We have defined three SAC versions and tested them within three medical domains (oncology, hepatitis, diabetes) and a frequent-temporal-pattern discovery framework. Previously, we had shown that using SAC enhances the repeatability of discovering the same temporal patterns in similar proportions in different patient groups within the same clinical domain. Here, we focused on SAC’s computational implications for pattern discovery, and for classification and prediction, using the discovered patterns as features, by four different machine-learning methods: Random Forests, Naïve Bayes, SVM, and Logistic Regression. Using SAC resulted in a significant reduction, across all medical domains and classification methods, of up to 97% in the number of discovered temporal patterns, and in the runtime of the discovery process, of up to 98%. Nevertheless, the highly reduced set of only semantically transparent patterns, when used as features, resulted in classification and prediction models whose performance was at least as good as the models resulting from using the complete temporal-pattern set.
Citations: 3
Evaluation of Short-Term Rockburst Risk Severity Using Machine Learning Methods
Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-11-07 DOI: 10.3390/bdcc7040172
Aibing Jin, Prabhat Basnet, Shakil Mahtab
In deep engineering, rockburst hazards frequently result in injuries, fatalities, and the destruction of contiguous structures. Due to the complex nature of rockbursts, predicting the severity of rockburst damage (intensity) without the aid of computer models is challenging. Although there are various predictive models in existence, effectively identifying the risk severity in imbalanced data remains crucial. The ensemble boosting method is often better suited to dealing with unequally distributed classes than are classical models. Therefore, this paper employs the ensemble categorical gradient boosting (CGB) method to predict short-term rockburst risk severity. After data collection, principal component analysis (PCA) was employed to avoid the redundancies caused by multi-collinearity. Afterwards, the CGB was trained on PCA data, optimal hyper-parameters were retrieved using the grid-search technique to predict the test samples, and performance was evaluated using precision, recall, and F1 score metrics. The results showed that the PCA-CGB model achieved better results in prediction than did the single CGB model or conventional boosting methods. The model achieved an F1 score of 0.8952, indicating that the proposed model is robust in predicting damage severity given an imbalanced dataset. This work provides practical guidance in risk management.
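The abstract above evaluates its model with precision, recall, and F1 score on imbalanced classes. A minimal sketch of how these metrics can be computed per class and then averaged is given below. The macro (unweighted) average, the function names, and the severity labels in the usage note are assumptions made for illustration; the abstract does not state which averaging scheme the authors used.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence, Tuple


def per_class_prf(
    y_true: Sequence[Hashable], y_pred: Sequence[Hashable]
) -> Dict[Hashable, Tuple[float, float, float]]:
    """Return {label: (precision, recall, f1)} for every label seen."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this class, but it was wrong
            fn[truth] += 1  # missed the true class
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1, so rare classes count equally."""
    scores = per_class_prf(y_true, y_pred)
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)
```

For example, with hypothetical labels `y_true = ["low", "low", "low", "low", "high", "high"]` and `y_pred = ["low", "low", "low", "high", "high", "low"]`, the majority class scores an F1 of 0.75 and the minority class 0.5, giving a macro F1 of 0.625. Plain accuracy on the same data is about 0.67, which hides the weaker minority-class performance.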
Citations: 0
Social Trend Mining: Lead or Lag
Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-11-07 DOI: 10.3390/bdcc7040171
Hossein Hassani, Nadejda Komendantova, Elena Rovenskaya, Mohammad Reza Yeganegi
This research underscores the profound implications of Social Intelligence Mining, notably employing open access data and Google Search engine data for trend discernment. Utilizing advanced analytical methodologies, including wavelet coherence analysis and phase difference, hidden relationships and patterns within social data were revealed. These techniques furnish an enriched comprehension of social phenomena dynamics, bolstering decision-making processes. The study’s versatility extends across myriad domains, offering insights into public sentiment and the foresight for strategic approaches. The findings suggest immense potential in Social Intelligence Mining to influence strategies, foster innovation, and add value across diverse sectors.
Citations: 0
Arabic Toxic Tweet Classification: Leveraging the AraBERT Model
Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-10-26 DOI: 10.3390/bdcc7040170
Amr Mohamed El Koshiry, Entesar Hamed I. Eliwa, Tarek Abd El-Hafeez, Ahmed Omar
Social media platforms have become the primary means of communication and information sharing, facilitating interactive exchanges among users. Unfortunately, these platforms also witness the dissemination of inappropriate and toxic content, including hate speech and insults. While significant efforts have been made to classify toxic content in the English language, the same level of attention has not been given to Arabic texts. This study addresses this gap by constructing a standardized Arabic dataset specifically designed for toxic tweet classification. The dataset is annotated automatically using Google’s Perspective API and the expertise of three native Arabic speakers and linguists. To evaluate the performance of different models, we conduct a series of experiments using seven models: long short-term memory (LSTM), bidirectional LSTM, a convolutional neural network, a gated recurrent unit (GRU), bidirectional GRU, multilingual bidirectional encoder representations from transformers, and AraBERT. Additionally, we employ word embedding techniques. Our experimental findings demonstrate that the fine-tuned AraBERT model surpasses the performance of other models, achieving an impressive accuracy of 0.9960. Notably, this accuracy value outperforms similar approaches reported in recent literature. This study represents a significant advancement in Arabic toxic tweet classification, shedding light on the importance of addressing toxicity in social media platforms while considering diverse languages and cultures.
Citations: 0
Assessment of Security KPIs for 5G Network Slices for Special Groups of Subscribers
Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-10-26 DOI: 10.3390/bdcc7040169
Roman Odarchenko, Maksim Iavich, Giorgi Iashvili, Solomiia Fedushko, Yuriy Syerov
It is clear that 5G networks have already become integral to our present. However, a significant issue lies in the fact that current 5G communication systems are incapable of fully ensuring the required quality of service and the security of transmitted data, especially in government networks that operate in the context of the Internet of Things, hostilities, hybrid warfare, and cyberwarfare. The use of 5G extends to critical infrastructure operators and special users such as law enforcement, governments, and the military. Adapting modern cellular networks to meet the specific needs of these special users is not only feasible but also necessary. In doing so, these networks must meet additional stringent requirements for reliability, performance, and, most importantly, data security. This scientific paper is dedicated to addressing the challenges associated with ensuring cybersecurity in this context. To effectively improve or ensure a sufficient level of cybersecurity, it is essential to measure the primary indicators of the effectiveness of the security system. At the moment, there are no comprehensive lists of these key indicators that require priority monitoring. Therefore, this article first analyzed the existing similar indicators and presented a list of them, which will make it possible to continuously monitor the state of cybersecurity systems of 5G cellular networks with the aim of using them for groups of special users. Based on this list of cybersecurity KPIs, as a result, this article presents a model to identify and evaluate these indicators. To develop this model, we comprehensively analyzed potential groups of performance indicators, selected the most relevant ones, and introduced a mathematical framework for their quantitative assessment. Furthermore, as part of our research efforts, we proposed enhancements to the core of the 4G/5G network. 
These enhancements enable data collection and statistical analysis through specialized sensors and existing servers, contributing to improved cybersecurity within these networks. Thus, the approach proposed in the article opens up an opportunity for continuous monitoring and, accordingly, improving the performance indicators of cybersecurity systems, which in turn makes it possible to use them for the maintenance of critical infrastructure and other users whose service presents increased requirements for cybersecurity systems.
Citations: 2