
Big Data Mining and Analytics: Latest Articles

Call for Papers: Special Issue on Big Data Computing for Internet of Things and Utility and Cloud Computing
IF 13.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2022-06-09 DOI: 10.26599/BDMA.2022.9020011
Big Data Mining and Analytics, vol. 5, no. 3, p. 270. PDF: https://ieeexplore.ieee.org/iel7/8254253/9793354/09792624.pdf
Citations: 0
$p$-Norm Broad Learning for Negative Emotion Classification in Social Networks
IF 13.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2022-06-09 DOI: 10.26599/BDMA.2022.9020008
Guanghao Chen;Sancheng Peng;Rong Zeng;Zhongwang Hu;Lihong Cao;Yongmei Zhou;Zhouhao Ouyang;Xiangyu Nie
Negative emotion classification refers to the automatic classification of negative emotions in social network texts. Most existing methods are based on deep learning models, which face challenges such as complex structures and too many hyperparameters. To meet these challenges, we propose a method for negative emotion classification that uses a Robustly Optimized BERT Pretraining Approach (RoBERTa) and $p$-norm Broad Learning ($p$-BL). The paper makes three main contributions. First, we fine-tune RoBERTa to adapt it to the task of negative emotion classification, and then employ the fine-tuned RoBERTa to extract features from the original texts and generate sentence vectors. Second, we adopt $p$-BL to construct a classifier and use it to predict the negative emotions of texts. Compared with deep learning models, $p$-BL has a simple structure of only three layers and fewer parameters to train. Moreover, it can suppress the adverse effects of outliers and noise in the data by flexibly changing the value of $p$. Third, we conduct extensive experiments on public datasets, and the results show that the proposed method outperforms the baseline methods on the tested datasets.
Big Data Mining and Analytics, vol. 5, no. 3, pp. 245-256. PDF: https://ieeexplore.ieee.org/iel7/8254253/9793354/09793355.pdf
Citations: 4
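The $p$-BL abstract above states that changing the value of $p$ can suppress the effect of outliers. The following is only a toy illustration of that general idea, not the authors' classifier: it fits a single constant to one-dimensional data by minimizing a $p$-norm loss with iteratively reweighted least squares. The function name `p_norm_location`, the sample data, and the choice of solver are all assumptions made for this sketch.

```python
def p_norm_location(values, p, iterations=200, eps=1e-9):
    """Estimate the c that minimizes sum(|v - c| ** p) by iteratively
    reweighted least squares (a toy stand-in, not the paper's solver)."""
    c = sum(values) / len(values)  # start from the mean, the p = 2 solution
    for _ in range(iterations):
        # Each residual r gets weight |r| ** (p - 2); for p < 2 this
        # shrinks the influence of large residuals such as outliers.
        weights = [max(abs(v - c), eps) ** (p - 2) for v in values]
        c = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return c


# Hypothetical 1-D data: five values near 1.0 plus one outlier.
data = [1.0, 1.1, 0.9, 1.05, 0.95, 10.0]
fit_p2 = p_norm_location(data, p=2.0)  # the mean, pulled toward the outlier
fit_p1 = p_norm_location(data, p=1.0)  # stays close to the bulk of the data
print(fit_p2, fit_p1)
```

With $p = 2$ the estimate is the mean, which the outlier drags away from the bulk of the data. With $p = 1$ the estimate remains near the cluster around 1.0, which is the kind of robustness the abstract attributes to a flexible choice of $p$.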
An Optimized Sanitization Approach for Minable Data Publication
IF 13.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2022-06-09 DOI: 10.26599/BDMA.2022.9020007
Fan Yang;Xiaofeng Liao
Minable data publication is ubiquitous because it benefits the sharing and trading of data among commercial companies and further facilitates the development of data-driven tasks. Unfortunately, minable data are often published by publishers with limited concern for privacy, so the published dataset can also be mined by malicious entities. This hinders minable data publication, since the published data may contain sensitive information. Approaches and technologies for reducing the risk of privacy leakage are therefore urgently needed. To this end, we propose an optimized sanitization approach for minable data publication (named SA-MDP). SA-MDP supports association rule mining while providing privacy protection for specific rules. In SA-MDP, we consider the trade-off between data utility and data privacy in the minable data publication problem. To address this problem, SA-MDP designs a customized particle swarm optimization (PSO) algorithm whose optimization objective is determined by both data utility and data privacy. Specifically, we take advantage of PSO to produce new particles, either by random mutation or by learning from the best particle. Hence, SA-MDP can avoid solutions being trapped in local optima. In addition, we design a fitness function to guide the particles towards the optimal solution, and we present a preprocessing method, applied before the evolution process of the customized PSO algorithm, to improve the convergence rate. Finally, the proposed SA-MDP approach is evaluated on several datasets. The experimental results demonstrate the effectiveness and efficiency of SA-MDP.
Big Data Mining and Analytics, vol. 5, no. 3, pp. 257-269. PDF: https://ieeexplore.ieee.org/iel7/8254253/9793354/09793357.pdf
Citations: 1
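The SA-MDP abstract above describes a swarm whose particles are produced by random mutation or by learning from the best particle, guided by a fitness that balances utility against privacy. The sketch below shows one way such a loop could look on a tiny transaction database. It is not the authors' algorithm: the binary encoding, the penalty-based fitness, the mutation and learning probabilities, and every name (`pso_sanitize`, `victim`, `max_support`) are assumptions chosen for illustration.

```python
import random


def support(transactions, itemset):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def sanitize(transactions, sensitive, victim, particle):
    """Delete `victim` from the i-th supporting transaction when particle[i] is 1."""
    supporting = [i for i, t in enumerate(transactions) if sensitive <= t]
    cleaned = [set(t) for t in transactions]
    for bit, index in zip(particle, supporting):
        if bit:
            cleaned[index].discard(victim)
    return cleaned


def fitness(transactions, sensitive, victim, particle, max_support):
    """Lower is better: a privacy penalty plus the number of deletions."""
    cleaned = sanitize(transactions, sensitive, victim, particle)
    excess = max(0, support(cleaned, sensitive) - max_support)
    penalty = len(transactions) + 1  # outweighs any possible deletion count
    return penalty * excess + sum(particle)


def pso_sanitize(transactions, sensitive, victim, max_support,
                 swarm_size=8, iterations=50, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    n_bits = support(transactions, sensitive)

    def score(particle):
        return fitness(transactions, sensitive, victim, particle, max_support)

    # Seed the swarm with the "delete everywhere" particle so a feasible
    # solution always exists, then add random particles.
    swarm = [[1] * n_bits]
    swarm += [[rng.randint(0, 1) for _ in range(n_bits)]
              for _ in range(swarm_size - 1)]
    best = min(swarm, key=score)
    for _ in range(iterations):
        moved = []
        for particle in swarm:
            if rng.random() < mutation_rate:
                # Random mutation: flip each bit with probability 0.3.
                moved.append([1 - b if rng.random() < 0.3 else b
                              for b in particle])
            else:
                # Learning: copy each bit from the best particle half the time.
                moved.append([g if rng.random() < 0.5 else b
                              for b, g in zip(particle, best)])
        swarm = moved
        best = min(swarm + [best], key=score)
    return best, sanitize(transactions, sensitive, victim, best)


# Hypothetical market-basket data; the rule to hide involves {bread, milk}.
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "milk", "beer"},
    {"milk", "eggs"},
    {"bread", "beer"},
    {"bread", "milk", "eggs", "beer"},
]
sensitive = {"bread", "milk"}
best, cleaned = pso_sanitize(transactions, sensitive, "milk", max_support=1)
print(best, support(cleaned, sensitive))
```

The penalty term makes any particle that leaves the sensitive itemset above `max_support` worse than every particle that hides it, so the search only trades off the number of deletions (a crude proxy for utility loss) among privacy-preserving candidates.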
A Systematic Review Towards Big Data Analytics in Social Media
IF 13.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2022-06-09 DOI: 10.26599/BDMA.2022.9020009
Md. Saifur Rahman;Hassan Reza
The recent advancement in internet 2.0 creates a scope to connect people worldwide using society 2.0 and web 2.0 technologies. This new era allows the consumer to directly connect with other individuals, business corporations, and the government. People are open to sharing opinions, views, and ideas on any topic in different formats out loud. This creates the opportunity to make the “Big Social Data” handy by implementing machine learning approaches and social data analytics. This study offers an overview of recent works in social media, data science, and machine learning to gain a wide perspective on social media big data analytics. We explain why social media data are significant elements of the improved data-driven decision-making process. We propose and build the “Sunflower Model of Big Data” to define big data and bring it up to date with technology by combining 5 V's and 10 Bigs. We discover the top ten social data analytics to work in the domain of social media platforms. A comprehensive list of relevant statistical/machine learning methods to implement each of these big data analytics is discussed in this work. “Text Analytics” is the most used analytics in social data analysis to date. We create a taxonomy on social media analytics to meet the need and provide a clear understanding. Tools, techniques, and supporting data type are also discussed in this research work. As a result, researchers will have an easier time deciding which social data analytics would best suit their needs.
Big Data Mining and Analytics, vol. 5, no. 3, pp. 228-244. PDF: https://ieeexplore.ieee.org/iel7/8254253/9793354/09793356.pdf
Citations: 6
Estimating Intelligence Quotient Using Stylometry and Machine Learning Techniques: A Review
IF 13.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2022-06-09 DOI: 10.26599/BDMA.2022.9020002
Glory O. Adebayo;Roman V. Yampolskiy
The task of trying to quantify a person's intelligence has been a goal of psychologists for over a century. The area of estimating IQ using stylometry has been a developing area of research and the effectiveness of using machine learning in stylometry analysis for the estimation of IQ has been demonstrated in literature whose conclusions suggest that using a large dataset could improve the quality of estimation. The unavailability of large datasets in this area of research has led to very few publications in IQ estimation from written text. In this paper, we review studies that have been done in IQ estimation and also that have been done in author profiling using stylometry and we conclude that based on the success of IQ estimation and author profiling with stylometry, a study on IQ estimation from written text using stylometry will yield good results if the right dataset is used.
Big Data Mining and Analytics, vol. 5, no. 3, pp. 163-191. PDF: https://ieeexplore.ieee.org/iel7/8254253/9793354/09793359.pdf
Citations: 4
News topic detection based on capsule semantic graph
IF 13.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2022-01-25 DOI: 10.26599/BDMA.2021.9020023
Shuang Yang;Yan Tang
Most news topic detection methods use word-based methods, which easily ignore the relationship among words and have semantic sparsity, resulting in low topic detection accuracy. In addition, the current mainstream probability methods and graph analysis methods for topic detection have high time complexity. For these reasons, we present a news topic detection model on the basis of capsule semantic graph (CSG). The keywords that appear in each text at the same time are modeled as a keyword graph, which is divided into multiple subgraphs through community detection. Each subgraph contains a group of closely related keywords. The graph is used as the vertex of CSG. The semantic relationship among the vertices is obtained by calculating the similarity of the average word vector of each vertex. At the same time, the news text is clustered using the incremental clustering method, where each text uses CSG; that is, the similarity among texts is calculated by the graph kernel. The relationship between vertices and edges is also considered when calculating the similarity. Experimental results on three standard datasets show that CSG can obtain higher precision, recall, and F1 values than several latest methods. Experimental results on large-scale news datasets reveal that the time complexity of CSG is lower than that of probabilistic methods and other graph analysis methods.
Big Data Mining and Analytics, vol. 5, no. 2, pp. 98-109. PDF: https://ieeexplore.ieee.org/iel7/8254253/9691293/09691297.pdf
Citations: 3
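The CSG abstract above outlines a pipeline: build a keyword co-occurrence graph, split it into groups of closely related keywords, and relate those groups through the similarity of their average word vectors. A minimal sketch of those steps follows. It makes several simplifying assumptions that are not from the paper: connected components stand in for community detection, the two-dimensional embeddings are invented toy values, and all function names are hypothetical.

```python
import math
from itertools import combinations


def keyword_graph(docs):
    """Link every pair of keywords that co-occur in the same document."""
    graph = {}
    for keywords in docs:
        for word in keywords:
            graph.setdefault(word, set())
        for a, b in combinations(sorted(set(keywords)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def components(graph):
    """Connected components, used here as a crude stand-in for communities."""
    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            node = stack.pop()
            group.append(node)
            for neighbour in sorted(graph[node]):
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        groups.append(sorted(group))
    return groups


def average_vector(words, embeddings):
    """Element-wise mean of the word vectors of a keyword group."""
    vectors = [embeddings[w] for w in words]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical keyword lists, one per news text.
docs = [["election", "vote"], ["vote", "ballot"],
        ["goal", "striker"], ["senate", "bill"]]
# Invented 2-D embeddings; real word vectors would come from a trained model.
embeddings = {
    "election": [1.0, 0.1], "vote": [0.9, 0.2], "ballot": [0.95, 0.05],
    "senate": [0.8, 0.3], "bill": [0.85, 0.25],
    "goal": [0.1, 1.0], "striker": [0.2, 0.9],
}

capsules = components(keyword_graph(docs))
vectors = [average_vector(c, embeddings) for c in capsules]
sim_01 = cosine(vectors[0], vectors[1])  # voting group vs. legislation group
sim_02 = cosine(vectors[0], vectors[2])  # voting group vs. football group
print(capsules, sim_01, sim_02)
```

In this toy setup the voting and legislation groups end up far more similar to each other than either is to the football group, which is the kind of vertex-to-vertex semantic relationship the abstract describes.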
Deep learning in nuclear industry: A survey
IF 13.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2022-01-25 DOI: 10.26599/BDMA.2021.9020027
Chenwei Tang;Caiyang Yu;Yi Gao;Jianming Chen;Jiaming Yang;Jiuling Lang;Chuan Liu;Ling Zhong;Zhenan He;Jiancheng Lv
As a high-tech, strategic, emerging, and comprehensive industry, the nuclear industry is committed to the research, production, and processing of nuclear fuel, as well as the development and utilization of nuclear energy. Nowadays, the nuclear industry has made remarkable progress in application fields such as nuclear weapons, nuclear power, nuclear medical treatment, and radiation processing. With the development of artificial intelligence and the proposal of "Industry 4.0", more and more artificial intelligence technologies are being introduced into the nuclear industry chain to improve production efficiency, reduce operating costs, improve operational safety, and avoid risk. Meanwhile, deep learning, as an important artificial intelligence technology, has made remarkable progress in theoretical and applied research in the nuclear industry, which vigorously promotes the informatization, digitization, and intelligent development of the nuclear industry. In this paper, we first briefly review and analyze the demand scenarios for intelligent technology across the whole industrial chain of the nuclear industry. Then, we discuss the data types involved in the nuclear industry chain. After that, we investigate the research status of deep learning in the application fields corresponding to different data types in the nuclear industry. Finally, we discuss the limitations and unique challenges of deep learning in the nuclear industry and the future direction of the intelligent nuclear industry.
Big Data Mining and Analytics, vol. 5, no. 2, pp. 140-160. PDF: https://ieeexplore.ieee.org/iel7/8254253/9691293/09691301.pdf
Citations: 6
Understanding social relationships with person-pair relations
IF 13.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2022-01-25 DOI: 10.26599/BDMA.2021.9020022
Hang Zhao;Haicheng Chen;Leilai Li;Hai Wan
Social relationship understanding infers the social relationships among individuals in a given scenario, which has been demonstrated to have a wide range of practical value. However, existing methods infer the social relationship of each person pair in isolation, without considering the context-aware information for person pairs in the same scenario. Such context-aware information exists extensively in reality; that is, the social relationships of different person pairs in a simple scenario are always related to each other. For instance, if most of the person pairs in a simple scenario have the same social relationship, "friends", then the other pairs have a high probability of being "friends" or having a similar coarse-level relationship, such as "intimate". This context-aware information should thus be considered in social relationship understanding. Therefore, this paper proposes a novel end-to-end trainable Person-Pair Relation Network (PPRN), a GRU-based graph inference network, which first extracts visual and position information as the person-pair feature information, then propagates it over a fully-connected social graph, and finally uses different aggregators to collect different kinds of person-pair information. Unlike existing methods, the method, with its message passing mechanism in the graph model, can infer the social relationship of each person pair jointly (i.e., not in isolation). Extensive experiments on the People In Social Context (PISC) and People In Photo Album (PIPA) relation datasets show the superiority of our method compared to other methods.
社会关系理解推断出在给定场景中个体之间存在的社会关系,这在现实中具有广泛的实用价值。然而,现有的方法孤立地推断每个人对的社会关系,而没有考虑同一场景中人对的上下文感知信息。人对的上下文感知信息在现实中广泛存在,也就是说,在一个简单的场景中,不同人对的社会关系总是相互关联的。例如,如果在一个简单的场景中,大多数人对都有相同的社会关系,即“朋友”,那么其他人对很有可能是“朋友”或其他类似的粗略关系,如“亲密”。因此,在理解社会关系时,应该考虑这种上下文感知信息。因此,本文提出了一种新的端到端可训练的人对关系网络(PPRN),这是一种基于GRU的图推理网络,它首先提取视觉和位置信息作为人对特征信息,然后使其能够在完全连接的社交图上传递,最后利用不同的聚合器来收集不同类型的人对信息。与现有的方法不同,该方法在图模型中具有消息传递机制,可以以联合的方式(即,不是孤立的)推断每个人对的社会关系。在社交环境中的人(PISC)和相册中的人关系数据集上进行的大量实验表明,与其他方法相比,我们的方法具有优越性。
{"title":"Understanding social relationships with person-pair relations","authors":"Hang Zhao;Haicheng Chen;Leilai Li;Hai Wan","doi":"10.26599/BDMA.2021.9020022","DOIUrl":"https://doi.org/10.26599/BDMA.2021.9020022","url":null,"abstract":"Social relationship understanding infers existing social relationships among individuals in a given scenario, which has been demonstrated to have a wide range of practical value in reality. However, existing methods infer the social relationship of each person pair in isolation, without considering the context-aware information for person pairs in the same scenario. The context-aware information for person pairs exists extensively in reality, that is, the social relationships of different person pairs in a simple scenario are always related to each other. For instance, if most of the person pairs in a simple scenario have the same social relationship, \"friends\", then the other pairs have a high probability of being \"friends\" or other similar coarse-level relationships, such as \"intimate\". This context-aware information should thus be considered in social relationship understanding. Therefore, this paper proposes a novel end-to-end trainable Person-Pair Relation Network (PPRN), which is a GRU-based graph inference network, to first extract the visual and position information as the person-pair feature information, then enable it to transfer on a fully-connected social graph, and finally utilizes different aggregators to collect different kinds of person-pair information. Unlike existing methods, the method—with its message passing mechanism in the graph model—can infer the social relationship of each person-pair in a joint way (i.e., not in isolation). 
Extensive experiments on People In Social Context (PISC)- and People In Photo Album (PIPA)-relation datasets show the superiority of our method compared to other methods.","PeriodicalId":52355,"journal":{"name":"Big Data Mining and Analytics","volume":"5 2","pages":"120-129"},"PeriodicalIF":13.6,"publicationDate":"2022-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8254253/9691293/09691299.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67994284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
A mini-review of machine learning in big data analytics: Applications, challenges, and prospects
IF 13.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2022-01-25 DOI: 10.26599/BDMA.2021.9020028
Isaac Kofi Nti;Juanita Ahia Quarcoo;Justice Aning;Godfred Kusi Fosu
The availability of digital technology to citizens worldwide has made an unprecedented, massive amount of data available. The capability to process these gigantic amounts of data in real time with Big Data Analytics (BDA) tools and Machine Learning (ML) algorithms offers many benefits. However, the large number of free BDA tools, platforms, and data mining tools makes it challenging to select the appropriate one for a given task. This paper presents a comprehensive mini-literature review of ML in BDA. A keyword search identified a total of 1512 published articles, which were screened down to 140 based on the study's proposed novel taxonomy. The study outcome shows that deep neural networks (15%), support vector machines (15%), artificial neural networks (14%), decision trees (12%), and ensemble learning techniques (11%) are widely applied in BDA. The related application fields, challenges, and, most importantly, the openings for future research are detailed.
Big Data Mining and Analytics, vol. 5, no. 2, pp. 81-97. PDF: https://ieeexplore.ieee.org/iel7/8254253/9691293/09691296.pdf
Citations: 24
A novel influence maximization algorithm for a competitive environment based on social media data analytics
IF 13.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2022-01-25 DOI: 10.26599/BDMA.2021.9020024
Jie Tong;Leilei Shi;Lu Liu;John Panneerselvam;Zixuan Han
Online social networks are increasingly connecting people around the world. Influence maximization, which identifies influential users during information dissemination, is a key area of research in online social networks. Most existing influence maximization methods consider transmission over only a single channel, but real-world networks mostly include multiple channels of information transmission with competitive relationships. The problem of influence maximization in a competitive environment involves selecting the seed node set for a given piece of competitive information so that it avoids the influence of other information and ultimately affects the largest set of nodes in the network. In this paper, the influence of nodes is calculated using a local community discovery algorithm based on community dispersion and the characteristics of dynamic community structure. Furthermore, taking two different competitive information dissemination cases as examples, a solution is designed for self-interested information under the assumption that the seed node set of the competing information is known, and a novel node-avoidance influence maximization algorithm based on user interest is proposed. Experiments on a real-world Twitter dataset demonstrate the efficiency of the proposed algorithm in terms of accuracy and time against notable influence maximization algorithms.
Big Data Mining and Analytics, vol. 5, no. 2, pp. 130-139. PDF: https://ieeexplore.ieee.org/iel7/8254253/9691293/09691300.pdf
Citations: 5
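The competitive setting described in the abstract above (choosing seeds for one piece of information when a rival's seed set is already known) can be made concrete with a small sketch. This is not the paper's algorithm: it uses neither community discovery nor user interest. It is a generic greedy baseline with Monte Carlo estimation under an assumed two-campaign independent cascade model, in which both campaigns spread in synchronous rounds, a node keeps the first label that reaches it, and ties go to the rival. The function names, the tie rule, and the parameters `prob` and `runs` are all illustrative assumptions.

```python
import random


def simulate_spread(graph, our_seeds, rival_seeds, prob, rng):
    """Run one two-campaign cascade and return how many nodes end up 'ours'.

    Assumed model (illustrative only): synchronous rounds, each edge fires
    with probability `prob`, a node keeps its first label, rival wins ties.
    `graph` is an adjacency dict that lists every node as a key.
    """
    owner = {node: "rival" for node in rival_seeds}
    for node in our_seeds:
        owner.setdefault(node, "ours")  # a seed the rival already holds stays theirs
    frontier = list(owner)
    while frontier:
        claims = {}
        for node in frontier:
            for nbr in graph.get(node, ()):
                if nbr in owner or rng.random() >= prob:
                    continue  # already labelled, or this edge did not fire
                # The rival overrides a same-round claim; we only take unclaimed nodes.
                if owner[node] == "rival" or nbr not in claims:
                    claims[nbr] = owner[node]
        owner.update(claims)
        frontier = list(claims)
    return sum(1 for label in owner.values() if label == "ours")


def greedy_seeds(graph, rival_seeds, k, prob=0.1, runs=200, seed=0):
    """Greedily pick up to k seeds that maximize our estimated spread."""
    rng = random.Random(seed)
    chosen = []
    candidates = set(graph) - set(rival_seeds)
    for _ in range(k):
        best_node, best_avg = None, -1.0
        for node in sorted(candidates - set(chosen)):
            total = sum(
                simulate_spread(graph, chosen + [node], rival_seeds, prob, rng)
                for _ in range(runs)
            )
            avg = total / runs
            if avg > best_avg:
                best_node, best_avg = node, avg
        if best_node is None:
            break  # no candidates left
        chosen.append(best_node)
    return chosen
```

With `prob=1.0` the cascade is deterministic, which makes the tie rule easy to inspect on a toy graph: a seed that shares a neighbor with the rival loses that neighbor, so the greedy step tends to prefer seeds whose neighborhoods the rival cannot reach first.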