
Big Data Research: Latest Articles

A Real Time Deep Learning Based Approach for Detecting Network Attacks
IF 3.3 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-02-27 | DOI: 10.1016/j.bdr.2024.100446
Christian Callegari, Stefano Giordano, Michele Pagano

Anomaly-based intrusion detection is a key research topic in network security because it can cope with unknown attacks and new security threats. For this reason, many works on the topic have been proposed in the last decade. Nonetheless, an ultimate solution that provides a high detection rate with an acceptable false alarm rate has yet to be identified. In recent years, considerable research effort has focused on applying Deep Learning techniques to the field, but no work has so far proposed a system that achieves good detection performance while processing raw network traffic in real time. For this reason, in this paper we propose an Intrusion Detection System that, by leveraging probabilistic data structures and Deep Learning techniques, processes in real time the traffic collected in a backbone network, offering excellent detection performance and a low false alarm rate. Extensive experimental tests, run to validate our system and to compare different Deep Learning techniques, confirm that, with a proper parameter setting, we can achieve a detection rate of about 92% with an accuracy of 0.899. Finally, with minimal changes, the proposed system can provide some information about the kind of anomaly, although in the multi-class scenario the detection rate is slightly lower (around 86%).
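
The abstract above does not say which probabilistic data structure the system uses. As an illustration only, the sketch below shows a count-min sketch, one common structure for counting per-flow or per-address packets in fixed memory. The class name, the sizes, and the sample addresses are all assumptions made for this example, not details from the paper.

```python
import hashlib


class CountMinSketch:
    """Fixed-size approximate counter; estimates never undercount."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One hash function per row, derived by salting with the row number.
        digest = hashlib.blake2b(
            key.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        for row in range(self.depth):
            self.rows[row][self._index(row, key)] += count

    def estimate(self, key):
        # Collisions can only inflate a counter, so the minimum is the tightest bound.
        return min(
            self.rows[row][self._index(row, key)] for row in range(self.depth)
        )


# Hypothetical source addresses, invented for this example.
sketch = CountMinSketch()
packets = ["10.0.0.1"] * 50 + ["10.0.0.2"] * 3
for src in packets:
    sketch.add(src)

heavy = sketch.estimate("10.0.0.1")
light = sketch.estimate("10.0.0.2")
```

Memory use depends only on `width` and `depth`, not on the number of distinct keys, which is why structures of this kind are attractive for backbone traffic. The trade-off is that estimates can overcount when keys collide.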

Big Data Research, vol. 36, Article 100446. PDF: https://www.sciencedirect.com/science/article/pii/S2214579624000224/pdfft?md5=bbd19915547bc28f9b5784f2f0ddcb21&pid=1-s2.0-S2214579624000224-main.pdf
Citations: 0
An Integration visual navigation algorithm for urban air mobility
IF 3.3 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-02-23 | DOI: 10.1016/j.bdr.2024.100447
Yandong Li, Bo Jiang, Long Zeng, Chenglong Li

This paper presents an integrated visual navigation algorithm, PnP-ORBSLAM, for UAV position estimation in Urban Air Mobility (UAM). ORBSLAM is a popular benchmark algorithm for vision-based navigation applications. The proposed method improves the performance of ORBSLAM by adding a post-processing marker recognition phase to the model. Based on the features extracted from the markers, a Perspective-n-Point (PnP) algorithm is introduced to estimate the position of the monocular camera. Adding the camera position information to the model is expected to improve the accuracy of UAV position estimation. Experiments are carried out on the AirSim simulation platform. Results show that the PnP-ORBSLAM algorithm improves three-dimensional accuracy by 5.38 % compared with ORBSLAM. In addition, the processing speed of the proposed method reaches about 28 frames per second, which means that the PnP-ORBSLAM algorithm can work in real time.
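
A PnP solver estimates camera pose from known 3-D marker points and their observed 2-D pixel positions, typically by minimizing reprojection error. The sketch below shows only that error function, not a solver and not the paper's pipeline. It assumes a pinhole camera with identity rotation, and the focal length, principal point, and marker geometry are invented values.

```python
import math


def project(point, cam_pos, focal=800.0, cx=320.0, cy=240.0):
    """Project a 3-D point to pixels for a camera at cam_pos looking along +Z.

    Rotation is fixed to identity to keep the sketch short.
    """
    x, y, z = (p - c for p, c in zip(point, cam_pos))
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (focal * x / z + cx, focal * y / z + cy)


def reprojection_error(markers_3d, pixels, cam_pos):
    """Mean pixel distance between observed and predicted marker positions."""
    total = 0.0
    for point, (u, v) in zip(markers_3d, pixels):
        pu, pv = project(point, cam_pos)
        total += math.hypot(pu - u, pv - v)
    return total / len(markers_3d)


# Hypothetical 1 m square marker, 5 m in front of the world origin.
corners = [(-0.5, -0.5, 5.0), (0.5, -0.5, 5.0), (0.5, 0.5, 5.0), (-0.5, 0.5, 5.0)]
true_pos = (0.2, -0.1, 0.0)
observed = [project(c, true_pos) for c in corners]

err_true = reprojection_error(corners, observed, true_pos)        # correct pose
err_off = reprojection_error(corners, observed, (0.0, 0.0, 0.0))  # wrong pose
```

The candidate position that reproduces the observed pixels gives zero error, while a position that is off by a few tens of centimetres gives an error of tens of pixels. A real solver searches over both rotation and translation to find the pose with the lowest error.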

Big Data Research, vol. 36, Article 100447.
Citations: 0
Investigating Influence of Google-Play Application Titles on Success
IF 3.3 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-02-21 | DOI: 10.1016/j.bdr.2024.100443
Ahmad Bilal, Hamid Turab Mirza, Ibrar Hussain, Adnan Ahmad

The title (name) is the primary piece of information about a mobile (smartphone) application, as it describes the application's functions and services. An eye-catching title can entice customers to choose a certain application over others. Application development companies are well aware of this and invest significant effort in crafting titles with compelling keywords, phrases and topics in pursuit of higher installs. However, to the best of our knowledge, the literature investigating the impact of application titles on success is limited, and only a few studies appear to have examined titles with scientific (data-analytical) approaches. Moreover, these investigations are dominated by supervised learning, and the literature may lack unsupervised (cluster) data analysis techniques for measuring the impact of titles on application success. Therefore, this work proposes an unsupervised data analysis approach based on multiple layers and algorithms. The initial layer clusters the application titles, the subsequent layer extracts various textual features from these clusters, and the final layer refines the extracted attributes. In general, certain textual features in the titles are shown to be positively or negatively linked with application installs. Verification of the results confirms that the proposed approach can successfully detect the most prominent features of application titles (textual data) that correlate with success.

Big Data Research, vol. 36, Article 100443.
Citations: 0
Scholar's Career Switch from Academia to Industry: Mining and Analysis from AMiner
IF 3.3 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-02-19 | DOI: 10.1016/j.bdr.2024.100441
Zhou Shao, Sha Yuan, Yinyu Jin, Yongli Wang

Scholars switching their careers from academia to industry has become increasingly prevalent. This paper proposes an approach that combines bibliometric analysis and data mining to study the phenomenon from the perspective of the Science of Science (SciSci). Based on the proposed methods, the paper first provides an overview of the companies and universities that appear most frequently, as well as the exponentially increasing number of scholars making the switch. The study then uses frequent pattern mining on scholars' papers to uncover the excessively uniform patterns in South Korean scholars' switches. It also studies knowledge and technology transfer (KTT) and the change in scholars' research using a language model; the results show that the research difference between industry and academia has gradually decreased and reached a steady state in recent years. As for the driving factors of the phenomenon, deep prior cooperation may be an essential reason; career switches do not increase the number of papers published but may benefit academic influence. This study should therefore be of value to researchers wishing to study academia-industry career switches in more depth.
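
Frequent pattern mining, as mentioned in the abstract above, looks for item combinations that occur in at least a minimum number of records. The sketch below is a brute-force illustration of that idea and not the paper's method. Production miners such as Apriori or FP-growth prune the search instead of enumerating every combination. The records and the names `univ_A`, `company_X`, and so on are invented.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return itemsets (up to max_size items) seen in >= min_support transactions."""
    counts = Counter()
    for items in transactions:
        unique = sorted(set(items))  # sort so each itemset has one canonical tuple
        for size in range(1, max_size + 1):
            counts.update(combinations(unique, size))
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Invented career-switch records: one (university, company) pair per scholar.
switches = [
    ["univ_A", "company_X"],
    ["univ_A", "company_X"],
    ["univ_B", "company_X"],
    ["univ_A", "company_Y"],
]
patterns = frequent_itemsets(switches, min_support=2)
```

With a minimum support of 2, only `univ_A`, `company_X`, and the pair of the two survive. A result dominated by one such pair is the kind of outcome the abstract calls an excessively uniform pattern.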

Big Data Research, vol. 36, Article 100441.
Citations: 0
Interactive big data visualization and analytics
IF 3.3 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-02-14 | DOI: 10.1016/j.bdr.2024.100445
David Auber, Nikos Bikakis, Panos K. Chrysanthis, George Papastefanatos, Mohamed Sharaf
Big Data Research, vol. 36, Article 100445.
Citations: 0
A big data driven vegetation disease and pest region identification method based on self supervised convolutional neural networks and parallel extreme learning machines
IF 3.3 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-02-13 | DOI: 10.1016/j.bdr.2024.100444
Bo Jiang, Hao Wang, Hanxu Ma

A self-supervised convolutional neural network and parallel extreme learning machine classification model based on big data is proposed to address the subjectivity and inaccuracy of traditional methods for identifying vegetation pests and diseases, which rely on manual observation and empirical judgment. The model is built from convolutional neural networks and parallel extreme learning machines, and integrates feature extraction networks with dual attention mechanisms to improve the accuracy of pest and disease identification. Trained on a large volume of data, the model achieves a recall of 98.42 % on multispectral datasets and an overall classification accuracy of 99.04 %. After the residual network is optimized, the overall accuracy of identifying vegetation pest and disease areas improves further to 99.77 %, and the recall reaches 98.91 %. These results indicate that the proposed method is accurate and efficient in big data applications, can meet the needs of disease and pest identification, and provides effective technical support for monitoring and preventing crop diseases and pests, which is of practical significance.
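
The abstract reports recall and overall accuracy separately because the two measure different things. As a reminder of the standard definitions, the short sketch below computes both from confusion-matrix counts. The counts are invented for illustration and are unrelated to the paper's experiments.

```python
def recall_and_accuracy(tp, fp, tn, fn):
    """Recall = share of true positives found; accuracy = share of all cases correct."""
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return recall, accuracy


# Invented counts: 100 diseased regions, 105 healthy regions.
recall, accuracy = recall_and_accuracy(tp=90, fp=5, tn=100, fn=10)
```

Here recall is 90/100 while accuracy is 190/205, so a model can score higher on one than on the other depending on how its errors are split between missed detections and false alarms.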

Big Data Research, vol. 36, Article 100444.
Citations: 0
Knowledge Distillation via Token-Level Relationship Graph Based on the Big Data Technologies
IF 3.3 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-02-12 | DOI: 10.1016/j.bdr.2024.100438
Shuoxi Zhang, Hanpeng Liu, Kun He

In the big data era, characterized by vast volumes of complex data, the efficiency of machine learning models is of utmost importance, particularly in the context of intelligent agriculture. Knowledge distillation (KD), a technique aimed at both model compression and performance enhancement, serves as a pivotal solution by distilling the knowledge from an elaborate model (teacher) to a lightweight, compact counterpart (student). However, the true potential of KD has not been fully explored. Existing approaches primarily focus on transferring instance-level information by big data technologies, overlooking the valuable information embedded in token-level relationships, which may be particularly affected by the long-tail effects. To address the above limitations, we propose a novel method called Knowledge Distillation with Token-level Relationship Graph (TRG) that leverages token-wise relationships to enhance the performance of knowledge distillation. By employing TRG, the student model can effectively emulate higher-level semantic information from the teacher model, resulting in improved performance and mobile-friendly efficiency. To further enhance the learning process, we introduce a dynamic temperature adjustment strategy, which encourages the student model to capture the topology structure of the teacher model more effectively. We conduct experiments to evaluate the effectiveness of the proposed method against several state-of-the-art approaches. Empirical results demonstrate the superiority of TRG across various visual tasks, including those involving imbalanced data. Our method consistently outperforms the existing baselines, establishing a new state-of-the-art performance in the field of KD based on big data technologies.
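
For context, the sketch below shows the classic instance-level distillation loss, in which the student matches the teacher's temperature-softened output distribution. It is not the TRG loss or the dynamic temperature strategy from the abstract above, neither of which the abstract specifies. The logits and the fixed temperature of 4.0 are assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature; higher T gives a flatter output."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return kl * temperature ** 2


# Invented logits for a three-class example.
teacher = [4.0, 1.0, 0.5]
student = [2.5, 1.5, 0.2]

loss_same = distillation_loss(teacher, teacher)  # identical outputs
loss_diff = distillation_loss(teacher, student)  # student disagrees with teacher
```

The loss is zero when the student reproduces the teacher's distribution and positive otherwise. A loss of this kind compares one sample at a time, which is the instance-level transfer that the abstract contrasts with token-level relationships.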

Big Data Research, vol. 36, Article 100438.
Citations: 0
Attentive Implicit Relation Embedding for Event Recommendation in Event-Based Social Network
IF 3.3 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-02-05 | DOI: 10.1016/j.bdr.2024.100426
Yuan Liang

The event-based social network (EBSN) is a new type of social network that combines online and offline networks, and its primary goal is to recommend appropriate events to users. Most studies neither model event recommendation on EBSN platforms as graph representation learning nor consider the implicit relationships between events, which results in recommendations that users do not accept. We therefore study graph representation learning that integrates implicit relationships between social networks and events. First, we propose an algorithm that integrates these implicit relationships based on a multiple attention model. The graph structure is divided into user modeling and event modeling. User modeling captures the interaction information between users and events, user social relationships, and implicit relationships between users. Event modeling captures user information and implicit relationships between events. High-level transfer relationships between users and events are also deeply mined. Then, the user and event models are fused using a multi-attention joint learning mechanism to capture the different impacts of social and implicit relationships on user preferences, improving the quality of the recommendations. Finally, the effectiveness of the proposed algorithm is verified on real datasets.

Big Data Research, vol. 36, Article 100426.
Citations: 0
Chlorophyll-a concentration variations in Bohai sea: Impacts of environmental complexity and human activities based on remote sensing technologies
IF 3.3 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-02-03 | DOI: 10.1016/j.bdr.2024.100440
Yong Du, Xiaoyu Zhang, Shuchang Ma, Nan Yao

This study extensively explores the intricate dynamics of the Bohai Sea ecosystem, a semi-closed marginal sea in China, influenced by both environmental complexity and human activities. By utilizing chlorophyll-a as an indicator, we closely examine how phytoplankton responds to coastal environmental conditions and stressors. The temporal analysis conducted over the 23-year period from 1998 to 2020 reveals a distinctive "bell-shaped" variation in chlorophyll-a concentration. Spatially, a declining trend is observed from coastal to central regions, characterized by widespread low-value areas. Employing M-K and slope trend analyses, we observe a 42.13 % decline in the northern Bohai Sea, contrasting with a significant 57.87 % increase in the central and southern regions. The innovative aspects of this research lie in identifying the complex interplay between chlorophyll-a concentration, human pollution controls, and nutrient inputs. Factors contributing to chlorophyll-a concentration, ranked by significance, include sea surface temperature, photosynthetically available radiation (PAR), and wind speed. Remarkably, the negligible impact of the "2015 Tianjin explosion" underscores the robustness of the Bohai Sea's chlorophyll-a dynamics. Furthermore, the positive correlation between phosphorus input and chlorophyll classifies Bohai Bay as a phosphorus-limited aquatic ecosystem. In conclusion, this study provides crucial insights for the preservation of the Bohai Sea ecosystem, emphasizing the necessity for ongoing monitoring and management strategies in the face of evolving environmental and anthropogenic influences.
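
The abstract's "M-K and slope trend analyses" are read here as the Mann-Kendall trend test paired with a Sen-style median slope. That reading is an assumption, since the abstract does not spell the names out. The sketch computes only the Mann-Kendall S statistic and the median pairwise slope, omits the significance test, and uses an invented series rather than the study's data.

```python
from itertools import combinations
from statistics import median


def mann_kendall_s(series):
    """Mann-Kendall S: number of increasing pairs minus number of decreasing pairs."""
    s = 0
    for earlier, later in combinations(series, 2):
        s += (later > earlier) - (later < earlier)
    return s


def sen_slope(series):
    """Median of all pairwise slopes, assuming equally spaced time steps."""
    slopes = [
        (series[j] - series[i]) / (j - i)
        for i, j in combinations(range(len(series)), 2)
    ]
    return median(slopes)


# Invented annual values, for illustration only.
values = [2.0, 2.4, 2.3, 2.9, 3.1]
s_stat = mann_kendall_s(values)
slope = sen_slope(values)
```

A positive S with a positive median slope indicates an upward trend, and a negative pair indicates a decline. Applying the two per pixel is one way a map could be split into increasing and decreasing regions, as in the northern versus central and southern contrast the abstract reports.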

Citations: 0
Tropical cyclone trajectory based on satellite remote sensing prediction and time attention mechanism ConvLSTM model
IF 3.3 Zone 3, Computer Science Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-02-03 DOI: 10.1016/j.bdr.2024.100439
Tongfei Li, Mingzheng Lai, Shixian Nie, Haifeng Liu, Zhiyao Liang, Wei Lv

The accurate and timely prediction of tropical cyclones is of paramount importance in mitigating the impact of these catastrophic meteorological events. Presently, methods for predicting tropical cyclones based on satellite remote sensing images encounter notable challenges, including the inadequate extraction of three-dimensional spatial features and limitations in long-term forecasting. As a response to these challenges, this study introduces the Temporal Attention Mechanism ConvLSTM (TAM-CL) model, designed to conduct thorough spatiotemporal feature extraction on three-dimensional atmospheric reanalysis data of tropical cyclones. By leveraging ConvLSTM with three-dimensional convolution kernels, our model enhances the extraction of three-dimensional spatiotemporal features. Furthermore, an attention mechanism is integrated to bolster long-term prediction accuracy by emphasizing crucial temporal nodes. In the evaluation of tropical cyclone track and intensity forecasts across 24, 48, and 72 h, TAM-CL demonstrates a notable reduction in prediction errors, thereby underscoring its efficacy in forecasting both cyclone tracks and intensities. This contributes to an effective exploration of the application of deep networks in conjunction with atmospheric reanalysis data.

Citations: 0