Efficient out-of-distribution detection via layer-adaptive scoring and early stopping.
Pub Date: 2024-11-20 | eCollection Date: 2024-01-01 | DOI: 10.3389/fdata.2024.1444634
Haoliang Wang, Chen Zhao, Feng Chen
Introduction: Multi-layer aggregation is key to the success of out-of-distribution (OOD) detection in deep neural networks. Moreover, in real-time systems, the efficiency of OOD detection is as important as its effectiveness.
Methods: We propose a novel early stopping OOD detection framework for deep neural networks. By attaching multiple OOD detectors to the intermediate layers, this framework can detect OODs early to save computational cost. Additionally, through a layer-adaptive scoring function, it can adaptively select the optimal layer for each OOD based on its complexity, thereby improving OOD detection accuracy.
Results: Extensive experiments demonstrate that our proposed framework is robust against OODs of varying complexity. Adopting the early stopping strategy can increase OOD detection efficiency by up to 99.1% while maintaining superior accuracy.
Discussion: OODs of varying complexity are better detected at different layers. Leveraging the intrinsic characteristics of inputs encoded in the intermediate latent space is important for achieving high OOD detection accuracy. Our proposed framework, incorporating early stopping, significantly enhances OOD detection efficiency without compromising accuracy, making it practical for real-time applications.
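As a hedged illustration of the early-exit mechanism this abstract describes, the sketch below attaches a detector to each intermediate layer and stops inference as soon as one flags the input as OOD; the toy detectors and per-layer thresholds are assumptions, not the paper's layer-adaptive scoring function.

```python
import numpy as np

def early_stopping_ood_score(x, layer_scorers, thresholds):
    """Run intermediate OOD detectors in layer order; exit as soon
    as one is confident the input is OOD, skipping deeper layers."""
    for depth, (scorer, tau) in enumerate(zip(layer_scorers, thresholds)):
        score = scorer(x)
        if score > tau:          # confident OOD: stop early and
            return score, depth  # save the remaining computation
    return score, len(layer_scorers) - 1  # ran the full network

# Toy detectors standing in for the per-layer scorers (assumed).
rng = np.random.default_rng(0)
scorers = [lambda x, b=b: float(np.linalg.norm(x) + b)
           for b in rng.normal(0.0, 0.1, size=4)]
thresholds = [9.0, 6.0, 4.0, 2.0]  # illustrative, not from the paper
score, exit_layer = early_stopping_ood_score(rng.normal(size=8), scorers, thresholds)
print(f"OOD score {score:.2f}, exited at layer {exit_layer}")
```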
{"title":"Efficient out-of-distribution detection via layer-adaptive scoring and early stopping.","authors":"Haoliang Wang, Chen Zhao, Feng Chen","doi":"10.3389/fdata.2024.1444634","DOIUrl":"10.3389/fdata.2024.1444634","url":null,"abstract":"<p><strong>Introduction: </strong>Multi-layer aggregation is key to the success of out-of-distribution (OOD) detection in deep neural networks. Moreover, in real-time systems, the efficiency of OOD detection is equally important as its effectiveness.</p><p><strong>Methods: </strong>We propose a novel early stopping OOD detection framework for deep neural networks. By attaching multiple OOD detectors to the intermediate layers, this framework can detect OODs early to save computational cost. Additionally, through a layer-adaptive scoring function, it can adaptively select the optimal layer for each OOD based on its complexity, thereby improving OOD detection accuracy.</p><p><strong>Results: </strong>Extensive experiments demonstrate that our proposed framework is robust against OODs of varying complexity. Adopting the early stopping strategy can increase OOD detection efficiency by up to 99.1% while maintaining superior accuracy.</p><p><strong>Discussion: </strong>OODs of varying complexity are better detected at different layers. Leveraging the intrinsic characteristics of inputs encoded in the intermediate latent space is important for achieving high OOD detection accuracy. Our proposed framework, incorporating early stopping, significantly enhances OOD detection efficiency without compromising accuracy, making it practical for real-time applications.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"7 ","pages":"1444634"},"PeriodicalIF":2.4,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11615063/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142781820","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Credibility-based knowledge graph embedding for identifying social brand advocates.
Pub Date: 2024-11-20 | eCollection Date: 2024-01-01 | DOI: 10.3389/fdata.2024.1469819
Bilal Abu-Salih, Salihah Alotaibi, Manaf Al-Okaily, Mohammed Aljaafari, Muder Almiani
Brand advocates, characterized by their enthusiasm for promoting a brand without incentives, play a crucial role in driving positive word-of-mouth (WOM) and influencing potential customers. However, there is a notable lack of intelligent systems capable of accurately identifying online advocates based on their social interactions with brands. Knowledge Graphs (KGs) offer structured and factual representations of human knowledge, providing a potential solution to gain holistic insights into customer preferences and interactions with a brand. This study presents a novel framework that leverages KG construction and embedding techniques to identify brand advocates accurately. By harnessing the power of KGs, our framework enhances the accuracy and efficiency of identifying and understanding brand advocates, providing valuable insights into customer advocacy dynamics in the online realm. Moreover, we address the critical aspect of social credibility, which significantly influences the impact of advocacy efforts. Incorporating social credibility analysis into our framework allows businesses to identify and mitigate spammers, preserving authenticity and customer trust. To achieve this, we incorporate and extend DSpamOnto, a specialized ontology designed to identify social spam, with a focus on the social commerce domain. Additionally, we employ cutting-edge embedding techniques to map the KG into a low-dimensional vector space, enabling effective link prediction, clustering, and visualization. Through a rigorous evaluation process, we demonstrate the effectiveness and performance of our proposed framework, highlighting its potential to empower businesses in cultivating brand advocates and driving meaningful customer engagement strategies.
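The abstract does not name the embedding model used, so as an illustration the sketch below scores candidate (user, advocates_for, brand) links with a TransE-style translation distance, one common way to map a KG into a low-dimensional vector space for link prediction; the entity tables and the relation name are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 32

# Illustrative entity/relation tables; a real system would train
# these embeddings on the constructed brand-interaction KG.
entities = {name: rng.normal(size=DIM) for name in ["user_a", "user_b", "brand_x"]}
relations = {"advocates_for": rng.normal(size=DIM)}

def transe_score(head, relation, tail):
    """TransE plausibility: smaller ||h + r - t|| means the
    triple (head, relation, tail) is more likely a true link."""
    return -np.linalg.norm(entities[head] + relations[relation] - entities[tail])

# Rank candidate advocates for brand_x by predicted link score.
candidates = ["user_a", "user_b"]
ranked = sorted(candidates,
                key=lambda u: transe_score(u, "advocates_for", "brand_x"),
                reverse=True)
print("most likely advocate:", ranked[0])
```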
{"title":"Credibility-based knowledge graph embedding for identifying social brand advocates.","authors":"Bilal Abu-Salih, Salihah Alotaibi, Manaf Al-Okaily, Mohammed Aljaafari, Muder Almiani","doi":"10.3389/fdata.2024.1469819","DOIUrl":"10.3389/fdata.2024.1469819","url":null,"abstract":"<p><p>Brand advocates, characterized by their enthusiasm for promoting a brand without incentives, play a crucial role in driving positive word-of-mouth (WOM) and influencing potential customers. However, there is a notable lack of intelligent systems capable of accurately identifying online advocates based on their social interactions with brands. Knowledge Graphs (KGs) offer structured and factual representations of human knowledge, providing a potential solution to gain holistic insights into customer preferences and interactions with a brand. This study presents a novel framework that leverages KG construction and embedding techniques to identify brand advocates accurately. By harnessing the power of KGs, our framework enhances the accuracy and efficiency of identifying and understanding brand advocates, providing valuable insights into customer advocacy dynamics in the online realm. Moreover, we address the critical aspect of social credibility, which significantly influences the impact of advocacy efforts. Incorporating social credibility analysis into our framework allows businesses to identify and mitigate spammers, preserving authenticity and customer trust. To achieve this, we incorporate and extend DSpamOnto, a specialized ontology designed to identify social spam, with a focus on the social commerce domain. Additionally, we employ cutting-edge embedding techniques to map the KG into a low-dimensional vector space, enabling effective link prediction, clustering, and visualization. Through a rigorous evaluation process, we demonstrate the effectiveness and performance of our proposed framework, highlighting its potential to empower businesses in cultivating brand advocates and driving meaningful customer engagement strategies.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"7 ","pages":"1469819"},"PeriodicalIF":2.4,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11614760/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142781816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ActiveReach: an active learning framework for approximate reachability query answering in large-scale graphs.
Pub Date: 2024-11-19 | eCollection Date: 2024-01-01 | DOI: 10.3389/fdata.2024.1427104
Zohreh Raghebi, Farnoush Banaei-Kashani
With a graph reachability query, one can answer whether there exists a path between two query vertices in a given graph. Existing reachability query processing solutions use traditional reachability index structures and can only compute exact answers, which may take a long time to resolve in large graphs. In contrast, an approximate reachability query offers a compromise, enabling users to strike a trade-off between query time and the accuracy of the query result. In this study, we propose a framework, dubbed ActiveReach, for learning index structures to answer approximate reachability queries. ActiveReach is a two-phase framework that focuses on embedding nodes in a reachability space. In the first phase, we leverage node attributes and positional information to create reachability-aware embeddings for each node. These embeddings are then used as node attributes in the second phase. In the second phase, we incorporate the new attributes and include reachability information as labels in the training data to generate embeddings in a reachability space. However, computing reachability for all training data may not be practical, so effectively selecting a subset of node pairs for which to compute reachability, while still enhancing reachability prediction performance, is challenging. ActiveReach addresses this challenge by employing an active learning approach in the second phase to selectively compute reachability for a subset of node pairs, thus learning the approximate reachability for the entire graph. Our extensive experimental study with various real attributed large-scale graphs demonstrates the effectiveness of each component of our framework.
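A minimal sketch of the second-phase loop as described: train a predictor on the pairs labeled so far, pick the node pairs it is least certain about, compute their exact reachability with the (expensive) oracle, and repeat. The uncertainty-sampling criterion and the logistic-regression learner are assumptions, not details from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def active_reachability(pairs_X, oracle, n_rounds=3, batch=16, seed=0):
    """pairs_X: one feature vector per node pair (e.g., concatenated
    reachability-aware embeddings from phase one).
    oracle(i): exact reachability label for pair i -- the expensive
    graph traversal we want to invoke as rarely as possible."""
    rng = np.random.default_rng(seed)
    labeled = list(rng.choice(len(pairs_X), size=batch, replace=False))
    y = {i: oracle(i) for i in labeled}
    model = LogisticRegression()
    for _ in range(n_rounds):
        model.fit(pairs_X[labeled], [y[i] for i in labeled])
        proba = model.predict_proba(pairs_X)[:, 1]
        # Uncertainty sampling: label the pairs closest to p = 0.5.
        order = np.argsort(np.abs(proba - 0.5))
        new = [i for i in order if i not in y][:batch]
        y.update({i: oracle(i) for i in new})
        labeled += new
    return model  # approximates reachability for all remaining pairs

# Toy usage: synthetic pair features with a sign-based "reachability".
X = np.random.default_rng(1).normal(size=(200, 8))
model = active_reachability(X, oracle=lambda i: int(X[i, 0] > 0))
```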
{"title":"ActiveReach: an active learning framework for approximate reachability query answering in large-scale graphs.","authors":"Zohreh Raghebi, Farnoush Banaei-Kashani","doi":"10.3389/fdata.2024.1427104","DOIUrl":"10.3389/fdata.2024.1427104","url":null,"abstract":"<p><p>With graph reachability query, one can answer whether there exists a path between two query vertices in a given graph. The existing reachability query processing solutions use traditional reachability index structures and can only compute exact answers, which may take a long time to resolve in large graphs. In contrast, with an approximate reachability query, one can offer a compromise by enabling users to strike a trade-off between query time and the accuracy of the query result. In this study, we propose a framework, dubbed ActiveReach, for learning index structures to answer approximate reachability query. ActiveReach is a two-phase framework that focuses on embedding nodes in a reachability space. In the first phase, we leverage node attributes and positional information to create reachability-aware embeddings for each node. These embeddings are then used as nodes' attributes in the second phase. In the second phase, we incorporate the new attributes and include reachability information as labels in the training data to generate embeddings in a reachability space. In addition, computing reachability for all training data may not be practical. Therefore, selecting a subset of data to compute reachability effectively and enhance reachability prediction performance is challenging. ActiveReach addresses this challenge by employing an active learning approach in the second phase to selectively compute reachability for a subset of node pairs, thus learning the approximate reachability for the entire graph. Our extensive experimental study with various real attributed large-scale graphs demonstrates the effectiveness of each component of our framework.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"7 ","pages":"1427104"},"PeriodicalIF":2.4,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11611874/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142774795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Camera-view supervision for bird's-eye-view semantic segmentation.
Pub Date: 2024-11-15 | eCollection Date: 2024-01-01 | DOI: 10.3389/fdata.2024.1431346
Bowen Yang, LinLin Yu, Feng Chen
Bird's-eye-view Semantic Segmentation (BEVSS) is a powerful and crucial component of planning and control systems in many autonomous vehicles. Current methods rely on end-to-end learning to train models, leading to indirectly supervised and inaccurate camera-to-BEV projections. We propose a novel method of supervising feature extraction with camera-view depth and segmentation information, which improves the quality of feature extraction and projection in the BEVSS pipeline. Our model, evaluated on the nuScenes dataset, shows a 3.8% improvement in Intersection-over-Union (IoU) for vehicle segmentation and a 30-fold reduction in depth error compared to baselines, while maintaining competitive inference times of 32 FPS. This method offers more accurate and reliable BEVSS for real-time autonomous driving systems. The code and implementation details can be found at https://github.com/bluffish/sucam.
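A hedged sketch of the training signal this abstract describes: the main BEV segmentation loss is combined with auxiliary camera-view depth and segmentation losses on the intermediate features. The loss weights and the binning of depth into discrete classes are assumptions; the authors' actual implementation is in the linked repository.

```python
import torch
import torch.nn.functional as F

def bevss_loss(bev_logits, bev_gt, cam_depth_logits, cam_depth_bins,
               cam_seg_logits, cam_seg_gt, w_depth=1.0, w_seg=1.0):
    """Main BEV segmentation loss plus auxiliary camera-view depth
    (discretized into bins) and segmentation terms; the weights and
    depth binning are illustrative assumptions."""
    loss_bev = F.cross_entropy(bev_logits, bev_gt)
    loss_depth = F.cross_entropy(cam_depth_logits, cam_depth_bins)
    loss_seg = F.cross_entropy(cam_seg_logits, cam_seg_gt)
    return loss_bev + w_depth * loss_depth + w_seg * loss_seg

# Shapes: (batch, classes, H, W) logits with (batch, H, W) integer targets.
loss = bevss_loss(torch.randn(2, 3, 8, 8), torch.randint(0, 3, (2, 8, 8)),
                  torch.randn(2, 10, 8, 8), torch.randint(0, 10, (2, 8, 8)),
                  torch.randn(2, 5, 8, 8), torch.randint(0, 5, (2, 8, 8)))
```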
{"title":"Camera-view supervision for bird's-eye-view semantic segmentation.","authors":"Bowen Yang, LinLin Yu, Feng Chen","doi":"10.3389/fdata.2024.1431346","DOIUrl":"https://doi.org/10.3389/fdata.2024.1431346","url":null,"abstract":"<p><p>Bird's-eye-view Semantic Segmentation (BEVSS) is a powerful and crucial component of planning and control systems in many autonomous vehicles. Current methods rely on end-to-end learning to train models, leading to indirectly supervised and inaccurate camera-to-BEV projections. We propose a novel method of supervising feature extraction with camera-view depth and segmentation information, which improves the quality of feature extraction and projection in the BEVSS pipeline. Our model, evaluated on the nuScenes dataset, shows a 3.8% improvement in Intersection-over-Union (IoU) for vehicle segmentation and a 30-fold reduction in depth error compared to baselines, while maintaining competitive inference times of 32 FPS. This method offers more accurate and reliable BEVSS for real-time autonomous driving systems. The codes and implementation details and code can be found at https://github.com/bluffish/sucam.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"7 ","pages":"1431346"},"PeriodicalIF":2.4,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11604745/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142774796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cultural big data: nineteenth to twenty-first century panoramic visualization.
Pub Date: 2024-11-08 | eCollection Date: 2024-01-01 | DOI: 10.3389/fdata.2024.1309887
Tsz Kin Chau, Paul Bourke, Lily Hibberd, Daniel Jaquet, Sarah Kenderdine
From the nineteenth-century panorama to the emergence of the digital panoramic format in the 1990s, the visualization of large images has frequently relied on panoramic viewing strategies. Originally rendered in the form of epic painted canvases, these strategies are now amplified through gigapixel imaging, computer vision, and machine learning. Whether for scientific analysis, dissemination, or the visualization of cultural big data, panoramic strategies pivot on the illusion of immersion. The latter is achieved through human-centered design situated within a large-scale environment, combined with a multi-sensory experience spanning sight, sound, touch, and smell. In this article, we present the original research undertaken to realize a digital twin of the 1894 panorama of the Battle of Murten. Following a brief history of the panorama, we delineate the methods and technological framework developed for the Murten panorama's visualization. Novel visualization methodologies are further discussed, including how to create the illusion of immersion for the world's largest image of a single physical object and its cultural big data. We also present the visualization strategies developed to augment the layered narratives and histories embedded in the final interactive viewing experience of the Murten panorama. This article offers researchers in heritage big data new schemas for the visualization and augmentation of gigapixel images in digital panoramas.
{"title":"Cultural big data: nineteenth to twenty-first century panoramic visualization.","authors":"Tsz Kin Chau, Paul Bourke, Lily Hibberd, Daniel Jaquet, Sarah Kenderdine","doi":"10.3389/fdata.2024.1309887","DOIUrl":"10.3389/fdata.2024.1309887","url":null,"abstract":"<p><p>From the nineteenth-century panorama to the emergence of the digital panoramic format in the 1990's, the visualization of large images frequently relies on panoramic viewing strategies. Originally rendered in the form of epic painted canvases, these strategies are now amplified through gigapixel imaging, computer vision and machine learning. Whether for scientific analysis, dissemination, or to visualize cultural big data, panoramic strategies pivot on the illusion of immersion. The latter is achieved through human-centered design situated within a large-scale environment combined with a multi-sensory experience spanning sight, sound, touch, and smell. In this article, we present the original research undertaken to realize a digital twin of the 1894 panorama of the battle of Murten. Following a brief history of the panorama, the methods and technological framework systems developed for Murten panorama's visualization are delineated. Novel visualization methodologies are further discussed, including how to create the illusion of immersion for the world's largest image of a single physical object and its cultural big data. We also present the visualization strategies developed for the augmentation of the layered narratives and histories embedded in the final interactive viewing experience of the Murten panorama. This article offers researchers in heritage big data new schemas for the visualization and augmentation of gigapixel images in digital panoramas.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"7 ","pages":"1309887"},"PeriodicalIF":2.4,"publicationDate":"2024-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11581890/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142711591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cybermycelium: a reference architecture for domain-driven distributed big data systems.
Pub Date: 2024-11-05 | eCollection Date: 2024-01-01 | DOI: 10.3389/fdata.2024.1448481
Pouya Ataei
Introduction: The ubiquity of digital devices, today's infrastructure, and the ever-increasing proliferation of digital products have ushered in a new era: the era of big data (BD). This era began when the volume, variety, and velocity of data overwhelmed the traditional systems used to analyze and store that data, precipitating a new class of software systems, namely BD systems. While BD systems provide a competitive advantage to businesses, many have failed to harness their power: it has been estimated that only 20% of companies have successfully implemented a BD project.
Methods: This study aims to facilitate BD system development by introducing Cybermycelium, a domain-driven decentralized BD reference architecture (RA). The artifact was developed following the guidelines of empirically grounded RAs and evaluated through implementation in a real-world scenario using the Architecture Tradeoff Analysis Method (ATAM).
Results: The evaluation revealed that Cybermycelium successfully addressed key architectural qualities: performance (achieving <1,000 ms response times), availability (through event brokers and circuit breaking), and modifiability (enabling rapid service deployment and configuration). The prototype demonstrated effective handling of data processing, scalability challenges, and domain-specific requirements in a large-scale international company setting.
Discussion: The results highlight important architectural trade-offs between event backbone implementation and service mesh design. While the domain-driven distributed approach improved scalability and maintainability compared to traditional monolithic architectures, it requires significant technical expertise for implementation. This contribution advances the field by providing a validated reference architecture that addresses the challenges of adopting BD in modern enterprises.
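As a hedged illustration of the availability tactics the results name (event brokers and circuit breaking), the sketch below shows a minimal circuit breaker that a service in such an architecture might wrap around calls to the event backbone; the failure threshold and reset timeout are illustrative assumptions, not values from the Cybermycelium evaluation.

```python
import time

class CircuitBreaker:
    """Open the circuit after `max_failures` consecutive errors,
    fail fast while open, then allow a trial call after `reset_after`
    seconds (the half-open state)."""
    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures, self.reset_after = max_failures, reset_after
        self.failures, self.opened_at = 0, None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: permit one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # re-open the circuit
            raise
        self.failures = 0  # success closes the circuit again
        return result
```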
{"title":"Cybermycelium: a reference architecture for domain-driven distributed big data systems.","authors":"Pouya Ataei","doi":"10.3389/fdata.2024.1448481","DOIUrl":"https://doi.org/10.3389/fdata.2024.1448481","url":null,"abstract":"<p><strong>Introduction: </strong>The ubiquity of digital devices, the infrastructure of today, and the ever-increasing proliferation of digital products have dawned a new era, the era of big data (BD). This era began when the volume, variety, and velocity of data overwhelmed traditional systems that used to analyze and store that data. This precipitated a new class of software systems, namely, BD systems. Whereas BD systems provide a competitive advantage to businesses, many have failed to harness the power of them. It has been estimated that only 20% of companies have successfully implemented a BD project.</p><p><strong>Methods: </strong>This study aims to facilitate BD system development by introducing Cybermycelium, a domain-driven decentralized BD reference architecture (RA). The artifact was developed following the guidelines of empirically grounded RAs and evaluated through implementation in a real-world scenario using the Architecture Tradeoff Analysis Method (ATAM).</p><p><strong>Results: </strong>The evaluation revealed that Cybermycelium successfully addressed key architectural qualities: performance (achieving <1,000 ms response times), availability (through event brokers and circuit breaking), and modifiability (enabling rapid service deployment and configuration). The prototype demonstrated effective handling of data processing, scalability challenges, and domain-specific requirements in a large-scale international company setting.</p><p><strong>Discussion: </strong>The results highlight important architectural trade-offs between event backbone implementation and service mesh design. While the domain-driven distributed approach improved scalability and maintainability compared to traditional monolithic architectures, it requires significant technical expertise for implementation. This contribution advances the field by providing a validated reference architecture that addresses the challenges of adopting BD in modern enterprises.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"7 ","pages":"1448481"},"PeriodicalIF":2.4,"publicationDate":"2024-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11573557/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142677536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cognitive warfare: a conceptual analysis of the NATO ACT cognitive warfare exploratory concept.
Pub Date: 2024-11-01 | eCollection Date: 2024-01-01 | DOI: 10.3389/fdata.2024.1452129
Christoph Deppe, Gary S Schaal
This study evaluates NATO ACT's cognitive warfare concept from a political science perspective, exploring its utility beyond military applications. Despite its growing presence in scholarly discourse, the concept's interdisciplinary nature has hindered a unified definition. By analyzing NATO's framework, developed with input from diverse disciplines and both military and civilian researchers, this paper seeks to assess its applicability to political science. It aims to bridge military and civilian research divides and refine NATO's cognitive warfare approach, offering significant implications for enhancing political science research and fostering integrated scholarly collaboration.
{"title":"Cognitive warfare: a conceptual analysis of the NATO ACT cognitive warfare exploratory concept.","authors":"Christoph Deppe, Gary S Schaal","doi":"10.3389/fdata.2024.1452129","DOIUrl":"https://doi.org/10.3389/fdata.2024.1452129","url":null,"abstract":"<p><p>This study evaluates NATO ACT's cognitive warfare concept from a political science perspective, exploring its utility beyond military applications. Despite its growing presence in scholarly discourse, the concept's interdisciplinary nature has hindered a unified definition. By analyzing NATO's framework, developed with input from diverse disciplines and both military and civilian researchers, this paper seeks to assess its applicability to political science. It aims to bridge military and civilian research divides and refine NATO's cognitive warfare approach, offering significant implications for enhancing political science research and fostering integrated scholarly collaboration.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"7 ","pages":"1452129"},"PeriodicalIF":2.4,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11565700/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142649508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An enhanced whale optimization algorithm for task scheduling in edge computing environments.
Pub Date: 2024-10-30 | eCollection Date: 2024-01-01 | DOI: 10.3389/fdata.2024.1422546
Li Han, Shuaijie Zhu, Haoyang Zhao, Yanqiang He
The widespread use of mobile devices and compute-intensive applications has increased the number of smart devices connected to networks, generating significant volumes of data. Real-time execution faces challenges due to limited resources and demanding applications in edge computing environments. To address these challenges, an enhanced whale optimization algorithm (EWOA) was proposed for task scheduling. A multi-objective model based on CPU, memory, time, and resource utilization was developed. The model was transformed into a whale optimization problem, incorporating chaotic mapping to initialize populations and prevent premature convergence. A nonlinear convergence factor was introduced to balance local and global search. The algorithm's performance was evaluated in an experimental edge computing environment and compared with the ODTS, WOA, HWACO, and CATSA algorithms. Experimental results demonstrated that EWOA reduced costs by 29.22%, decreased completion time by 17.04%, and improved node resource utilization by 9.5%. While EWOA offers significant advantages, limitations include the lack of consideration for potential network delays and user mobility. Future research will focus on fault-tolerant scheduling techniques to address dynamic user needs and improve service robustness and quality.
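The two modifications named in the abstract lend themselves to a short sketch. Below is a hedged illustration assuming a logistic chaotic map for population initialization and a cosine-shaped decay for WOA's convergence factor; the paper's exact map and decay curve are not specified in this abstract.

```python
import numpy as np

def chaotic_population(n_whales, dim, lower, upper, x0=0.7, mu=4.0):
    """Logistic map x_{k+1} = mu * x_k * (1 - x_k) spreads the initial
    whales over the search space more evenly than uniform random draws,
    which helps prevent premature convergence."""
    pop = np.empty((n_whales, dim))
    x = x0
    for i in range(n_whales):
        for j in range(dim):
            x = mu * x * (1.0 - x)
            pop[i, j] = lower + x * (upper - lower)
    return pop

def convergence_factor(t, t_max):
    """Nonlinear decay of WOA's 'a' from 2 to 0: slow early decay favors
    global exploration, fast late decay favors local exploitation."""
    return 2.0 * np.cos(np.pi * t / (2.0 * t_max))

print(chaotic_population(3, 2, -5.0, 5.0))
print([round(convergence_factor(t, 100), 3) for t in (0, 50, 100)])
```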
{"title":"An enhanced whale optimization algorithm for task scheduling in edge computing environments.","authors":"Li Han, Shuaijie Zhu, Haoyang Zhao, Yanqiang He","doi":"10.3389/fdata.2024.1422546","DOIUrl":"10.3389/fdata.2024.1422546","url":null,"abstract":"<p><p>The widespread use of mobile devices and compute-intensive applications has increased the connection of smart devices to networks, generating significant data. Real-time execution faces challenges due to limited resources and demanding applications in edge computing environments. To address these challenges, an enhanced whale optimization algorithm (EWOA) was proposed for task scheduling. A multi-objective model based on CPU, memory, time, and resource utilization was developed. The model was transformed into a whale optimization problem, incorporating chaotic mapping to initialize populations and prevent premature convergence. A nonlinear convergence factor was introduced to balance local and global search. The algorithm's performance was evaluated in an experimental edge computing environment and compared with ODTS, WOA, HWACO, and CATSA algorithms. Experimental results demonstrated that EWOA reduced costs by 29.22%, decreased completion time by 17.04%, and improved node resource utilization by 9.5%. While EWOA offers significant advantages, limitations include the lack of consideration for potential network delays and user mobility. Future research will focus on fault-tolerant scheduling techniques to address dynamic user needs and improve service robustness and quality.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"7 ","pages":"1422546"},"PeriodicalIF":2.4,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11557405/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142631928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Promoting fairness in link prediction with graph enhancement.
Pub Date: 2024-10-24 | eCollection Date: 2024-01-01 | DOI: 10.3389/fdata.2024.1489306
Yezi Liu, Hanning Chen, Mohsen Imani
Link prediction is a crucial task in network analysis, but it has been shown to be prone to biased predictions, particularly when links are unfairly predicted between nodes from different sensitive groups. In this paper, we study the fair link prediction problem, which aims to ensure that the predicted link probability is independent of the sensitive attributes of the connected nodes. Existing methods typically incorporate debiasing techniques within graph embeddings to mitigate this issue. However, training on large real-world graphs is already challenging, and adding fairness constraints can further complicate the process. To overcome this challenge, we propose FairLink, a method that learns a fairness-enhanced graph to bypass the need for debiasing during the link predictor's training. FairLink maintains link prediction accuracy by ensuring that the enhanced graph follows a training trajectory similar to that of the original input graph. Meanwhile, it enhances fairness by minimizing the absolute difference in link probabilities between node pairs within the same sensitive group and those between node pairs from different sensitive groups. Our extensive experiments on multiple large-scale graphs demonstrate that FairLink not only promotes fairness but also often achieves link prediction accuracy comparable to baseline methods. Most importantly, the enhanced graph exhibits strong generalizability across different GNN architectures. FairLink is highly scalable, making it suitable for deployment in real-world large-scale graphs, where maintaining both fairness and accuracy is critical.
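The fairness objective described above translates directly into a small computation: the absolute gap between the predicted link probabilities for intra-group node pairs and those for inter-group pairs. The sketch below assumes probabilities and group ids arrive as flat tensors, and aggregating by the mean is our assumption.

```python
import torch

def fairness_gap(link_probs, groups_u, groups_v):
    """link_probs: predicted probabilities for candidate edges (u, v).
    groups_u / groups_v: sensitive-group ids of each edge's endpoints.
    Returns |E[p | same group] - E[p | different group]|, the quantity
    FairLink is described as minimizing."""
    same = groups_u == groups_v
    return (link_probs[same].mean() - link_probs[~same].mean()).abs()

p = torch.tensor([0.9, 0.8, 0.4, 0.3])
gu = torch.tensor([0, 1, 0, 1])
gv = torch.tensor([0, 1, 1, 0])
print(fairness_gap(p, gu, gv))  # |mean(0.9, 0.8) - mean(0.4, 0.3)| = 0.5
```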
{"title":"Promoting fairness in link prediction with graph enhancement.","authors":"Yezi Liu, Hanning Chen, Mohsen Imani","doi":"10.3389/fdata.2024.1489306","DOIUrl":"https://doi.org/10.3389/fdata.2024.1489306","url":null,"abstract":"<p><p>Link prediction is a crucial task in network analysis, but it has been shown to be prone to biased predictions, particularly when links are unfairly predicted between nodes from different sensitive groups. In this paper, we study the fair link prediction problem, which aims to ensure that the predicted link probability is independent of the sensitive attributes of the connected nodes. Existing methods typically incorporate debiasing techniques within graph embeddings to mitigate this issue. However, training on large real-world graphs is already challenging, and adding fairness constraints can further complicate the process. To overcome this challenge, we propose FairLink, a method that learns a fairness-enhanced graph to bypass the need for debiasing during the link predictor's training. FairLink maintains link prediction accuracy by ensuring that the enhanced graph follows a training trajectory similar to that of the original input graph. Meanwhile, it enhances fairness by minimizing the absolute difference in link probabilities between node pairs within the same sensitive group and those between node pairs from different sensitive groups. Our extensive experiments on multiple large-scale graphs demonstrate that FairLink not only promotes fairness but also often achieves link prediction accuracy comparable to baseline methods. Most importantly, the enhanced graph exhibits strong generalizability across different GNN architectures. FairLink is highly scalable, making it suitable for deployment in real-world large-scale graphs, where maintaining both fairness and accuracy is critical.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"7 ","pages":"1489306"},"PeriodicalIF":2.4,"publicationDate":"2024-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11540639/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142607383","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Exploring code portability solutions for HEP with a particle tracking test code.
Pub Date: 2024-10-23 | eCollection Date: 2024-01-01 | DOI: 10.3389/fdata.2024.1485344
Hammad Ather, Sophie Berkman, Giuseppe Cerati, Matti J Kortelainen, Ka Hei Martin Kwok, Steven Lantz, Seyong Lee, Boyana Norris, Michael Reid, Allison Reinsvold Hall, Daniel Riley, Alexei Strelchenko, Cong Wang
Traditionally, high energy physics (HEP) experiments have relied on x86 CPUs for the majority of their significant computing needs. As the field looks ahead to the next generation of experiments such as DUNE and the High-Luminosity LHC, computing demands are expected to increase dramatically. To cope with this increase, it will be necessary to take advantage of all available computing resources, including GPUs from different vendors. A broad landscape of code portability tools, including compiler pragma-based approaches, abstraction libraries, and other tools, allows the same source code to run efficiently on multiple architectures. In this paper, we use a test code taken from a HEP tracking algorithm to compare the performance of, and the experience of implementing, different portability solutions. While in several cases portable implementations perform close to the reference code version, we find that the performance varies significantly depending on the details of the implementation. Achieving optimal performance is not easy, even for relatively simple applications such as the test codes considered in this work. Several factors can affect performance, such as the choice of memory layout, the memory pinning strategy, and the compiler used. The compilers and tools are being actively developed, so future developments may be critical for their deployment in HEP experiments.
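As a hedged illustration of one factor the paper highlights, the memory layout choice, the snippet below contrasts an array-of-structs layout with a struct-of-arrays layout for track data; the NumPy framing and the field names are illustrative and not taken from the test code.

```python
import numpy as np

N = 1_000_000
rng = np.random.default_rng(0)

# Array-of-structs: each track's fields are interleaved in memory,
# so a field-wise read strides across 16-byte records.
aos = np.zeros(N, dtype=[("x", "f4"), ("y", "f4"), ("z", "f4"), ("pt", "f4")])
aos["pt"] = rng.exponential(1.0, N).astype("f4")

# Struct-of-arrays: each field is contiguous, which typically caches
# and vectorizes better for field-wise kernels such as a pt cut.
soa = {name: np.zeros(N, dtype="f4") for name in ("x", "y", "z")}
soa["pt"] = aos["pt"].copy()

n_pass_aos = int((aos["pt"] > 1.0).sum())  # strided access
n_pass_soa = int((soa["pt"] > 1.0).sum())  # contiguous access
assert n_pass_aos == n_pass_soa
```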