Efficient out-of-distribution detection via layer-adaptive scoring and early stopping.
Pub Date: 2024-11-20 | eCollection Date: 2024-01-01 | DOI: 10.3389/fdata.2024.1444634
Haoliang Wang, Chen Zhao, Feng Chen
Introduction: Multi-layer aggregation is key to the success of out-of-distribution (OOD) detection in deep neural networks. Moreover, in real-time systems, the efficiency of OOD detection is as important as its effectiveness.
Methods: We propose a novel early stopping OOD detection framework for deep neural networks. By attaching multiple OOD detectors to the intermediate layers, this framework can detect OODs early to save computational cost. Additionally, through a layer-adaptive scoring function, it can adaptively select the optimal layer for each OOD based on its complexity, thereby improving OOD detection accuracy.
Results: Extensive experiments demonstrate that our proposed framework is robust against OODs of varying complexity. Adopting the early stopping strategy can increase OOD detection efficiency by up to 99.1% while maintaining superior accuracy.
Discussion: OODs of varying complexity are better detected at different layers. Leveraging the intrinsic characteristics of inputs encoded in the intermediate latent space is important for achieving high OOD detection accuracy. Our proposed framework, incorporating early stopping, significantly enhances OOD detection efficiency without compromising accuracy, making it practical for real-time applications.
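As a hedged illustration of the early-exit mechanism this abstract describes, the sketch below attaches a detector to each intermediate layer and stops inference as soon as one flags the input as OOD; the toy detectors and per-layer thresholds are assumptions, not the paper's layer-adaptive scoring function.

```python
import numpy as np

def early_stopping_ood_score(x, layer_scorers, thresholds):
    """Run intermediate OOD detectors in layer order; exit as soon
    as one is confident the input is OOD, skipping deeper layers."""
    for depth, (scorer, tau) in enumerate(zip(layer_scorers, thresholds)):
        score = scorer(x)
        if score > tau:          # confident OOD: stop early and
            return score, depth  # save the remaining computation
    return score, len(layer_scorers) - 1  # ran the full network

# Toy detectors standing in for the per-layer scorers (assumed).
rng = np.random.default_rng(0)
scorers = [lambda x, b=b: float(np.linalg.norm(x) + b)
           for b in rng.normal(0.0, 0.1, size=4)]
thresholds = [9.0, 6.0, 4.0, 2.0]  # illustrative, not from the paper
score, exit_layer = early_stopping_ood_score(rng.normal(size=8), scorers, thresholds)
print(f"OOD score {score:.2f}, exited at layer {exit_layer}")
```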
{"title":"Efficient out-of-distribution detection via layer-adaptive scoring and early stopping.","authors":"Haoliang Wang, Chen Zhao, Feng Chen","doi":"10.3389/fdata.2024.1444634","DOIUrl":"10.3389/fdata.2024.1444634","url":null,"abstract":"<p><strong>Introduction: </strong>Multi-layer aggregation is key to the success of out-of-distribution (OOD) detection in deep neural networks. Moreover, in real-time systems, the efficiency of OOD detection is equally important as its effectiveness.</p><p><strong>Methods: </strong>We propose a novel early stopping OOD detection framework for deep neural networks. By attaching multiple OOD detectors to the intermediate layers, this framework can detect OODs early to save computational cost. Additionally, through a layer-adaptive scoring function, it can adaptively select the optimal layer for each OOD based on its complexity, thereby improving OOD detection accuracy.</p><p><strong>Results: </strong>Extensive experiments demonstrate that our proposed framework is robust against OODs of varying complexity. Adopting the early stopping strategy can increase OOD detection efficiency by up to 99.1% while maintaining superior accuracy.</p><p><strong>Discussion: </strong>OODs of varying complexity are better detected at different layers. Leveraging the intrinsic characteristics of inputs encoded in the intermediate latent space is important for achieving high OOD detection accuracy. Our proposed framework, incorporating early stopping, significantly enhances OOD detection efficiency without compromising accuracy, making it practical for real-time applications.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"7 ","pages":"1444634"},"PeriodicalIF":2.4,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11615063/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142781820","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Credibility-based knowledge graph embedding for identifying social brand advocates.
Pub Date: 2024-11-20 | eCollection Date: 2024-01-01 | DOI: 10.3389/fdata.2024.1469819
Bilal Abu-Salih, Salihah Alotaibi, Manaf Al-Okaily, Mohammed Aljaafari, Muder Almiani
Brand advocates, characterized by their enthusiasm for promoting a brand without incentives, play a crucial role in driving positive word-of-mouth (WOM) and influencing potential customers. However, there is a notable lack of intelligent systems capable of accurately identifying online advocates based on their social interactions with brands. Knowledge Graphs (KGs) offer structured and factual representations of human knowledge, providing a potential solution to gain holistic insights into customer preferences and interactions with a brand. This study presents a novel framework that leverages KG construction and embedding techniques to identify brand advocates accurately. By harnessing the power of KGs, our framework enhances the accuracy and efficiency of identifying and understanding brand advocates, providing valuable insights into customer advocacy dynamics in the online realm. Moreover, we address the critical aspect of social credibility, which significantly influences the impact of advocacy efforts. Incorporating social credibility analysis into our framework allows businesses to identify and mitigate spammers, preserving authenticity and customer trust. To achieve this, we incorporate and extend DSpamOnto, a specialized ontology designed to identify social spam, with a focus on the social commerce domain. Additionally, we employ cutting-edge embedding techniques to map the KG into a low-dimensional vector space, enabling effective link prediction, clustering, and visualization. Through a rigorous evaluation process, we demonstrate the effectiveness and performance of our proposed framework, highlighting its potential to empower businesses in cultivating brand advocates and driving meaningful customer engagement strategies.
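The abstract does not name the embedding model used, so as an illustration the sketch below scores candidate (user, advocates_for, brand) links with a TransE-style translation distance, one common way to map a KG into a low-dimensional vector space for link prediction; the entity tables and the relation name are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 32

# Illustrative entity/relation tables; a real system would train
# these embeddings on the constructed brand-interaction KG.
entities = {name: rng.normal(size=DIM) for name in ["user_a", "user_b", "brand_x"]}
relations = {"advocates_for": rng.normal(size=DIM)}

def transe_score(head, relation, tail):
    """TransE plausibility: smaller ||h + r - t|| means the
    triple (head, relation, tail) is more likely a true link."""
    return -np.linalg.norm(entities[head] + relations[relation] - entities[tail])

# Rank candidate advocates for brand_x by predicted link score.
candidates = ["user_a", "user_b"]
ranked = sorted(candidates,
                key=lambda u: transe_score(u, "advocates_for", "brand_x"),
                reverse=True)
print("most likely advocate:", ranked[0])
```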
{"title":"Credibility-based knowledge graph embedding for identifying social brand advocates.","authors":"Bilal Abu-Salih, Salihah Alotaibi, Manaf Al-Okaily, Mohammed Aljaafari, Muder Almiani","doi":"10.3389/fdata.2024.1469819","DOIUrl":"10.3389/fdata.2024.1469819","url":null,"abstract":"<p><p>Brand advocates, characterized by their enthusiasm for promoting a brand without incentives, play a crucial role in driving positive word-of-mouth (WOM) and influencing potential customers. However, there is a notable lack of intelligent systems capable of accurately identifying online advocates based on their social interactions with brands. Knowledge Graphs (KGs) offer structured and factual representations of human knowledge, providing a potential solution to gain holistic insights into customer preferences and interactions with a brand. This study presents a novel framework that leverages KG construction and embedding techniques to identify brand advocates accurately. By harnessing the power of KGs, our framework enhances the accuracy and efficiency of identifying and understanding brand advocates, providing valuable insights into customer advocacy dynamics in the online realm. Moreover, we address the critical aspect of social credibility, which significantly influences the impact of advocacy efforts. Incorporating social credibility analysis into our framework allows businesses to identify and mitigate spammers, preserving authenticity and customer trust. To achieve this, we incorporate and extend DSpamOnto, a specialized ontology designed to identify social spam, with a focus on the social commerce domain. Additionally, we employ cutting-edge embedding techniques to map the KG into a low-dimensional vector space, enabling effective link prediction, clustering, and visualization. Through a rigorous evaluation process, we demonstrate the effectiveness and performance of our proposed framework, highlighting its potential to empower businesses in cultivating brand advocates and driving meaningful customer engagement strategies.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"7 ","pages":"1469819"},"PeriodicalIF":2.4,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11614760/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142781816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ActiveReach: an active learning framework for approximate reachability query answering in large-scale graphs.
Pub Date: 2024-11-19 | eCollection Date: 2024-01-01 | DOI: 10.3389/fdata.2024.1427104
Zohreh Raghebi, Farnoush Banaei-Kashani
With a graph reachability query, one can answer whether there exists a path between two query vertices in a given graph. Existing reachability query processing solutions use traditional reachability index structures and can only compute exact answers, which may take a long time to resolve in large graphs. In contrast, an approximate reachability query offers a compromise, enabling users to strike a trade-off between query time and the accuracy of the query result. In this study, we propose a framework, dubbed ActiveReach, for learning index structures to answer approximate reachability queries. ActiveReach is a two-phase framework that focuses on embedding nodes in a reachability space. In the first phase, we leverage node attributes and positional information to create reachability-aware embeddings for each node. These embeddings are then used as node attributes in the second phase. In the second phase, we incorporate the new attributes and include reachability information as labels in the training data to generate embeddings in a reachability space. However, computing reachability for all training data may not be practical, so effectively selecting a subset of node pairs for which to compute reachability, while still enhancing reachability prediction performance, is challenging. ActiveReach addresses this challenge by employing an active learning approach in the second phase to selectively compute reachability for a subset of node pairs, thus learning the approximate reachability for the entire graph. Our extensive experimental study with various real attributed large-scale graphs demonstrates the effectiveness of each component of our framework.
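A minimal sketch of the second-phase loop as described: train a predictor on the pairs labeled so far, pick the node pairs it is least certain about, compute their exact reachability with the (expensive) oracle, and repeat. The uncertainty-sampling criterion and the logistic-regression learner are assumptions, not details from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def active_reachability(pairs_X, oracle, n_rounds=3, batch=16, seed=0):
    """pairs_X: one feature vector per node pair (e.g., concatenated
    reachability-aware embeddings from phase one).
    oracle(i): exact reachability label for pair i -- the expensive
    graph traversal we want to invoke as rarely as possible."""
    rng = np.random.default_rng(seed)
    labeled = list(rng.choice(len(pairs_X), size=batch, replace=False))
    y = {i: oracle(i) for i in labeled}
    model = LogisticRegression()
    for _ in range(n_rounds):
        model.fit(pairs_X[labeled], [y[i] for i in labeled])
        proba = model.predict_proba(pairs_X)[:, 1]
        # Uncertainty sampling: label the pairs closest to p = 0.5.
        order = np.argsort(np.abs(proba - 0.5))
        new = [i for i in order if i not in y][:batch]
        y.update({i: oracle(i) for i in new})
        labeled += new
    return model  # approximates reachability for all remaining pairs

# Toy usage: synthetic pair features with a sign-based "reachability".
X = np.random.default_rng(1).normal(size=(200, 8))
model = active_reachability(X, oracle=lambda i: int(X[i, 0] > 0))
```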
{"title":"ActiveReach: an active learning framework for approximate reachability query answering in large-scale graphs.","authors":"Zohreh Raghebi, Farnoush Banaei-Kashani","doi":"10.3389/fdata.2024.1427104","DOIUrl":"10.3389/fdata.2024.1427104","url":null,"abstract":"<p><p>With graph reachability query, one can answer whether there exists a path between two query vertices in a given graph. The existing reachability query processing solutions use traditional reachability index structures and can only compute exact answers, which may take a long time to resolve in large graphs. In contrast, with an approximate reachability query, one can offer a compromise by enabling users to strike a trade-off between query time and the accuracy of the query result. In this study, we propose a framework, dubbed ActiveReach, for learning index structures to answer approximate reachability query. ActiveReach is a two-phase framework that focuses on embedding nodes in a reachability space. In the first phase, we leverage node attributes and positional information to create reachability-aware embeddings for each node. These embeddings are then used as nodes' attributes in the second phase. In the second phase, we incorporate the new attributes and include reachability information as labels in the training data to generate embeddings in a reachability space. In addition, computing reachability for all training data may not be practical. Therefore, selecting a subset of data to compute reachability effectively and enhance reachability prediction performance is challenging. ActiveReach addresses this challenge by employing an active learning approach in the second phase to selectively compute reachability for a subset of node pairs, thus learning the approximate reachability for the entire graph. Our extensive experimental study with various real attributed large-scale graphs demonstrates the effectiveness of each component of our framework.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"7 ","pages":"1427104"},"PeriodicalIF":2.4,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11611874/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142774795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Camera-view supervision for bird's-eye-view semantic segmentation.
Pub Date: 2024-11-15 | eCollection Date: 2024-01-01 | DOI: 10.3389/fdata.2024.1431346
Bowen Yang, LinLin Yu, Feng Chen
Bird's-eye-view Semantic Segmentation (BEVSS) is a powerful and crucial component of planning and control systems in many autonomous vehicles. Current methods rely on end-to-end learning to train models, leading to indirectly supervised and inaccurate camera-to-BEV projections. We propose a novel method of supervising feature extraction with camera-view depth and segmentation information, which improves the quality of feature extraction and projection in the BEVSS pipeline. Our model, evaluated on the nuScenes dataset, shows a 3.8% improvement in Intersection-over-Union (IoU) for vehicle segmentation and a 30-fold reduction in depth error compared to baselines, while maintaining competitive inference times of 32 FPS. This method offers more accurate and reliable BEVSS for real-time autonomous driving systems. The code and implementation details can be found at https://github.com/bluffish/sucam.
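A hedged sketch of the training signal this abstract describes: the main BEV segmentation loss is combined with auxiliary camera-view depth and segmentation losses on the intermediate features. The loss weights and the binning of depth into discrete classes are assumptions; the authors' actual implementation is in the linked repository.

```python
import torch
import torch.nn.functional as F

def bevss_loss(bev_logits, bev_gt, cam_depth_logits, cam_depth_bins,
               cam_seg_logits, cam_seg_gt, w_depth=1.0, w_seg=1.0):
    """Main BEV segmentation loss plus auxiliary camera-view depth
    (discretized into bins) and segmentation terms; the weights and
    depth binning are illustrative assumptions."""
    loss_bev = F.cross_entropy(bev_logits, bev_gt)
    loss_depth = F.cross_entropy(cam_depth_logits, cam_depth_bins)
    loss_seg = F.cross_entropy(cam_seg_logits, cam_seg_gt)
    return loss_bev + w_depth * loss_depth + w_seg * loss_seg

# Shapes: (batch, classes, H, W) logits with (batch, H, W) integer targets.
loss = bevss_loss(torch.randn(2, 3, 8, 8), torch.randint(0, 3, (2, 8, 8)),
                  torch.randn(2, 10, 8, 8), torch.randint(0, 10, (2, 8, 8)),
                  torch.randn(2, 5, 8, 8), torch.randint(0, 5, (2, 8, 8)))
```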
{"title":"Camera-view supervision for bird's-eye-view semantic segmentation.","authors":"Bowen Yang, LinLin Yu, Feng Chen","doi":"10.3389/fdata.2024.1431346","DOIUrl":"https://doi.org/10.3389/fdata.2024.1431346","url":null,"abstract":"<p><p>Bird's-eye-view Semantic Segmentation (BEVSS) is a powerful and crucial component of planning and control systems in many autonomous vehicles. Current methods rely on end-to-end learning to train models, leading to indirectly supervised and inaccurate camera-to-BEV projections. We propose a novel method of supervising feature extraction with camera-view depth and segmentation information, which improves the quality of feature extraction and projection in the BEVSS pipeline. Our model, evaluated on the nuScenes dataset, shows a 3.8% improvement in Intersection-over-Union (IoU) for vehicle segmentation and a 30-fold reduction in depth error compared to baselines, while maintaining competitive inference times of 32 FPS. This method offers more accurate and reliable BEVSS for real-time autonomous driving systems. The codes and implementation details and code can be found at https://github.com/bluffish/sucam.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"7 ","pages":"1431346"},"PeriodicalIF":2.4,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11604745/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142774796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cultural big data: nineteenth to twenty-first century panoramic visualization.
Pub Date: 2024-11-08 | eCollection Date: 2024-01-01 | DOI: 10.3389/fdata.2024.1309887
Tsz Kin Chau, Paul Bourke, Lily Hibberd, Daniel Jaquet, Sarah Kenderdine
From the nineteenth-century panorama to the emergence of the digital panoramic format in the 1990s, the visualization of large images has frequently relied on panoramic viewing strategies. Originally rendered in the form of epic painted canvases, these strategies are now amplified through gigapixel imaging, computer vision, and machine learning. Whether for scientific analysis, dissemination, or the visualization of cultural big data, panoramic strategies pivot on the illusion of immersion. The latter is achieved through human-centered design situated within a large-scale environment, combined with a multi-sensory experience spanning sight, sound, touch, and smell. In this article, we present the original research undertaken to realize a digital twin of the 1894 panorama of the Battle of Murten. Following a brief history of the panorama, we delineate the methods and technological framework developed for the Murten panorama's visualization. Novel visualization methodologies are further discussed, including how to create the illusion of immersion for the world's largest image of a single physical object and its cultural big data. We also present the visualization strategies developed to augment the layered narratives and histories embedded in the final interactive viewing experience of the Murten panorama. This article offers researchers in heritage big data new schemas for the visualization and augmentation of gigapixel images in digital panoramas.
{"title":"Cultural big data: nineteenth to twenty-first century panoramic visualization.","authors":"Tsz Kin Chau, Paul Bourke, Lily Hibberd, Daniel Jaquet, Sarah Kenderdine","doi":"10.3389/fdata.2024.1309887","DOIUrl":"10.3389/fdata.2024.1309887","url":null,"abstract":"<p><p>From the nineteenth-century panorama to the emergence of the digital panoramic format in the 1990's, the visualization of large images frequently relies on panoramic viewing strategies. Originally rendered in the form of epic painted canvases, these strategies are now amplified through gigapixel imaging, computer vision and machine learning. Whether for scientific analysis, dissemination, or to visualize cultural big data, panoramic strategies pivot on the illusion of immersion. The latter is achieved through human-centered design situated within a large-scale environment combined with a multi-sensory experience spanning sight, sound, touch, and smell. In this article, we present the original research undertaken to realize a digital twin of the 1894 panorama of the battle of Murten. Following a brief history of the panorama, the methods and technological framework systems developed for Murten panorama's visualization are delineated. Novel visualization methodologies are further discussed, including how to create the illusion of immersion for the world's largest image of a single physical object and its cultural big data. We also present the visualization strategies developed for the augmentation of the layered narratives and histories embedded in the final interactive viewing experience of the Murten panorama. This article offers researchers in heritage big data new schemas for the visualization and augmentation of gigapixel images in digital panoramas.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"7 ","pages":"1309887"},"PeriodicalIF":2.4,"publicationDate":"2024-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11581890/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142711591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cybermycelium: a reference architecture for domain-driven distributed big data systems.
Pub Date: 2024-11-05 | eCollection Date: 2024-01-01 | DOI: 10.3389/fdata.2024.1448481
Pouya Ataei
Introduction: The ubiquity of digital devices, today's infrastructure, and the ever-increasing proliferation of digital products have ushered in a new era: the era of big data (BD). This era began when the volume, variety, and velocity of data overwhelmed the traditional systems used to analyze and store that data, precipitating a new class of software systems, namely BD systems. While BD systems provide a competitive advantage to businesses, many have failed to harness their power: it has been estimated that only 20% of companies have successfully implemented a BD project.
Methods: This study aims to facilitate BD system development by introducing Cybermycelium, a domain-driven decentralized BD reference architecture (RA). The artifact was developed following the guidelines of empirically grounded RAs and evaluated through implementation in a real-world scenario using the Architecture Tradeoff Analysis Method (ATAM).
Results: The evaluation revealed that Cybermycelium successfully addressed key architectural qualities: performance (achieving <1,000 ms response times), availability (through event brokers and circuit breaking), and modifiability (enabling rapid service deployment and configuration). The prototype demonstrated effective handling of data processing, scalability challenges, and domain-specific requirements in a large-scale international company setting.
Discussion: The results highlight important architectural trade-offs between event backbone implementation and service mesh design. While the domain-driven distributed approach improved scalability and maintainability compared to traditional monolithic architectures, it requires significant technical expertise for implementation. This contribution advances the field by providing a validated reference architecture that addresses the challenges of adopting BD in modern enterprises.
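As a hedged illustration of the availability tactics the results name (event brokers and circuit breaking), the sketch below shows a minimal circuit breaker that a service in such an architecture might wrap around calls to the event backbone; the failure threshold and reset timeout are illustrative assumptions, not values from the Cybermycelium evaluation.

```python
import time

class CircuitBreaker:
    """Open the circuit after `max_failures` consecutive errors,
    fail fast while open, then allow a trial call after `reset_after`
    seconds (the half-open state)."""
    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures, self.reset_after = max_failures, reset_after
        self.failures, self.opened_at = 0, None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: permit one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # re-open the circuit
            raise
        self.failures = 0  # success closes the circuit again
        return result
```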
{"title":"Cybermycelium: a reference architecture for domain-driven distributed big data systems.","authors":"Pouya Ataei","doi":"10.3389/fdata.2024.1448481","DOIUrl":"https://doi.org/10.3389/fdata.2024.1448481","url":null,"abstract":"<p><strong>Introduction: </strong>The ubiquity of digital devices, the infrastructure of today, and the ever-increasing proliferation of digital products have dawned a new era, the era of big data (BD). This era began when the volume, variety, and velocity of data overwhelmed traditional systems that used to analyze and store that data. This precipitated a new class of software systems, namely, BD systems. Whereas BD systems provide a competitive advantage to businesses, many have failed to harness the power of them. It has been estimated that only 20% of companies have successfully implemented a BD project.</p><p><strong>Methods: </strong>This study aims to facilitate BD system development by introducing Cybermycelium, a domain-driven decentralized BD reference architecture (RA). The artifact was developed following the guidelines of empirically grounded RAs and evaluated through implementation in a real-world scenario using the Architecture Tradeoff Analysis Method (ATAM).</p><p><strong>Results: </strong>The evaluation revealed that Cybermycelium successfully addressed key architectural qualities: performance (achieving <1,000 ms response times), availability (through event brokers and circuit breaking), and modifiability (enabling rapid service deployment and configuration). The prototype demonstrated effective handling of data processing, scalability challenges, and domain-specific requirements in a large-scale international company setting.</p><p><strong>Discussion: </strong>The results highlight important architectural trade-offs between event backbone implementation and service mesh design. While the domain-driven distributed approach improved scalability and maintainability compared to traditional monolithic architectures, it requires significant technical expertise for implementation. This contribution advances the field by providing a validated reference architecture that addresses the challenges of adopting BD in modern enterprises.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"7 ","pages":"1448481"},"PeriodicalIF":2.4,"publicationDate":"2024-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11573557/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142677536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cognitive warfare: a conceptual analysis of the NATO ACT cognitive warfare exploratory concept.
Pub Date: 2024-11-01 | eCollection Date: 2024-01-01 | DOI: 10.3389/fdata.2024.1452129
Christoph Deppe, Gary S Schaal
This study evaluates NATO ACT's cognitive warfare concept from a political science perspective, exploring its utility beyond military applications. Despite its growing presence in scholarly discourse, the concept's interdisciplinary nature has hindered a unified definition. By analyzing NATO's framework, developed with input from diverse disciplines and both military and civilian researchers, this paper seeks to assess its applicability to political science. It aims to bridge military and civilian research divides and refine NATO's cognitive warfare approach, offering significant implications for enhancing political science research and fostering integrated scholarly collaboration.
{"title":"Cognitive warfare: a conceptual analysis of the NATO ACT cognitive warfare exploratory concept.","authors":"Christoph Deppe, Gary S Schaal","doi":"10.3389/fdata.2024.1452129","DOIUrl":"https://doi.org/10.3389/fdata.2024.1452129","url":null,"abstract":"<p><p>This study evaluates NATO ACT's cognitive warfare concept from a political science perspective, exploring its utility beyond military applications. Despite its growing presence in scholarly discourse, the concept's interdisciplinary nature has hindered a unified definition. By analyzing NATO's framework, developed with input from diverse disciplines and both military and civilian researchers, this paper seeks to assess its applicability to political science. It aims to bridge military and civilian research divides and refine NATO's cognitive warfare approach, offering significant implications for enhancing political science research and fostering integrated scholarly collaboration.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"7 ","pages":"1452129"},"PeriodicalIF":2.4,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11565700/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142649508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An enhanced whale optimization algorithm for task scheduling in edge computing environments.
Pub Date: 2024-10-30 | eCollection Date: 2024-01-01 | DOI: 10.3389/fdata.2024.1422546
Li Han, Shuaijie Zhu, Haoyang Zhao, Yanqiang He
The widespread use of mobile devices and compute-intensive applications has increased the number of smart devices connected to networks, generating significant volumes of data. Real-time execution faces challenges due to limited resources and demanding applications in edge computing environments. To address these challenges, an enhanced whale optimization algorithm (EWOA) was proposed for task scheduling. A multi-objective model based on CPU, memory, time, and resource utilization was developed. The model was transformed into a whale optimization problem, incorporating chaotic mapping to initialize populations and prevent premature convergence. A nonlinear convergence factor was introduced to balance local and global search. The algorithm's performance was evaluated in an experimental edge computing environment and compared with the ODTS, WOA, HWACO, and CATSA algorithms. Experimental results demonstrated that EWOA reduced costs by 29.22%, decreased completion time by 17.04%, and improved node resource utilization by 9.5%. While EWOA offers significant advantages, limitations include the lack of consideration for potential network delays and user mobility. Future research will focus on fault-tolerant scheduling techniques to address dynamic user needs and improve service robustness and quality.
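The two modifications named in the abstract lend themselves to a short sketch. Below is a hedged illustration assuming a logistic chaotic map for population initialization and a cosine-shaped decay for WOA's convergence factor; the paper's exact map and decay curve are not specified in this abstract.

```python
import numpy as np

def chaotic_population(n_whales, dim, lower, upper, x0=0.7, mu=4.0):
    """Logistic map x_{k+1} = mu * x_k * (1 - x_k) spreads the initial
    whales over the search space more evenly than uniform random draws,
    which helps prevent premature convergence."""
    pop = np.empty((n_whales, dim))
    x = x0
    for i in range(n_whales):
        for j in range(dim):
            x = mu * x * (1.0 - x)
            pop[i, j] = lower + x * (upper - lower)
    return pop

def convergence_factor(t, t_max):
    """Nonlinear decay of WOA's 'a' from 2 to 0: slow early decay favors
    global exploration, fast late decay favors local exploitation."""
    return 2.0 * np.cos(np.pi * t / (2.0 * t_max))

print(chaotic_population(3, 2, -5.0, 5.0))
print([round(convergence_factor(t, 100), 3) for t in (0, 50, 100)])
```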
{"title":"An enhanced whale optimization algorithm for task scheduling in edge computing environments.","authors":"Li Han, Shuaijie Zhu, Haoyang Zhao, Yanqiang He","doi":"10.3389/fdata.2024.1422546","DOIUrl":"10.3389/fdata.2024.1422546","url":null,"abstract":"<p><p>The widespread use of mobile devices and compute-intensive applications has increased the connection of smart devices to networks, generating significant data. Real-time execution faces challenges due to limited resources and demanding applications in edge computing environments. To address these challenges, an enhanced whale optimization algorithm (EWOA) was proposed for task scheduling. A multi-objective model based on CPU, memory, time, and resource utilization was developed. The model was transformed into a whale optimization problem, incorporating chaotic mapping to initialize populations and prevent premature convergence. A nonlinear convergence factor was introduced to balance local and global search. The algorithm's performance was evaluated in an experimental edge computing environment and compared with ODTS, WOA, HWACO, and CATSA algorithms. Experimental results demonstrated that EWOA reduced costs by 29.22%, decreased completion time by 17.04%, and improved node resource utilization by 9.5%. While EWOA offers significant advantages, limitations include the lack of consideration for potential network delays and user mobility. Future research will focus on fault-tolerant scheduling techniques to address dynamic user needs and improve service robustness and quality.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"7 ","pages":"1422546"},"PeriodicalIF":2.4,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11557405/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142631928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Promoting fairness in link prediction with graph enhancement.
Pub Date: 2024-10-24 | eCollection Date: 2024-01-01 | DOI: 10.3389/fdata.2024.1489306
Yezi Liu, Hanning Chen, Mohsen Imani
Link prediction is a crucial task in network analysis, but it has been shown to be prone to biased predictions, particularly when links are unfairly predicted between nodes from different sensitive groups. In this paper, we study the fair link prediction problem, which aims to ensure that the predicted link probability is independent of the sensitive attributes of the connected nodes. Existing methods typically incorporate debiasing techniques within graph embeddings to mitigate this issue. However, training on large real-world graphs is already challenging, and adding fairness constraints can further complicate the process. To overcome this challenge, we propose FairLink, a method that learns a fairness-enhanced graph to bypass the need for debiasing during the link predictor's training. FairLink maintains link prediction accuracy by ensuring that the enhanced graph follows a training trajectory similar to that of the original input graph. Meanwhile, it enhances fairness by minimizing the absolute difference in link probabilities between node pairs within the same sensitive group and those between node pairs from different sensitive groups. Our extensive experiments on multiple large-scale graphs demonstrate that FairLink not only promotes fairness but also often achieves link prediction accuracy comparable to baseline methods. Most importantly, the enhanced graph exhibits strong generalizability across different GNN architectures. FairLink is highly scalable, making it suitable for deployment in real-world large-scale graphs, where maintaining both fairness and accuracy is critical.
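The fairness objective described above translates directly into a small computation: the absolute gap between the predicted link probabilities for intra-group node pairs and those for inter-group pairs. The sketch below assumes probabilities and group ids arrive as flat tensors, and aggregating by the mean is our assumption.

```python
import torch

def fairness_gap(link_probs, groups_u, groups_v):
    """link_probs: predicted probabilities for candidate edges (u, v).
    groups_u / groups_v: sensitive-group ids of each edge's endpoints.
    Returns |E[p | same group] - E[p | different group]|, the quantity
    FairLink is described as minimizing."""
    same = groups_u == groups_v
    return (link_probs[same].mean() - link_probs[~same].mean()).abs()

p = torch.tensor([0.9, 0.8, 0.4, 0.3])
gu = torch.tensor([0, 1, 0, 1])
gv = torch.tensor([0, 1, 1, 0])
print(fairness_gap(p, gu, gv))  # |mean(0.9, 0.8) - mean(0.4, 0.3)| = 0.5
```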
{"title":"Promoting fairness in link prediction with graph enhancement.","authors":"Yezi Liu, Hanning Chen, Mohsen Imani","doi":"10.3389/fdata.2024.1489306","DOIUrl":"https://doi.org/10.3389/fdata.2024.1489306","url":null,"abstract":"<p><p>Link prediction is a crucial task in network analysis, but it has been shown to be prone to biased predictions, particularly when links are unfairly predicted between nodes from different sensitive groups. In this paper, we study the fair link prediction problem, which aims to ensure that the predicted link probability is independent of the sensitive attributes of the connected nodes. Existing methods typically incorporate debiasing techniques within graph embeddings to mitigate this issue. However, training on large real-world graphs is already challenging, and adding fairness constraints can further complicate the process. To overcome this challenge, we propose FairLink, a method that learns a fairness-enhanced graph to bypass the need for debiasing during the link predictor's training. FairLink maintains link prediction accuracy by ensuring that the enhanced graph follows a training trajectory similar to that of the original input graph. Meanwhile, it enhances fairness by minimizing the absolute difference in link probabilities between node pairs within the same sensitive group and those between node pairs from different sensitive groups. Our extensive experiments on multiple large-scale graphs demonstrate that FairLink not only promotes fairness but also often achieves link prediction accuracy comparable to baseline methods. Most importantly, the enhanced graph exhibits strong generalizability across different GNN architectures. FairLink is highly scalable, making it suitable for deployment in real-world large-scale graphs, where maintaining both fairness and accuracy is critical.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"7 ","pages":"1489306"},"PeriodicalIF":2.4,"publicationDate":"2024-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11540639/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142607383","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Exploring code portability solutions for HEP with a particle tracking test code.
Pub Date: 2024-10-23 | eCollection Date: 2024-01-01 | DOI: 10.3389/fdata.2024.1485344
Hammad Ather, Sophie Berkman, Giuseppe Cerati, Matti J Kortelainen, Ka Hei Martin Kwok, Steven Lantz, Seyong Lee, Boyana Norris, Michael Reid, Allison Reinsvold Hall, Daniel Riley, Alexei Strelchenko, Cong Wang
Traditionally, high energy physics (HEP) experiments have relied on x86 CPUs for the majority of their significant computing needs. As the field looks ahead to the next generation of experiments such as DUNE and the High-Luminosity LHC, computing demands are expected to increase dramatically. To cope with this increase, it will be necessary to take advantage of all available computing resources, including GPUs from different vendors. A broad landscape of code portability tools, including compiler pragma-based approaches, abstraction libraries, and other tools, allows the same source code to run efficiently on multiple architectures. In this paper, we use a test code taken from a HEP tracking algorithm to compare the performance of, and the experience of implementing, different portability solutions. While in several cases portable implementations perform close to the reference code version, we find that the performance varies significantly depending on the details of the implementation. Achieving optimal performance is not easy, even for relatively simple applications such as the test codes considered in this work. Several factors can affect performance, such as the choice of memory layout, the memory pinning strategy, and the compiler used. The compilers and tools are being actively developed, so future developments may be critical for their deployment in HEP experiments.
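As a hedged illustration of one factor the paper highlights, the memory layout choice, the snippet below contrasts an array-of-structs layout with a struct-of-arrays layout for track data; the NumPy framing and the field names are illustrative and not taken from the test code.

```python
import numpy as np

N = 1_000_000
rng = np.random.default_rng(0)

# Array-of-structs: each track's fields are interleaved in memory,
# so a field-wise read strides across 16-byte records.
aos = np.zeros(N, dtype=[("x", "f4"), ("y", "f4"), ("z", "f4"), ("pt", "f4")])
aos["pt"] = rng.exponential(1.0, N).astype("f4")

# Struct-of-arrays: each field is contiguous, which typically caches
# and vectorizes better for field-wise kernels such as a pt cut.
soa = {name: np.zeros(N, dtype="f4") for name in ("x", "y", "z")}
soa["pt"] = aos["pt"].copy()

n_pass_aos = int((aos["pt"] > 1.0).sum())  # strided access
n_pass_soa = int((soa["pt"] > 1.0).sum())  # contiguous access
assert n_pass_aos == n_pass_soa
```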