Pub Date: 2025-12-04 | DOI: 10.1177/2167647X251399606
Suhas Alalasandra Ramakrishnaiah, Yasir Abdullah Rabi, Ananth John Patrick, Mohammad Shabaz, Surbhi B Khan, Rijwan Khan, Ahlam Almusharraf
Engineering teams need timely signals about evolving requirements and release risk, yet multilingual fan discourse around live sports is noisy, code-switched, and saturated with sarcasm and event-driven drift. We present Hybrid DeepSentX, an AI-driven framework that converts crowd commentary into actionable requirements insight and sprint-level risk scores. The pipeline couples multilingual transformer encoders with an inductive GraphSAGE conversation graph to inject relational context across posts, and adds a reinforcement learner whose reward is shaped to prioritize correct decisions on sarcasm-heavy items and rapidly shifting events. We assembled a million-plus posts from X, Reddit, and sports forums and evaluated the framework against strong baselines, including BERT, long short-term memory, support-vector machines, and recent hybrid models, with significance tests, calibration analysis, ablations, and efficiency profiling. DeepSentX achieved higher macro-averaged accuracy and F1 on code-switched and sarcastic subsets, reduced missed risk flags, and produced developer-facing artefacts that directly support backlog grooming and defect triage. Relative to prior hybrids that combine transformers with either graph reasoning or reinforcement alone, our contributions are fourfold: (i) a unified multilingual design that integrates transformer, graph, and reinforcement components for sarcasm and drift robustness, (ii) an annotated multi-platform corpus with explicit code switching and sarcasm labels and per platform language balance, (iii) a rigorous comparative study reporting accuracy, calibration, latency, memory, and parameter count, and (iv) deployment artefacts that turn model outputs into requirement clusters and sprint risk scores suitable for continuous planning.
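The relational-context step the abstract describes is not released with the paper; as a rough illustration only, a single mean-aggregation layer of a GraphSAGE-style conversation graph might look like the following sketch (the dictionary layout and the mixing weights `w_self`/`w_neigh` are hypothetical, not the authors' implementation):

```python
def sage_layer(features, neighbors, w_self=0.6, w_neigh=0.4):
    """One mean-aggregation step: each post's new embedding mixes its own
    features with the average of its conversation neighbours' features."""
    new_feats = {}
    for node, feat in features.items():
        nbrs = neighbors.get(node, [])
        if nbrs:
            dim = len(feat)
            mean = [sum(features[n][d] for n in nbrs) / len(nbrs)
                    for d in range(dim)]
        else:
            mean = feat  # isolated post keeps its own features
        new_feats[node] = [w_self * f + w_neigh * m
                           for f, m in zip(feat, mean)]
    return new_feats
```

Because the aggregator is inductive, it can embed posts from threads never seen at training time, which matches the paper's live-event setting.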
Title: Hybrid DeepSentX Framework for AI-Driven Requirements Insight and Risk Prediction in Multilingual Sports Using Natural Language Processing (Big Data)
Pub Date: 2025-12-01 | Epub Date: 2025-08-22 | DOI: 10.1177/2167647X251366060
Xuna Wang
With the rapid development of social media and online platforms, the speed and influence of emergency dissemination in cyberspace have significantly increased. The swift changes in public opinion, especially the phenomenon of opinion reversals, exert profound impacts on social stability and government credibility. The hypernetwork structure, characterized by its multilayered and multidimensional complexity, offers a new theoretical framework for analyzing multiagents and their interactions in the evolution of public opinion. Based on hypernetwork theory, this study constructs a four-layer subnet model encompassing user interaction network, event evolution network, semantic association network, and emotional conduction network. By extracting network structural features and conducting cross-layer linkage analysis, an identification system for public opinion reversals in emergencies is established. Taking the donation incident involving Hongxing Erke during the Henan rainstorm in 2021 as a case study, an empirical analysis of the public opinion reversal process is conducted. The research results indicate that the proposed hypernetwork model can effectively identify key nodes in public opinion reversals. The multi-indicator collaborative identification system for public opinion reversals aids in rapidly and effectively detecting signals of such reversals. This study not only provides new methodological support for the dynamic identification of public opinion reversals but also offers theoretical references and practical guidance for public opinion monitoring and emergency response decision-making in emergencies.
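The paper's identification system is described only at the level of "cross-layer linkage analysis"; a minimal sketch of the idea, assuming degree centrality as the per-layer feature and a hypothetical weighting of the four subnets, could look like this:

```python
from collections import defaultdict

def layer_degrees(edges):
    """Degree of every node in one subnet, given as a list of edge pairs."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def cross_layer_score(layers, weights):
    """Combine a node's degree across the subnets (user interaction, event
    evolution, semantic association, emotional conduction) into one score;
    high scorers are candidate key nodes in an opinion reversal."""
    scores = defaultdict(float)
    for name, edges in layers.items():
        for node, d in layer_degrees(edges).items():
            scores[node] += weights.get(name, 1.0) * d
    return dict(scores)
```

In the paper's terms, a multi-indicator system would combine several such per-layer features, not degree alone; this sketch shows only the cross-layer aggregation pattern.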
Title: A Study of Public Opinion Reversal Recognition of Emergency Based on Hypernetwork (Big Data, pp. 497-512)
Pub Date: 2025-12-01 | DOI: 10.1177/2167647X251406607
Yuping Yan, Hanyang Xie, Liang Chen, You Wen, Huaquan Su
Data in power grid digital operation exhibit multisource heterogeneous characteristics, resulting in low integration efficiency and slow anomaly detection response. To address this, this paper proposes a method for power grid digital operation data integration based on K-medoids clustering. Based on an analysis of the power grid digital operation structure, the basic service layer uses a Field-Programmable Gate Array (FPGA) parallel architecture to achieve millisecond-level synchronous acquisition and dynamic preprocessing of multisource data such as mechanical vibration, partial discharge signals, and temperature. The data are then fed back to the cloud service layer, which performs data filtering and analysis through business integration, data analysis, and data access services, and are subsequently passed to the application layer via the database server. The application layer employs a K-medoids clustering method that introduces a density-weighted Euclidean distance metric and an adaptive centroid selection strategy, significantly enhancing the clustering performance of multisource data. In particular, the proposed architecture supports real-time data processing and can be extended to cross-modal scenarios, including integration with speech-to-text systems in power grid monitoring. By aligning with low-latency neural network principles, the method facilitates timely decision-making in intelligent operation environments. Experiments confirm the method's efficacy: it acquires and integrates multisource heterogeneous power grid digital operation data effectively, with the throughput of every data source exceeding 110 MB/s. The silhouette coefficient of the integrated data sets exceeds 0.91, indicating good separability and reliability. This enables rapid detection of data anomalies within the grid and lays a solid foundation for the operation and maintenance management of power grid digital operation.
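The paper's exact density weighting and centroid selection are not given in the abstract; as one plausible reading, the sketch below seeds K-medoids from the densest points (density approximated by inverse mean k-nearest-neighbour distance) and then runs the standard assign/swap loop. The seeding strategy and all parameters are assumptions for illustration:

```python
import math

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def density_weights(points, k=3):
    # Density proxy: inverse of the mean distance to the k nearest neighbours.
    weights = []
    for i, p in enumerate(points):
        dists = sorted(euclid(p, q) for j, q in enumerate(points) if j != i)
        weights.append(1.0 / (sum(dists[:k]) / k + 1e-9))
    return weights

def k_medoids(points, k, iters=20):
    # Adaptive seeding (our reading of the paper): start from the k densest points.
    w = density_weights(points)
    medoids = sorted(range(len(points)), key=lambda i: -w[i])[:k]
    for _ in range(iters):
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: euclid(p, points[m]))
            clusters[nearest].append(i)
        # Swap step: each medoid moves to the member minimising total
        # within-cluster distance.
        new = [min(members,
                   key=lambda c: sum(euclid(points[c], points[j]) for j in members))
               for members in clusters.values() if members]
        if set(new) == set(medoids):
            break
        medoids = new
    labels = [min(range(len(medoids)), key=lambda mi: euclid(p, points[medoids[mi]]))
              for p in points]
    return medoids, labels
```

Unlike K-means, the medoid is always a real data point, which suits heterogeneous grid measurements where an averaged "center" may be physically meaningless.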
Title: Method for Power Grid Digital Operation Data Integration Based on K-Medoids Clustering with Support for Real-Time Cross-Modal Applications (Big Data 13(6), pp. 453-470)
Soybeans are a high-quality vegetable protein resource and a fundamental strategic material integral to the national economy and public livelihood. To investigate the research status of soybean quality evaluation, this study analyzes relevant literature from Web of Science and China Knowledge Network (2000-2024). Using bibliometric methods with Excel and VOSviewer, we examined publication years, keywords, authors, sources, countries/regions, and institutions, generating visualizations to intuitively illustrate the field's developmental status. Results indicate that over the past 25 years, soybean quality evaluation research has emerged as a focal point in crop science, with institutions predominantly located in China and the United States. Key journals in this domain include Food Chemistry, Frontiers in Plant Science, and Soybean Science, among others. Research primarily focuses on soybean physical characteristics and the component-quality relationship. Interdisciplinary advancements have positioned spectral analysis, intelligent systems, and multitechnology fusion as innovative frontiers in this field. These findings enhance researchers' understanding of current trends and support evidence-based decision-making in soybean quality evaluation.
Pub Date: 2025-12-01 | DOI: 10.1177/2167647X251399053
Authors: Yanxia Gao, Pengju Tang, Xuhong Tang, Dong Wang, Jiaqi Luo, JiaDong Wu
Title: Analysis on Research Situation of Soybean Quality Evaluation Based on Bibliometrics (Big Data, pp. 487-496)
Pub Date: 2025-12-01 | DOI: 10.1177/2167647X251405797
Qiong He, Xueqing Guo
This study aims to enhance the prediction precision of aircraft engine remaining useful life (RUL) by overcoming common challenges in current models, such as ineffective feature extraction and insufficient modeling of long-term temporal dependencies. We propose a novel multilayer hybrid architecture that combines bidirectional long short-term memory (BiLSTM) and gated recurrent unit (GRU) networks, augmented with an attention mechanism to enhance the model's focus on informative temporal patterns. In this framework, raw time series data are initially processed by the BiLSTM to extract bidirectional features associated with engine health conditions. The GRU network is subsequently used to effectively model long-range dependencies, thereby enriching the temporal representation. An adaptive attention module is included to assign varying importance to different features, allowing the model to focus on key indicators of engine condition. Evaluation results on the FD001 and FD003 datasets show that the model achieves root mean squared error reductions ranging from 8.81% to 30.60% and from 7.48% to 37.96%, validating its performance and robustness in RUL forecasting. In comparison with conventional BiLSTM and GRU models, the proposed BiLSTM-GRU-Attention architecture integrates attention-based feature weighting with a hybrid recurrent framework, thereby offering a concise and effective approach to RUL prediction for aircraft engines.
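The attention module's form is not specified in the abstract; a common choice that fits the description, softmax-weighted pooling over the recurrent hidden states, is sketched below in plain Python (the dot-product scoring and the `query` vector are assumptions standing in for the learned attention parameters):

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_pool(hidden_states, query):
    """Score each timestep's hidden vector against a query vector (dot
    product), then return the softmax-weighted average as the context
    vector, together with the attention weights."""
    scores = [sum(h * q for h, q in zip(state, query))
              for state in hidden_states]
    alphas = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(a * state[d] for a, state in zip(alphas, hidden_states))
               for d in range(dim)]
    return context, alphas
```

In the paper's pipeline the `hidden_states` would come from the BiLSTM-GRU stack, and the context vector would feed the final RUL regression head; the attention weights also indicate which timesteps the model treats as key condition indicators.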
Title: Prediction of Remaining Life of Aircraft Engines Based on BiLSTM-GRU-Attention Model (Big Data 13(6), pp. 471-486)
Pub Date: 2025-12-01 | Epub Date: 2025-11-18 | DOI: 10.1177/2167647X251392796
Yang Wang, Tianchun Xiang, Shuai Luo, Yi Gao, Xiangyu Kong
Human activities that generate greenhouse gas emissions pose a significant threat to urban green and sustainable development. Production activities in key industrial sectors are a primary contributor to high urban carbon emissions. Therefore, effectively reducing carbon emissions in these sectors is crucial for achieving urban carbon peak and neutrality goals. Carbon emission monitoring is a critical approach that aids governmental bodies in understanding changes in industrial carbon emissions, thereby supporting decision-making and carbon reduction efforts. However, current industry-oriented carbon monitoring methods suffer from issues such as low frequency, poor accuracy, and inadequate privacy security. To address these challenges, this article proposes a novel privacy-protected "electricity-carbon'' nexus model, long short-term memory with the vertical federated framework (VF-LSTM), to monitor carbon emissions in key urban industries. The vertical federated framework ensures "usable but invisible" privacy protection for multisource data from various participants. The embedded long short-term memory model accurately captures industry-specific carbon emissions. Using data from key industries (steel, petrochemical, chemical, and nonferrous industries), this article constructs and validates the performance of the proposed industry-level carbon emission monitoring model. The results demonstrate that the model has high accuracy and robustness, effectively monitoring industry carbon emissions while protecting data privacy.
Title: Monitoring Carbon Emission from Key Industries Based on VF-LSTM Model (Big Data, pp. 441-452)
Pub Date: 2025-10-01 | Epub Date: 2025-02-28 | DOI: 10.1089/big.2024.0128
Ikpe Justice Akpan, Rouzbeh Razavi, Asuama A Akpan
Decision sciences (DSC) involves studying complex dynamic systems and processes to aid informed choices subject to constraints in uncertain conditions. It integrates multidisciplinary methods and strategies to evaluate decision engineering processes, identifying alternatives and providing insights toward enhancing prudent decision-making. This study analyzes the evolutionary trends and innovation in DSC education and research over the past 25 years. Using metadata from bibliographic records and employing the science mapping method and text analytics, we map and evaluate the thematic, intellectual, and social structures of DSC research. The results identify "knowledge management," "decision support systems," "data envelopment analysis," "simulation," and "artificial intelligence" (AI) as some of the prominent critical skills and knowledge requirements for problem-solving in DSC before and during the period (2000-2024). However, these technologies are evolving significantly in the recent wave of digital transformation, with data analytics frameworks (including techniques such as big data analytics, machine learning, business intelligence, data mining, and information visualization) becoming crucial. DSC education and research continue to mirror developments in practice, with sustainable education through virtual/online learning becoming prominent. Innovative pedagogical approaches and strategies also include computer simulation and games ("play and learn" or "role-playing"). The current era is witnessing AI adoption in forms such as conversational chatbot agents and generative AI (GenAI), including the chat generative pretrained transformer (ChatGPT), in teaching, learning, and scholarly activities, amid challenges involving academic integrity, plagiarism, intellectual property violations, and other ethical and legal issues. Future DSC education must innovatively integrate GenAI and address the resulting challenges.
Title: Evolutionary Trends in Decision Sciences Education Research from Simulation and Games to Big Data Analytics and Generative Artificial Intelligence (Big Data, pp. 416-437)
Pub Date: 2025-10-01 | Epub Date: 2025-01-10 | DOI: 10.1089/big.2024.0036
Sofie Goethals, Sandra Matz, Foster Provost, David Martens, Yanou Ramon
Our online lives generate a wealth of behavioral records-digital footprints-which are stored and leveraged by technology platforms. These data can be used to create value for users by personalizing services. At the same time, however, it also poses a threat to people's privacy by offering a highly intimate window into their private traits (e.g., their personality, political ideology, sexual orientation). We explore the concept of cloaking: allowing users to hide parts of their digital footprints from predictive algorithms, to prevent unwanted inferences. This article addresses two open questions: (i) can cloaking be effective in the longer term, as users continue to generate new digital footprints? And (ii) what is the potential impact of cloaking on the accuracy of desirable inferences? We introduce a novel strategy focused on cloaking "metafeatures" and compare its efficacy against just cloaking the raw footprints. The main findings are (i) while cloaking effectiveness does indeed diminish over time, using metafeatures slows the degradation; (ii) there is a tradeoff between privacy and personalization: cloaking undesired inferences also can inhibit desirable inferences. Furthermore, the metafeature strategy-which yields more stable cloaking-also incurs a larger reduction in desirable inferences.
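The cloaking idea, and the privacy/personalization tradeoff the study measures, can be shown on a toy linear scorer. Everything below (the greedy highest-weight-first strategy, feature names, weights) is an illustrative assumption, not the authors' method:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, footprint):
    """Linear model over binary footprint features (page likes, site
    visits, ...); returns the probability of the inferred trait."""
    return sigmoid(sum(weights.get(f, 0.0) for f in footprint))

def cloak(footprint, model_weights, threshold=0.5):
    """Greedily hide the highest-weight footprints until the unwanted
    inference drops below the decision threshold."""
    hidden = set()
    visible = set(footprint)
    while predict(model_weights, visible) >= threshold and visible:
        worst = max(visible, key=lambda f: model_weights.get(f, 0.0))
        visible.remove(worst)
        hidden.add(worst)
    return visible, hidden
```

The tradeoff the article reports shows up even here: if a desirable model (say, for recommendations) shares features with the unwanted one, cloaking those features lowers the desirable model's score too.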
Title: The Impact of Cloaking Digital Footprints on User Privacy and Personalization. (Big Data, pp. 345-363.)
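The cloaking idea above can be sketched in a few lines. This is a toy illustration of the concept only, not the authors' implementation: a hypothetical linear "trait" classifier scores a user's binary footprint vector, and cloaking hides (zeroes out) the active footprints that contribute most to the unwanted inference, lowering the classifier's score.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features = 20
weights = rng.normal(size=n_features)        # hypothetical trained classifier weights
footprint = rng.integers(0, 2, n_features)   # 1 = user produced this footprint

def trait_score(x, w):
    """Linear score for the unwanted inference."""
    return float(x @ w)

def cloak(x, w, k):
    """Hide the k footprints with the largest positive contribution to the score."""
    contrib = x * w                          # per-footprint contribution
    hidden = np.argsort(contrib)[-k:]        # indices of the top-k contributors
    x_cloaked = x.copy()
    x_cloaked[hidden] = 0                    # user withholds these footprints
    return x_cloaked

before = trait_score(footprint, weights)
after = trait_score(cloak(footprint, weights, k=3), weights)
print(before, after)  # cloaking can only remove contributions, so the score does not rise
```

The privacy/personalization tradeoff the abstract reports would appear here as soon as a second, desirable classifier shares features with the cloaked ones: zeroing those footprints degrades its score too.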
The influence maximization problem has several issues, including low infection rates and high time complexity. Many proposed methods are not suitable for large-scale networks due to their time complexity or free parameter usage. To address these challenges, this article proposes a local heuristic called Embedding Technique for Influence Maximization (ETIM) that uses shell decomposition, graph embedding, and reduction, as well as combined local structural features. The algorithm selects candidate nodes based on their connections among network shells and topological features, reducing the search space and computational overhead. It uses a deep learning-based node embedding technique to create a multidimensional vector of candidate nodes and calculates the dependency on spreading for each node based on local topological features. Finally, influential nodes are identified using the results of the previous phases and newly defined local features. The proposed algorithm is evaluated using the independent cascade model, showing its competitiveness and ability to achieve the best performance in terms of solution quality. Compared with the collective influence global algorithm, ETIM is significantly faster and improves the infection rate by an average of 12%.
Compared with the collective influence global algorithm, ETIM is significantly faster and improves the infection rate by an average of 12%.
Title: Maximizing Influence in Social Networks Using Combined Local Features and Deep Learning-Based Node Embedding. Authors: Asgarali Bouyer, Hamid Ahmadi Beni, Amin Golzari Oskouei, Alireza Rouhi, Bahman Arasteh, Xiaoyang Liu. DOI: 10.1089/big.2023.0117. (Big Data, pp. 379-397.)
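ETIM is evaluated under the independent cascade (IC) model mentioned above. A minimal IC simulation is easy to state; the tiny graph, seed set, and propagation probability `p` here are illustrative assumptions, not ETIM itself.

```python
import random

def independent_cascade(graph, seeds, p=0.1, seed=42):
    """Return the set of infected nodes after one IC run.

    graph: dict mapping node -> list of out-neighbors.
    Each newly infected node gets a single chance to infect each
    still-uninfected neighbor, independently with probability p.
    """
    rng = random.Random(seed)
    infected = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, []):
                if v not in infected and rng.random() < p:
                    infected.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return infected

# Tiny example network
g = {0: [1, 2], 1: [3], 2: [3, 4], 3: [5], 4: [5], 5: []}
spread = independent_cascade(g, seeds={0}, p=0.5)
print(sorted(spread))  # seeds are always included; neighbors join stochastically
```

An influence-maximization method such as ETIM would choose the seed set so that the expected size of `infected`, averaged over many such runs, is as large as possible.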
Pub Date: 2025-10-01 | Epub Date: 2024-10-23 | DOI: 10.1089/big.2023.0131
Qi Ouyang, Hongchang Chen, Shuxin Liu, Liming Pu, Dongdong Ge, Ke Fan
Predicting propagation cascades is crucial for understanding information propagation in social networks. Existing methods always focus on the structure or order of infected users in a single cascade sequence, ignoring the global dependencies of cascades and users, which is insufficient to characterize their dynamic interaction preferences. Moreover, existing methods are poor at addressing the problem of model robustness. To address these issues, we propose a prediction model named DropMessage Hypergraph Attention Network (DMHANT), which constructs a hypergraph based on the cascade sequence. Specifically, to dynamically obtain user preferences, we divide the diffusion hypergraph into multiple subgraphs according to the timestamps, develop hypergraph attention networks to explicitly learn complete interactions, and adopt a gated fusion strategy to connect them for user cascade prediction. In addition, a new message-dropping method, DropMessage, is added to increase the robustness of the model. Experimental results on three real-world datasets indicate that the proposed model significantly outperforms the most advanced information propagation prediction models in both MAP@K and Hits@K metrics, and the experiments also show that the model achieves stronger prediction performance than existing models under data perturbation.
Title: DMHANT: DropMessage Hypergraph Attention Network for Information Propagation Prediction. (Big Data, pp. 364-378.)
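One preprocessing step the abstract describes, dividing the diffusion hypergraph into subgraphs by timestamp, can be sketched as follows. The data layout (a cascade as a list of `(user, timestamp)` events) and the fixed window size are illustrative assumptions, not the DMHANT code: each cascade contributes one hyperedge (a set of co-active users) per time window.

```python
from collections import defaultdict

def split_cascades(cascades, window):
    """Split cascades into per-window sub-hypergraphs.

    cascades: dict cascade_id -> list of (user, timestamp) events.
    Returns: dict window_index -> {cascade_id: set of users active in that window},
    i.e., each window holds one hyperedge per cascade that was active in it.
    """
    subgraphs = defaultdict(lambda: defaultdict(set))
    for cid, events in cascades.items():
        for user, ts in events:
            subgraphs[ts // window][cid].add(user)
    return {w: dict(edges) for w, edges in subgraphs.items()}

cascades = {
    "c1": [("u1", 0), ("u2", 3), ("u3", 12)],
    "c2": [("u2", 5), ("u4", 14)],
}
sub = split_cascades(cascades, window=10)
print(sub)  # window 0 holds the early adopters, window 1 the later ones
```

In the full model, an attention network would then learn user representations within each windowed sub-hypergraph, and a gated fusion step would combine them across windows.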