Prognostic Modeling for Liver Cirrhosis Mortality Prediction and Real-Time Health Monitoring from Electronic Health Data
Pub Date: 2024-12-09 | DOI: 10.1089/big.2024.0071
Chengping Zhang, Muhammad Faisal Buland Iqbal, Imran Iqbal, Minghao Cheng, Nadia Sarhan, Emad Mahrous Awwad, Yazeed Yasin Ghadi
Liver cirrhosis is a prominent contributor to mortality, impacting millions across the United States. Enabling health care providers to predict early mortality among patients with cirrhosis could significantly enhance treatment efficacy. Our hypothesis centers on the correlation between mortality and laboratory test results along with relevant diagnoses in this patient cohort. Additionally, we posit that a deep learning model could surpass the predictive capabilities of the existing Model for End-Stage Liver Disease score. This research seeks to advance prognostic accuracy and refine approaches to the critical challenges posed by cirrhosis-related mortality. The study evaluates the performance of an artificial neural network model for liver disease classification across three training proportions: 70%, 80%, and 90%. The model's efficacy was assessed using precision, recall, F1-score, accuracy, and support, alongside receiver operating characteristic (ROC) and precision-recall (PR) curves, with the ROC curves quantified by the area under the curve (AUC). Results indicated that performance generally improved with more training data, although the 80% split achieved the highest AUC, suggesting better classification ability than the models trained with 70% and 90% of the data. PR analysis revealed a steep trade-off between precision and recall across all splits, with the 80% split again demonstrating a slightly better balance. This reflects the difficulty of achieving high precision and high recall simultaneously, a common issue in imbalanced datasets such as those found in medical diagnostics.
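As a rough illustration of the split-comparison protocol described in the abstract, the sketch below trains the same small neural network on 70%, 80%, and 90% splits and compares ROC AUC. It is a minimal sketch, not the authors' pipeline: the synthetic imbalanced dataset, the MLPClassifier architecture, and all parameters are stand-ins for the non-public cirrhosis cohort and its EHR features.

```python
# Minimal sketch: compare ROC AUC for 70%/80%/90% training splits on a
# synthetic, imbalanced binary classification problem (stand-in data).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score, precision_recall_curve

X, y = make_classification(n_samples=5000, n_features=30,
                           weights=[0.9, 0.1], random_state=0)

for train_frac in (0.7, 0.8, 0.9):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, train_size=train_frac, stratify=y, random_state=0)
    clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500,
                        random_state=0).fit(X_tr, y_tr)
    scores = clf.predict_proba(X_te)[:, 1]
    auc = roc_auc_score(y_te, scores)
    prec, rec, _ = precision_recall_curve(y_te, scores)  # PR trade-off
    print(f"train={train_frac:.0%}  AUC={auc:.3f}")
```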
{"title":"Prognostic Modeling for Liver Cirrhosis Mortality Prediction and Real-Time Health Monitoring from Electronic Health Data.","authors":"Chengping Zhang, Muhammad Faisal Buland Iqbal, Imran Iqbal, Minghao Cheng, Nadia Sarhan, Emad Mahrous Awwad, Yazeed Yasin Ghadi","doi":"10.1089/big.2024.0071","DOIUrl":"https://doi.org/10.1089/big.2024.0071","url":null,"abstract":"<p><p>Liver cirrhosis stands as a prominent contributor to mortality, impacting millions across the United States. Enabling health care providers to predict early mortality among patients with cirrhosis holds the potential to enhance treatment efficacy significantly. Our hypothesis centers on the correlation between mortality and laboratory test results along with relevant diagnoses in this patient cohort. Additionally, we posit that a deep learning model could surpass the predictive capabilities of the existing Model for End-Stage Liver Disease score. This research seeks to advance prognostic accuracy and refine approaches to address the critical challenges posed by cirrhosis-related mortality. This study evaluates the performance of an artificial neural network model for liver disease classification using various training dataset sizes. Through meticulous experimentation, three distinct training proportions were analyzed: 70%, 80%, and 90%. The model's efficacy was assessed using precision, recall, F1-score, accuracy, and support metrics, alongside receiver operating characteristic (ROC) and precision-recall (PR) curves. The ROC curves were quantified using the area under the curve (AUC) metric. Results indicated that the model's performance improved with an increased size of the training dataset. Specifically, the 80% training data model achieved the highest AUC, suggesting superior classification ability over the models trained with 70% and 90% data. PR analysis revealed a steep trade-off between precision and recall across all datasets, with 80% training data again demonstrating a slightly better balance. This is indicative of the challenges faced in achieving high precision with a concurrently high recall, a common issue in imbalanced datasets such as those found in medical diagnostics.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":""},"PeriodicalIF":2.6,"publicationDate":"2024-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Social Listening for Product Design Requirement Analysis and Segmentation: A Graph Analysis Approach with User Comments Mining
Pub Date: 2024-12-01 | Epub Date: 2023-09-04 | DOI: 10.1089/big.2022.0021
Xinjun Lai, Guitao Huang, Ziyue Zhao, Shenhe Lin, Sheng Zhang, Huiyu Zhang, Qingxin Chen, Ning Mao
This study investigates customers' product design requirements through online comments from social media and quickly translates these needs into product design specifications. First, an exponential discriminative snowball sampling method was proposed to generate a product-related subnetwork. Second, natural language processing (NLP) was utilized to mine user-generated comments, and a Graph SAmple and aggreGatE (GraphSAGE) method was employed to embed each user's node neighborhood information in the network to jointly define a user persona. Clustering was used for market and product model segmentation. Finally, a deep learning framework combining bidirectional long short-term memory with conditional random fields was introduced for opinion mining. A comment frequency-inverse group frequency indicator was proposed to quantify all user groups' positive and negative opinions on various specifications of different product functions. A case study of smartphone design analysis is presented with data from Baidu Tieba, a large Chinese online community. Eleven layers of social relationships were snowball sampled, yielding 14,018 users and 30,803 comments. The proposed method produced a more reasonable user group clustering result than the conventional method. With this approach, each user group's dominant likes and dislikes for specifications could be immediately identified, and the similar and differing preferences for product features across user groups were instantly revealed. Managerial and engineering insights were also discussed.
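The abstract names a comment frequency-inverse group frequency indicator but does not give its formula. Assuming a TF-IDF-style definition (comment frequency within a group, discounted by how many groups mention the term), a minimal sketch might look like this; the toy groups and opinion terms are hypothetical.

```python
# Hypothetical CF-IGF score under a TF-IDF-style assumption: terms frequent
# within one user group but rare across groups score highest.
import math
from collections import Counter

groups = {                        # toy opinion terms mined per user group
    "students":      ["battery", "battery", "price", "camera"],
    "photographers": ["camera", "camera", "lens", "battery"],
    "gamers":        ["battery", "refresh", "thermals", "refresh"],
}

G = len(groups)
group_freq = Counter()            # in how many groups each term occurs
for terms in groups.values():
    group_freq.update(set(terms))

def cf_igf(term, group):
    counts = Counter(groups[group])
    cf = counts[term] / len(groups[group])           # comment frequency
    igf = math.log(G / (1 + group_freq[term])) + 1.0  # inverse group frequency
    return cf * igf

print(cf_igf("refresh", "gamers"))   # distinctive term -> higher score
print(cf_igf("battery", "gamers"))   # ubiquitous term  -> lower score
```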
{"title":"Social Listening for Product Design Requirement Analysis and Segmentation: A Graph Analysis Approach with User Comments Mining.","authors":"Xinjun Lai, Guitao Huang, Ziyue Zhao, Shenhe Lin, Sheng Zhang, Huiyu Zhang, Qingxin Chen, Ning Mao","doi":"10.1089/big.2022.0021","DOIUrl":"10.1089/big.2022.0021","url":null,"abstract":"<p><p>This study investigates customers' product design requirements through online comments from social media, and quickly translates these needs into product design specifications. First, the exponential discriminative snowball sampling method was proposed to generate a product-related subnetwork. Second, natural language processing (NLP) was utilized to mine user-generated comments, and a Graph SAmple and aggreGatE method was employed to embed the user's node neighborhood information in the network to jointly define a user's persona. Clustering was used for market and product model segmentation. Finally, a deep learning bidirectional long short-term memory with conditional random fields framework was introduced for opinion mining. A comment frequency-invert group frequency indicator was proposed to quantify all user groups' positive and negative opinions for various specifications of different product functions. A case study of smartphone design analysis is presented with data from a large Chinese online community called Baidu Tieba. Eleven layers of social relationships were snowball sampled, with 14,018 users and 30,803 comments. The proposed method produced a more reasonable user group clustering result than the conventional method. With our approach, user groups' dominating likes and dislikes for specifications could be immediately identified, and the similar and different preferences of product features by different user groups were instantly revealed. Managerial and engineering insights were also discussed.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":"456-477"},"PeriodicalIF":2.6,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10508327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IDLIQ: An Incremental Deterministic Finite Automaton Learning Algorithm Through Inverse Queries for Regular Grammar Inference
Pub Date: 2024-12-01 | Epub Date: 2023-05-18 | DOI: 10.1089/big.2022.0158
Farah Haneef, Muddassar A Sindhu
We present an efficient incremental learning algorithm for Deterministic Finite Automaton (DFA) using inverse queries (IQs) and membership queries (MQs). This algorithm extends the Identification of Regular Languages (ID) algorithm from a complete to an incremental learning setup. The learning algorithm learns from a set of labeled examples and by posing queries to a knowledgeable teacher, which is equipped to answer IQs along with MQs and equivalence queries. Based on the examples (elements of the live complete set) and responses to IQs from the minimally adequate teacher (MAT), the learning algorithm constructs a hypothesis automaton consistent with all observed examples. The Incremental DFA Learning algorithm through Inverse Queries (IDLIQ) takes O(|Σ|N + |P_c||F|) time in the presence of a MAT and ensures convergence to a minimal representation of the target DFA with a finite number of labeled examples. Existing incremental learning algorithms, such as Incremental ID and Incremental Distinguishing Strings, have polynomial (cubic) time complexity in the presence of a MAT, and can therefore fail to learn large, complex software systems. In this work, we reduce the complexity of incremental DFA learning from cubic to quadratic. Finally, we prove the correctness and termination of the IDLIQ algorithm.
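The abstract does not detail IDLIQ's internals, but the teacher side of the MAT setup can be sketched. Below, a target DFA answers membership queries in the standard way; the inverse_query method is a labeled placeholder, since the precise IQ semantics are specific to IDLIQ (here it is assumed to return which candidate strings reach a given state).

```python
# A minimal MAT-style teacher for DFA learning. membership_query is standard;
# inverse_query is an ASSUMED placeholder for IDLIQ's IQ semantics.
class DFATeacher:
    def __init__(self, states, alphabet, delta, start, accepting):
        self.states = states
        self.alphabet = alphabet
        self.delta = delta            # dict: (state, symbol) -> state
        self.start = start
        self.accepting = accepting

    def run(self, word):
        state = self.start
        for sym in word:
            state = self.delta[(state, sym)]
        return state

    def membership_query(self, word):
        return self.run(word) in self.accepting

    def inverse_query(self, target_state, candidates):
        # assumed IQ semantics (placeholder): which candidates reach the state?
        return {w for w in candidates if self.run(w) == target_state}

# target language: binary strings with an even number of 1s
teacher = DFATeacher(
    states={"even", "odd"}, alphabet={"0", "1"},
    delta={("even", "0"): "even", ("even", "1"): "odd",
           ("odd", "0"): "odd",   ("odd", "1"): "even"},
    start="even", accepting={"even"})

print(teacher.membership_query("1011"))                 # False (three 1s)
print(teacher.inverse_query("even", {"", "1", "11"}))   # {'', '11'}
```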
{"title":"IDLIQ: An Incremental <i>Deterministic Finite Automaton</i> Learning Algorithm Through Inverse Queries for Regular Grammar Inference.","authors":"Farah Haneef, Muddassar A Sindhu","doi":"10.1089/big.2022.0158","DOIUrl":"10.1089/big.2022.0158","url":null,"abstract":"<p><p>We present an efficient incremental learning algorithm for <i>Deterministic Finite Automaton</i> (DFA) with the help of inverse query (IQ) and membership query (MQ). This algorithm is an extension of the <i>Identification of Regular Languages</i> (ID) algorithm from a complete to an incremental learning setup. The learning algorithm learns by making use of a set of labeled examples and by posing queries to a knowledgeable teacher, which is equipped to answer IQs along with MQs and equivalence query. Based on the examples (elements of the live complete set) and responses against IQs from the <i>minimally adequate teacher</i> (MAT), the learning algorithm constructs the hypothesis automaton, consistent with all observed examples. The Incremental DFA Learning algorithm through Inverse Queries (IDLIQ) takes <math><mstyle><mi>O</mi></mstyle><mrow><mo>(</mo><mrow><mo>|</mo><mi>Σ</mi><mo>|</mo><mi>N</mi><mo>+</mo><mo>|</mo><msub><mrow><mi>P</mi></mrow><mrow><mi>c</mi></mrow></msub><mo>|</mo><mo>|</mo><mi>F</mi><mo>|</mo></mrow><mo>)</mo></mrow></math> time complexity in the presence of a MAT and ensures convergence to a minimal representation of the target DFA with finite number of labeled examples. Existing incremental learning algorithms; the Incremental ID, the Incremental Distinguishing Strings have polynomial (cubic) time complexity in the presence of a MAT. Therefore, sometimes, these algorithms even fail to learn large complex software systems. In this research work, we have reduced the complexity (from cubic to square form) of the DFA learning in an incremental setup. Finally, we prove the correctness and termination of the IDLIQ algorithm.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":"446-455"},"PeriodicalIF":2.6,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9492270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DGSLSTM: Deep Gated Stacked Long Short-Term Memory Neural Network for Traffic Flow Forecasting of Transportation Networks on Big Data Environment
Pub Date: 2024-12-01 | Epub Date: 2022-02-10 | DOI: 10.1089/big.2021.0013
Rajalakshmi Gurusamy, Siva Ranjani Seenivasan
Deep learning and big data techniques have become increasingly popular in traffic flow forecasting, and deep neural networks have been applied to the task. However, it remains difficult to guarantee accurate traffic flow prediction: when the network model is poorly structured or the parameter optimization technique is inappropriate, predictions become unreliable. The proposed system overcomes these problems by combining multiple simple recurrent long short-term memory (LSTM) neural networks with time traits to predict traffic flow using a deep gated stacked neural network. To deepen the model, the hidden layers are trained using an unsupervised layer-by-layer approach, which provides a systematic representation of the time series data. This systematic representation of the hidden layers improves the accuracy of time series forecasting by capturing information at multiple levels. The work also emphasizes the importance of model structure, random weight initialization, and hyperparameters in enhancing the predictive performance of the stacked LSTM. The prediction efficacy of the deep gated stacked LSTM model is compared with that of the gated recurrent unit model and the stacked autoencoder model.
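For reference, a plain stacked LSTM forecaster, the backbone that DGSLSTM builds on, can be sketched as follows. This is not the paper's model: the gating scheme and the unsupervised layer-by-layer pretraining are not reproduced, and all sizes are illustrative.

```python
# Generic stacked-LSTM forecaster (illustrative backbone, not DGSLSTM).
import torch
import torch.nn as nn

class StackedLSTMForecaster(nn.Module):
    def __init__(self, n_features=1, hidden=64, layers=3):
        super().__init__()
        # num_layers > 1 stacks LSTMs: each layer feeds the next
        self.lstm = nn.LSTM(n_features, hidden, num_layers=layers,
                            batch_first=True, dropout=0.2)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                 # x: (batch, time, features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])   # predict the next time step

model = StackedLSTMForecaster()
window = torch.randn(8, 12, 1)            # 8 sequences of 12 past readings
print(model(window).shape)                # torch.Size([8, 1])
```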
{"title":"DGSLSTM: Deep Gated Stacked Long Short-Term Memory Neural Network for Traffic Flow Forecasting of Transportation Networks on Big Data Environment.","authors":"Rajalakshmi Gurusamy, Siva Ranjani Seenivasan","doi":"10.1089/big.2021.0013","DOIUrl":"10.1089/big.2021.0013","url":null,"abstract":"<p><p>Deep learning and big data techniques have become increasingly popular in traffic flow forecasting. Deep neural networks have also been applied to traffic flow forecasting. Furthermore, it is difficult to determine whether neural networks can be used for accurate traffic flow prediction. Moreover, since the network model is poorly structured and the parameter optimization technique is inappropriate, the traffic flow prediction is inaccurate because of the lack of certainty. The proposed system overcomes these problems by combining multiple simple recurrent long short-term memory (LSTM) neural networks with time traits to predict traffic flow using a deep gated stacked neural network. To deepen the model, the hidden layers have been trained using an unsupervised layer-by-layer approach. This approach provides a systematic representation of the time series data. A systematic representation of hidden layers improves the accuracy of time series forecasting by capturing information at multiple levels. Furthermore, it emphasizes the importance of model structure, random weight initialization, and hyperparameters used in stacked LSTM to enhance predictive performance. The prediction efficacy of the deep gated stacked LSTM model is compared with that of the gated recurrent unit model and the stacked autoencoder model.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":"504-517"},"PeriodicalIF":2.6,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39906258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Internet of Things Data Visualization for Business Intelligence
Pub Date: 2024-12-01 | Epub Date: 2022-02-08 | DOI: 10.1089/big.2021.0200
Sima Attar-Khorasani, Ricardo Chalmeta
This study contributes to research on Internet of Things data visualization for business intelligence processes, an area of growing interest to scholars, by conducting a systematic review of the literature. A total of 237 articles published over the past 11 years were obtained and compared. This made it possible to identify the top contributing and most influential authors, countries, publishers, institutions, papers, and research findings, together with the challenges facing current research. Based on these results, this work provides thorough insight into the field by proposing four research categories (Technology infrastructure, Case examples, Final-user experience, and Big Data tools), together with the development of these research streams over time and their future research directions.
{"title":"Internet of Things Data Visualization for Business Intelligence.","authors":"Sima Attar-Khorasani, Ricardo Chalmeta","doi":"10.1089/big.2021.0200","DOIUrl":"10.1089/big.2021.0200","url":null,"abstract":"<p><p>This study contributes to the research on Internet of Things data visualization for business intelligence processes, an area of growing interest to scholars, by conducting a systematic review of the literature. A total of 237 articles published over the past 11 years were obtained and compared. This made it possible to identify the top contributing and most influential authors, countries, publishers, institutions, papers, and research findings, together with the challenges facing current research. Based on these results, this work provides a thorough insight into the field by proposing four research categories (Technology infrastructure, Case examples, Final-user experience, and Big Data tools), together with the development of these research streams over time and their future research directions.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":"478-503"},"PeriodicalIF":2.6,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39899264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Customer Prioritization Integrated Supply Chain Optimization Model with Outsourcing Strategies
Pub Date: 2024-12-01 | Epub Date: 2022-04-29 | DOI: 10.1089/big.2021.0292
Iram Mushtaq, Muhammad Umer, Muhammad Attique Khan, Seifedine Kadry
Pre-COVID-19, most supply chains operated with more capacity than demand. However, COVID-19 changed traditional supply chain dynamics, resulting in demand exceeding production capacity. This article presents a multiobjective, multiperiod supply chain network design with customer prioritization, incorporating price discounts and outsourcing strategies to handle situations in which demand exceeds production capacity. First, a multiperiod, multiobjective supply chain network is designed that incorporates price discounts, customer prioritization, and outsourcing strategies. The main objectives are profit and prioritization maximization and time minimization. What differentiates this model from the literature is the introduction of a prioritization objective function that takes customer ranking as a parameter, together with the consideration of outsourcing and of capacity falling short of demand. A four-valued neutrosophic multiobjective optimization method is introduced to solve the model. To validate the model, a case study of a surgical mask supply chain is presented as a real-life application of the research. The findings help managers make price discount and preferred customer prioritization decisions under uncertainty and imbalance between supply and demand. In the future, the logic of the proposed model could be used to build a web application for optimal decision-making in supply chains.
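The four-valued neutrosophic solution method is beyond the scope of a short sketch, but the underlying trade-off structure (maximize profit and priority, minimize time, under capacity below total demand) can be illustrated with a plain weighted-sum scalarization; every coefficient below is hypothetical.

```python
# Weighted-sum scalarization of three objectives over a toy allocation problem
# (NOT the paper's neutrosophic method; all numbers hypothetical).
import numpy as np
from scipy.optimize import linprog

profit   = np.array([40.0, 55.0, 30.0])   # per unit shipped to customers A,B,C
priority = np.array([3.0, 1.0, 2.0])      # customer ranking scores
time     = np.array([2.0, 4.0, 1.0])      # delivery time per unit

w_profit, w_priority, w_time = 1.0, 10.0, 5.0
# linprog minimizes, so negate the objectives we want to maximize
c = -(w_profit * profit + w_priority * priority) + w_time * time

A_ub = [[1, 1, 1]]                         # shared production capacity
b_ub = [100]                               # capacity below total demand (120)
bounds = [(0, 60), (0, 40), (0, 20)]       # per-customer demand caps

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print(res.x)   # allocation favoring high-priority, high-profit customers
```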
{"title":"Customer Prioritization Integrated Supply Chain Optimization Model with Outsourcing Strategies.","authors":"Iram Mushtaq, Muhammad Umer, Muhammad Attique Khan, Seifedine Kadry","doi":"10.1089/big.2021.0292","DOIUrl":"10.1089/big.2021.0292","url":null,"abstract":"<p><p>Pre-COVID-19, most of the supply chains functioned with more capacity than demand. However, COVID-19 changed traditional supply chains' dynamics, resulting in more demand than their production capacity. This article presents a multiobjective and multiperiod supply chain network design along with customer prioritization, keeping in view price discounts and outsourcing strategies to deal with the situation when demand exceeds the production capacity. Initially, a multiperiod, multiobjective supply chain network is designed that incorporates prices discounts, customer prioritization, and outsourcing strategies. The main objectives are profit and prioritization maximization and time minimization. The introduction of the prioritization objective function having customer ranking as a parameter and considering less capacity than demand and outsourcing differentiates this model from the literature. A four-valued neutrosophic multiobjective optimization method is introduced to solve the model developed. To validate the model, a case study of the supply chain of a surgical mask is presented as the real-life application of research. The research findings are useful for the managers to make price discounts and preferred customer prioritization decisions under uncertainty and imbalance between supply and demand. In future, the logic in the proposed model can be used to create web application for optimal decision-making in supply chains.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":"1 1","pages":"413-428"},"PeriodicalIF":2.6,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41616147","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Attribute-Based Adaptive Homomorphic Encryption for Big Data Security
Pub Date: 2024-10-01 | Epub Date: 2021-12-13 | DOI: 10.1089/big.2021.0176
R Thenmozhi, S Shridevi, Sachi Nandan Mohanty, Vicente García-Díaz, Deepak Gupta, Prayag Tiwari, Mohammad Shorfuzzaman
Internet usage is increasing drastically across the globe, driven by mobile phone penetration, and it generates huge volumes of data, in other words, big data. Security and privacy are the main issues in big data management. Hence, in this article, Attribute-based Adaptive Homomorphic Encryption (AAHE) is developed to enhance the security of big data. In the proposed methodology, Oppositional Based Black Widow Optimization (OBWO) is introduced to select the optimal key parameters for the AAHE method; incorporating an oppositional function improves the convergence of Black Widow Optimization (BWO). The methodology comprises setup, encryption, and decryption processes. It was evaluated with non-abelian rings and the homomorphism process in ciphertext format, and is also used to improve one-way security related to the conjugacy examination problem. Homomorphic encryption is then applied to secure the big data. The study used two big datasets, an adult dataset and an anonymous Microsoft web dataset, to validate the proposed methodology. Using performance metrics such as encryption time, decryption time, key size, processing time, and downloading and uploading time, the proposed method was evaluated against conventional cryptography techniques such as Rivest-Shamir-Adleman (RSA) and Elliptic Curve Cryptography (ECC). The key generation process was also compared against conventional methods such as BWO, Particle Swarm Optimization (PSO), and the Firefly Algorithm (FA). The results established that the proposed method outperforms the compared methods and can be applied in real time in the near future.
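The AAHE construction itself is not specified in the abstract, but the homomorphic property such schemes rely on is easy to demonstrate. The toy below uses textbook RSA (one of the baselines mentioned above), whose ciphertexts are multiplicatively homomorphic; the tiny parameters are for illustration only and offer no security.

```python
# Multiplicative homomorphism of textbook RSA: multiplying ciphertexts
# multiplies the underlying plaintexts. Toy parameters, NOT secure.
p, q = 61, 53
n = p * q                      # 3233
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent, coprime with phi
d = pow(e, -1, phi)            # private exponent (modular inverse)

enc = lambda m: pow(m, e, n)
dec = lambda c: pow(c, d, n)

m1, m2 = 12, 7
c1, c2 = enc(m1), enc(m2)
product_cipher = (c1 * c2) % n          # computed on encrypted data only
assert dec(product_cipher) == (m1 * m2) % n
print(dec(product_cipher))              # 84 == 12 * 7
```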
{"title":"Attribute-Based Adaptive Homomorphic Encryption for Big Data Security.","authors":"R Thenmozhi, S Shridevi, Sachi Nandan Mohanty, Vicente García-Díaz, Deepak Gupta, Prayag Tiwari, Mohammad Shorfuzzaman","doi":"10.1089/big.2021.0176","DOIUrl":"10.1089/big.2021.0176","url":null,"abstract":"<p><p>There is a drastic increase in Internet usage across the globe, thanks to mobile phone penetration. This extreme Internet usage generates huge volumes of data, in other terms, big data. Security and privacy are the main issues to be considered in big data management. Hence, in this article, Attribute-based Adaptive Homomorphic Encryption (AAHE) is developed to enhance the security of big data. In the proposed methodology, Oppositional Based Black Widow Optimization (OBWO) is introduced to select the optimal key parameters by following the AAHE method. By considering oppositional function, Black Widow Optimization (BWO) convergence analysis was enhanced. The proposed methodology has different processes, namely, process setup, encryption, and decryption processes. The researcher evaluated the proposed methodology with non-abelian rings and the homomorphism process in ciphertext format. Further, it is also utilized in improving one-way security related to the conjugacy examination issue. Afterward, homomorphic encryption is developed to secure the big data. The study considered two types of big data such as adult datasets and anonymous Microsoft web datasets to validate the proposed methodology. With the help of performance metrics such as encryption time, decryption time, key size, processing time, downloading, and uploading time, the proposed method was evaluated and compared against conventional cryptography techniques such as Rivest-Shamir-Adleman (RSA) and Elliptic Curve Cryptography (ECC). Further, the key generation process was also compared against conventional methods such as BWO, Particle Swarm Optimization (PSO), and Firefly Algorithm (FA). The results established that the proposed method is supreme than the compared methods and can be applied in real time in near future.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":"343-356"},"PeriodicalIF":2.6,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39718084","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hybrid Deep Learning Approach for Traffic Speed Prediction
Pub Date: 2024-10-01 | Epub Date: 2022-02-02 | DOI: 10.1089/big.2021.0251
Fei Dai, Pengfei Cao, Penggui Huang, Qi Mo, Bi Huang
Traffic speed prediction plays a fundamental role in traffic management and driving route planning. However, timely and accurate traffic speed prediction is challenging because it is affected by complex spatial and temporal correlations, and most existing works cannot model both simultaneously, resulting in unsatisfactory prediction performance. In this article, we propose a novel hybrid deep learning approach, named HDL4TSP, to predict traffic speed in each region of a city; it consists of an input layer, a spatial layer, a temporal layer, a fusion layer, and an output layer. Specifically, the spatial layer employs graph convolutional networks to capture near and distant spatial dependencies; the temporal layer employs convolutional long short-term memory (ConvLSTM) networks to model closeness, daily periodicity, and weekly periodicity; and the fusion layer merges the outputs of the ConvLSTM networks. Finally, extensive experiments show that HDL4TSP outperforms four baselines on two real-world datasets.
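As a sketch of the spatial building block named above, one graph convolution step over a toy four-region adjacency can be written in a few lines; the temporal ConvLSTM and fusion layers of HDL4TSP are not reproduced, and the feature sizes are arbitrary.

```python
# One graph convolution step, H' = ReLU(D^-1/2 (A+I) D^-1/2 H W),
# over a toy 4-region adjacency (spatial layer only, not full HDL4TSP).
import torch
import torch.nn as nn

A = torch.tensor([[0., 1., 0., 1.],       # which regions are adjacent
                  [1., 0., 1., 0.],
                  [0., 1., 0., 1.],
                  [1., 0., 1., 0.]])
A_hat = A + torch.eye(4)                   # add self-loops
d_inv_sqrt = A_hat.sum(1).pow(-0.5)
A_norm = d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]

H = torch.randn(4, 8)                      # 8 speed features per region
W = nn.Linear(8, 16, bias=False)           # learnable weights

H_next = torch.relu(A_norm @ W(H))         # one propagation step
print(H_next.shape)                        # torch.Size([4, 16])
```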
{"title":"Hybrid Deep Learning Approach for Traffic Speed Prediction.","authors":"Fei Dai, Pengfei Cao, Penggui Huang, Qi Mo, Bi Huang","doi":"10.1089/big.2021.0251","DOIUrl":"10.1089/big.2021.0251","url":null,"abstract":"<p><p>Traffic speed prediction plays a fundamental role in traffic management and driving route planning. However, timely accurate traffic speed prediction is challenging as it is affected by complex spatial and temporal correlations. Most existing works cannot simultaneously model spatial and temporal correlations in traffic data, resulting in unsatisfactory prediction performance. In this article, we propose a novel hybrid deep learning approach, named HDL4TSP, to predict traffic speed in each region of a city, which consists of an input layer, a spatial layer, a temporal layer, a fusion layer, and an output layer. Specifically, first, the spatial layer employs graph convolutional networks to capture spatial near dependencies and spatial distant dependencies in the spatial dimension. Second, the temporal layer employs convolutional long short-term memory (ConvLSTM) networks to model closeness, daily periodicity, and weekly periodicity in the temporal dimension. Third, the fusion layer designs a fusion component to merge the outputs of ConvLSTM networks. Finally, we conduct extensive experiments and experimental results to show that HDL4TSP outperforms four baselines on two real-world data sets.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":"377-389"},"PeriodicalIF":2.6,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39880866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Weighted GraphSAGE-Based Context-Aware Approach for Big Data Access Control
Pub Date: 2024-10-01 | Epub Date: 2023-08-01 | DOI: 10.1089/big.2021.0473
Dibin Shan, Xuehui Du, Wenjuan Wang, Aodi Liu, Na Wang
Context information is the key element in realizing dynamic access control of big data. However, existing context-aware access control (CAAC) methods do not support automatic context awareness and cannot automatically model and reason about context relationships. To solve these problems, this article proposes a weighted GraphSAGE-based context-aware approach for big data access control. First, graph modeling is performed on the access-record dataset, transforming the access control context-awareness problem into a graph neural network (GNN) node learning problem. Then, a GNN model, WGraphSAGE, is proposed to achieve automatic context awareness and automatic generation of CAAC rules. Finally, weighted neighbor sampling and weighted aggregation algorithms are designed for the model to achieve automatic modeling and reasoning over node relationships and relationship strengths simultaneously during graph node learning. The experimental results show that the proposed method has clear advantages in context awareness and context relationship reasoning compared with similar GNN models, and it achieves better dynamic access control decisions than existing CAAC models.
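The abstract describes weighted neighbor sampling and weighted aggregation without exact formulas. Under the simplest reading, sampling neighbors with probability proportional to edge weight and aggregating by a weighted mean, a sketch might look like this; the graph, weights, and features are hypothetical.

```python
# Assumed weighted sampling/aggregation: neighbors drawn with probability
# proportional to edge weight, then combined by a weighted mean of features.
import numpy as np

rng = np.random.default_rng(0)
features = {1: np.array([1., 0.]), 2: np.array([0., 1.]),
            3: np.array([1., 1.]), 4: np.array([2., 0.])}
neighbors = {0: [(1, 0.7), (2, 0.1), (3, 0.1), (4, 0.1)]}  # (node, weight)

def sample_neighbors(node, k):
    ids, w = zip(*neighbors[node])
    p = np.array(w) / sum(w)
    return rng.choice(ids, size=k, replace=True, p=p)

def weighted_aggregate(node, sampled):
    w = dict(neighbors[node])
    weights = np.array([w[v] for v in sampled])
    feats = np.stack([features[v] for v in sampled])
    return (weights[:, None] * feats).sum(0) / weights.sum()

sampled = sample_neighbors(0, k=3)         # strong ties dominate the sample
print(sampled, weighted_aggregate(0, sampled))
```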
{"title":"A Weighted GraphSAGE-Based Context-Aware Approach for Big Data Access Control.","authors":"Dibin Shan, Xuehui Du, Wenjuan Wang, Aodi Liu, Na Wang","doi":"10.1089/big.2021.0473","DOIUrl":"10.1089/big.2021.0473","url":null,"abstract":"<p><p>Context information is the key element to realizing dynamic access control of big data. However, existing context-aware access control (CAAC) methods do not support automatic context awareness and cannot automatically model and reason about context relationships. To solve these problems, this article proposes a weighted GraphSAGE-based context-aware approach for big data access control. First, graph modeling is performed on the access record data set and transforms the access control context-awareness problem into a graph neural network (GNN) node learning problem. Then, a GNN model WGraphSAGE is proposed to achieve automatic context awareness and automatic generation of CAAC rules. Finally, weighted neighbor sampling and weighted aggregation algorithms are designed for the model to realize automatic modeling and reasoning of node relationships and relationship strengths simultaneously in the graph node learning process. The experiment results show that the proposed method has obvious advantages in context awareness and context relationship reasoning compared with similar GNN models. Meanwhile, it obtains better results in dynamic access control decisions than the existing CAAC models.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":"390-411"},"PeriodicalIF":2.6,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9922924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Special Issue: Big Scientific Data and Machine Learning in Science and Engineering
Pub Date: 2024-08-01 | Epub Date: 2024-07-31 | DOI: 10.1089/big.2024.59218.kpa
Farhad Pourkamali-Anaraki
{"title":"Special Issue: Big Scientific Data and Machine Learning in Science and Engineering.","authors":"Farhad Pourkamali-Anaraki","doi":"10.1089/big.2024.59218.kpa","DOIUrl":"10.1089/big.2024.59218.kpa","url":null,"abstract":"","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":"269"},"PeriodicalIF":2.6,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141857096","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}