Ensuring power grid resiliency, forecasting climate conditions, and optimizing transportation infrastructure are among the many application areas where data is collected in both space and time. Spatiotemporal modeling captures these patterns to forecast future trends and support critical decision-making by leveraging machine learning and deep learning. Once models are trained offline, deploying them in the field for near real-time inference can be challenging, because performance varies significantly with the environment, the available compute resources, and the tolerance for ambiguity in results. Users deploying spatiotemporal models to solve complex problems can benefit from analytical studies that consider a broad range of system adaptations to understand the associated performance-quality trade-offs.
To facilitate the co-design of next-generation hardware architectures for field deployment of trained models, it is critical to characterize the inference workloads of these deep learning (DL) applications and assess their computational patterns at different levels of the execution stack. In this paper, we develop several variants of deep learning applications that use spatiotemporal data from dynamical systems. We study the associated computational patterns of inference workloads at different levels, considering relevant models (Long Short-Term Memory, Convolutional Neural Network, and Spatio-Temporal Graph Convolutional Network), DL frameworks (TensorFlow and PyTorch), precisions (FP16, FP32, AMP, INT16, and INT8), inference runtimes (ONNX Runtime and AITemplate), post-training quantization (TensorRT), and platforms (NVIDIA DGX A100 and SambaNova SN10 RDU).
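To make the variant space concrete, the sketch below shows one representative model/precision/runtime combination of the kind studied here: a small PyTorch LSTM forecaster exported to ONNX for runtime inference, plus an FP16 variant. This is a minimal illustration under assumed shapes and hyperparameters, not the paper's exact pipeline.

```python
# Minimal sketch (illustrative, not the paper's exact workload): export a
# trained PyTorch LSTM forecaster to ONNX and run it with ONNX Runtime.
import torch
import torch.nn as nn

class LSTMForecaster(nn.Module):
    def __init__(self, n_features=8, hidden=64, horizon=1):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, horizon)

    def forward(self, x):                  # x: (batch, time, features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])    # forecast from the last time step

model = LSTMForecaster().eval()
dummy = torch.randn(32, 24, 8)             # a batch of 24-step input windows
torch.onnx.export(model, dummy, "lstm_fp32.onnx",
                  input_names=["x"], output_names=["y"])

import onnxruntime as ort
sess = ort.InferenceSession("lstm_fp32.onnx",
                            providers=["CPUExecutionProvider"])
y = sess.run(None, {"x": dummy.numpy()})[0]

# FP16 variant (requires a GPU): cast weights and inputs to half precision.
if torch.cuda.is_available():
    with torch.no_grad():
        y16 = model.half().cuda()(dummy.half().cuda())
```

Swapping the precision, runtime, or model class in such a harness is what produces the workload variants whose computational patterns are then profiled.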
Overall, our findings indicate that although mixed-precision models and post-training quantization hold promise for spatiotemporal modeling, extracting efficiency from contemporary GPU systems might be challenging. Instead, co-designing custom accelerators by leveraging optimized high-level synthesis frameworks (such as the SODA High-Level Synthesizer for customized FPGA/ASIC targets) enables workload-specific adjustments that enhance efficiency.
Vehicular ad hoc networks (VANETs) have become a key, indispensable component of future intelligent transportation systems. Security and privacy are two essential attributes for protecting the safe operation of vehicles. Over the last two decades, numerous conditional privacy-preserving authentication schemes have been proposed for the VANET environment. However, existing schemes have various limitations, including security issues, high storage overhead, and frequent interactions. To address these difficulties, this work combines physical unclonable functions (PUFs) and blockchain technology to construct a conditional privacy-preserving authentication scheme for the VANET environment. Specifically, we combine PUFs with dynamic pseudonym techniques to dynamically generate unique pseudonym IDs and PUF-derived private keys, enhancing privacy protection and resisting physical attacks. To reduce the number of communication rounds during verification, we deploy lightweight blockchain nodes that avoid direct communication between the receiver and the blockchain network. Comprehensive security analysis and proofs demonstrate the proposed scheme's resilience against various potential attacks. Furthermore, performance evaluation indicates that our scheme outperforms similar schemes, making it suitable for resource-constrained VANETs.
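The following sketch illustrates the general flavor of PUF-based dynamic pseudonym derivation, not the paper's actual construction: a fresh pseudonym ID and private key are derived from a device-bound PUF response and a per-epoch nonce. Since a PUF is a hardware primitive, it is simulated here with HMAC; all names and the derivation structure are hypothetical.

```python
# Illustrative sketch only (assumed construction, simulated PUF): derive a
# dynamic pseudonym ID and epoch private key from a PUF response and a nonce.
import hashlib
import hmac
import os

DEVICE_PUF_SECRET = os.urandom(32)   # stands in for the physical PUF circuit

def puf_response(challenge: bytes) -> bytes:
    """Simulated PUF: a device-unique challenge-response mapping."""
    return hmac.new(DEVICE_PUF_SECRET, challenge, hashlib.sha256).digest()

def new_pseudonym(epoch_nonce: bytes):
    """Derive a fresh pseudonym ID and private key for the current epoch."""
    r = puf_response(epoch_nonce)
    pid = hashlib.sha256(b"PID" + r + epoch_nonce).hexdigest()[:16]
    sk = hashlib.sha256(b"KEY" + r + epoch_nonce).digest()
    return pid, sk

nonce = os.urandom(16)               # refreshed each epoch for unlinkability
pid, sk = new_pseudonym(nonce)
```

Because the key material never leaves the device and depends on the unclonable hardware response, cloning the stored state alone does not compromise the scheme, which is the intuition behind its resistance to physical attacks.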
Computer networks support everyday human tasks, providing services such as data streaming, online shopping, and digital communications. These applications demand ever more network capacity and dynamicity to accomplish their goals. The networks may be targeted by attacks and intrusions that compromise the applications relying on them and lead to potential losses. We propose a semi-supervised, systematic methodology for developing a detection system for traffic volume anomalies in IP flow-based networks. The system is implemented with a vanilla Generative Adversarial Network (GAN). A mitigation module is triggered whenever an anomaly is detected, automatically blocking the suspect IPs and restoring normal network operation. We implement three versions of the proposed solution by incorporating a Long Short-Term Memory (LSTM) network, a 1D Convolutional Neural Network (1D-CNN), and a Temporal Convolutional Network (TCN) into the GAN's internal structure. Experiments are conducted on three public benchmark datasets: Orion, CIC-DDoS2019, and CIC-IDS2017. The results show that the three deep learning models have distinct impacts on the GAN and, consequently, on overall system performance. The 1D-CNN-based GAN implementation is the best: it reasonably mitigates the mode collapse problem, has the lowest computational complexity, and achieves competitive Matthews Correlation Coefficient scores on the anomaly detection task. Moreover, the mitigation module drops most anomalous flows while blocking only a small portion of legitimate traffic. For comparison with state-of-the-art models, we implement 1D-CNN, LSTM, and TCN detectors separately from the GAN; the generative networks show improved overall results on the considered performance metrics compared to these models.
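To illustrate the semi-supervised detection idea, the sketch below shows a 1D-CNN discriminator of the kind that could sit inside such a GAN: trained adversarially on normal traffic windows, it scores an incoming IP-flow window, and low "realness" scores flag anomalies. Layer sizes, window length, and the scoring threshold are assumptions for illustration, not the paper's configuration.

```python
# Hedged sketch (assumed architecture): a 1D-CNN GAN discriminator used as
# an anomaly scorer over windows of IP-flow features.
import torch
import torch.nn as nn

class Disc1DCNN(nn.Module):
    def __init__(self, n_features=6):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_features, 32, kernel_size=3, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv1d(32, 64, kernel_size=3, padding=1),
            nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool1d(1),
            nn.Flatten(),
            nn.Linear(64, 1),
            nn.Sigmoid(),                # estimated P(window is "normal")
        )

    def forward(self, x):                # x: (batch, features, time)
        return self.net(x)

disc = Disc1DCNN().eval()
window = torch.randn(1, 6, 30)           # one 30-step flow feature window
score = disc(window).item()
is_anomaly = score < 0.5                 # threshold tuned on validation data
```

In a full pipeline, an `is_anomaly` verdict would hand the window's source IPs to the mitigation module for blocking.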
Federated Learning (FL) has gained popularity due to its advantages over centralized learning. However, existing FL research has primarily focused on unconstrained wired networks, neglecting the challenges posed by wireless Internet of Things (IoT) environments. Successfully integrating FL into IoT networks requires tailored adaptations to address their unique constraints, especially in computation and communication. This paper introduces Communication-Aware Federated Averaging (CAFA), a novel algorithm designed to enhance FL operation in wireless IoT networks with shared communication channels. CAFA primarily leverages the latent computational capacity available during the communication phase for local training and aggregation. Through extensive and realistic evaluations in a dedicated FL-IoT framework, our method demonstrates significant advantages over state-of-the-art approaches: CAFA achieves up to a 4x reduction in communication costs and accelerates FL training by as much as 70% while preserving model accuracy. These achievements position CAFA as a promising solution for efficient FL in constrained wireless networks.
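The toy sketch below conveys the core overlap idea as we read it from the abstract; the scheduling details are our own assumptions, not the authors' implementation. Instead of idling while the shared channel is busy with an upload, a client keeps training locally in a background thread, harvesting the otherwise latent compute of the communication phase.

```python
# Conceptual sketch (assumed scheduling, not CAFA's actual implementation):
# overlap local training with the blocking communication phase.
import threading
import time

def local_training(state, stop_event):
    while not stop_event.is_set():
        state["steps"] += 1          # placeholder for one local SGD step
        time.sleep(0.01)

def upload(update, channel_busy_s=2.0):
    time.sleep(channel_busy_s)       # models contention on the shared channel

state = {"steps": 0}
stop = threading.Event()
trainer = threading.Thread(target=local_training, args=(state, stop))
trainer.start()
upload(update=dict(state))           # communication and computation overlap
stop.set()
trainer.join()
print("extra local steps completed during upload:", state["steps"])
```

The extra steps completed during the upload window are effectively free, which is one intuition for how such overlap can shorten overall training time without hurting accuracy.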