The widespread adoption of electric vehicles (EVs) underscores the urgent need for innovative approaches to estimating the state of health (SOH) of their lithium-ion batteries, which is crucial for ensuring safety and efficiency. This study introduces SOH-TEC, a transformer encoder-based model that processes raw time-series battery and vehicle-related data from a single EV trip to estimate the SOH. Unlike conventional methods that rely on battery cycling data from laboratory experiments, SOH-TEC uses real-world EV operation data, which improves its practical applicability. The model is trained and evaluated on a real-world dataset collected over nearly three years from three EVs. This dataset includes reliable SOH labels obtained through periodic constant-current full-discharge tests on a chassis dynamometer. Despite the challenges posed by noisy real-world EV data, the model achieves high accuracy, with a mean absolute error of 0.72% and a root mean square error of 1.17%. Moreover, our proposed pre-training strategies with unlabeled data, particularly SOH ordinal comparison, significantly enhance the model’s performance: training with only 50% of the labeled data achieves results nearly identical to those obtained with the full dataset. Self-attention map analysis reveals that the model primarily focuses on stationary or consistent driving periods to estimate SOH. While the study is constrained by a dataset featuring repetitive driving patterns, it highlights the significant potential of transformers for SOH estimation in EVs and offers valuable insights for future data collection and model development.
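As a reminder of how the two reported accuracy metrics relate, the following minimal sketch computes the mean absolute error and the root mean square error for a handful of SOH estimates. The SOH values are hypothetical and chosen only for illustration; they are not taken from the study's dataset, and the function names are ours.

```python
import math

def mean_absolute_error(y_true, y_pred):
    """Average magnitude of the errors, in the same unit as SOH (percent)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def root_mean_square_error(y_true, y_pred):
    """Square root of the mean squared error; weights large errors more heavily."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical SOH labels and model estimates in percent (illustration only).
soh_true = [95.0, 92.0, 90.0, 88.0]
soh_pred = [95.5, 91.5, 90.5, 85.0]

mae = mean_absolute_error(soh_true, soh_pred)
rmse = root_mean_square_error(soh_true, soh_pred)
print(f"MAE = {mae:.3f}%, RMSE = {rmse:.3f}%")
```

RMSE is never smaller than MAE, and the gap grows when a few estimates miss by a wide margin. A reported RMSE noticeably above the MAE is therefore consistent with a small number of larger errors among otherwise accurate estimates.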
Machine learning is widely recognized as a promising data-driven modeling technique for the model-based control and optimization of building energy systems. However, the generalizability of data-driven models often faces significant challenges, as the training data available from building operations usually covers only a limited range of working conditions. Active learning can proactively test unseen and informative working conditions and add the resulting data samples to the training set, improving the generalization performance of data-driven models. A novel distance- and information-density-based sampling strategy is developed that accounts for the real-time status of building operation and the outdoor environment. Based on the Mahalanobis distance, this strategy determines the sampling value of an unlabeled sample (an unseen working condition) by assessing its similarity to both the training samples and the other unlabeled samples. As collecting sufficiently representative samples can be difficult, costly, and time-consuming, a distance-based sampling cost metric is proposed to compare the efficiency of different sampling methods, considering the detrimental effects of the active sampling process on the normal operation of building energy systems. This paper presents a comprehensive and in-depth comparison of five active learning methods, including one incorporating the distance-based sampling strategy, through data experiments on data collected from the cooling towers of a real high-rise building. The results show that active learning can effectively identify informative data samples and improve the generalization performance of data-driven models. The research outcomes are valuable for enhancing AI-enabled data-driven modeling of building energy systems while substantially reducing data sampling costs.
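To make the idea of a distance- and density-based sampling value concrete, the sketch below scores unlabeled working conditions with the Mahalanobis distance. The scoring rule (novelty relative to the training set multiplied by a density term over the other unlabeled samples), the two features, and all numbers are assumptions made for illustration; the abstract does not give the paper's actual formula.

```python
import math

def covariance_2d(points):
    """Sample covariance (sxx, sxy, syy) of a list of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return sxx, sxy, syy

def mahalanobis_2d(a, b, cov):
    """Mahalanobis distance between 2-D points a and b for covariance cov."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = a[0] - b[0], a[1] - b[1]
    # Quadratic form with the closed-form inverse of the 2x2 covariance matrix.
    q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(max(q, 0.0))

def sampling_value(i, unlabeled, train, cov):
    """Hypothetical score: far from training data, close to other candidates."""
    cand = unlabeled[i]
    novelty = min(mahalanobis_2d(cand, t, cov) for t in train)
    others = [u for j, u in enumerate(unlabeled) if j != i]
    mean_dist = sum(mahalanobis_2d(cand, u, cov) for u in others) / len(others)
    density = 1.0 / (1.0 + mean_dist)
    return novelty * density

# Hypothetical (outdoor wet-bulb temperature in deg C, load ratio) conditions.
train = [(25.0, 0.40), (26.0, 0.50), (27.0, 0.45), (26.5, 0.55)]
unlabeled = [(26.2, 0.50), (31.0, 0.90), (30.5, 0.85), (31.5, 0.95)]

cov = covariance_2d(train + unlabeled)
scores = {u: sampling_value(i, unlabeled, train, cov) for i, u in enumerate(unlabeled)}
best = max(scores, key=scores.get)
print("next condition to test:", best)
```

Under this assumed rule, a candidate that sits inside the existing training cluster receives a low value, while candidates in a dense but untested region are preferred. Using the Mahalanobis distance rather than the Euclidean distance keeps features with different units and correlations on a comparable scale.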
Buildings have great energy flexibility potential for managing supply-demand imbalances in power grids with high renewable penetration. Accurate, real-time quantification of building energy flexibility is essential not only for engaging buildings in electricity and grid service markets, but also for ensuring the reliable and optimal operation of power grids. This paper proposes a probabilistic model for rapidly quantifying the aggregated flexibility of buildings under uncertainties. An explicit equation is derived as the analytical solution of a commonly used second-order building thermodynamic model to quantify the flexibility of individual buildings, eliminating the need for time-consuming iterative and finite-difference computations. A sampling-based uncertainty analysis is performed to obtain the distribution of aggregated building flexibility, comprehensively considering the major uncertainties. Validation tests are conducted using 150 commercial buildings in Hong Kong. The results show that the proposed model not only quantifies the aggregated flexibility with high accuracy, but also dramatically reduces the computation time from 3605 s to 6.7 s, about 537 times faster than the existing probabilistic model solved numerically. Moreover, the proposed model is 8 times faster than the archetype-based model and achieves significantly higher accuracy.
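The benefit of an explicit solution over finite-difference time stepping can be illustrated on a deliberately simplified case. The sketch below uses a first-order RC thermal model rather than the second-order model of the paper, and every parameter value is hypothetical. It compares the closed-form indoor temperature with an explicit Euler integration, and also evaluates in closed form how long cooling can stay off before a comfort limit is reached, which is one possible proxy for flexibility and not necessarily the paper's definition.

```python
import math

def temp_closed_form(t, T0, T_out, R, C, Q):
    """Exact solution of dT/dt = (T_out - T)/(R*C) - Q/C at time t (seconds)."""
    T_ss = T_out - R * Q               # steady-state indoor temperature
    return T_ss + (T0 - T_ss) * math.exp(-t / (R * C))

def temp_euler(t, T0, T_out, R, C, Q, dt=1.0):
    """Same model integrated with explicit Euler steps of length dt."""
    T = T0
    for _ in range(int(round(t / dt))):
        T += dt * ((T_out - T) / (R * C) - Q / C)
    return T

def time_to_limit(T_limit, T0, T_out, R, C, Q):
    """Closed-form time at which the indoor temperature reaches T_limit."""
    T_ss = T_out - R * Q
    return -R * C * math.log((T_limit - T_ss) / (T0 - T_ss))

# Hypothetical building: R in K/W, C in J/K, temperatures in deg C.
R, C = 0.002, 5.0e6                    # time constant R*C = 10000 s
T0, T_out, Q_off = 24.0, 32.0, 0.0     # cooling switched off (Q = 0 W)

exact = temp_closed_form(1800.0, T0, T_out, R, C, Q_off)
stepped = temp_euler(1800.0, T0, T_out, R, C, Q_off)
t_limit = time_to_limit(26.0, T0, T_out, R, C, Q_off)
print(f"after 30 min: exact {exact:.3f} C, Euler {stepped:.3f} C")
print(f"time until 26 C is reached: {t_limit:.0f} s")
```

The closed-form evaluation costs a single exponential per building, whereas the stepped version needs one update per time step. That difference becomes significant when the calculation is repeated for many buildings and many uncertainty samples, which is the setting the abstract describes.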