Rising energy usage in wastewater treatment plants (WWTPs) poses pressing economic and environmental challenges. Machine learning approaches to modeling these complex systems have been limited by highly non-linear process dynamics and noisy data. To address this, we introduce a novel Knowledge-Enhanced Graph Disentanglement framework for Energy Consumption Prediction (KEGD-EC) that leverages causal inference and graph neural networks. The framework combines domain knowledge of causal relationships with a disentangled graph convolutional network architecture to improve prediction accuracy. In a case study on a WWTP in Melbourne, KEGD-EC reduces the root mean squared error of energy consumption prediction by 59.7% compared to the next best model. We also show that causal models built using domain knowledge outperform data-driven causal discovery models for complex systems. These results mark a step forward in applying machine learning to complex manufacturing processes, and the integration of causal knowledge into deep learning architectures offers a promising area of research for predictive analytics in manufacturing.