Due to complex temporal dependencies and various external factors, it is challenging to accurately capture the nonlinear and non-stationary trends of metro passenger flow. In addition, traffic sensor records inevitably contain errors, including bias and noise. However, most recent works treat the recorded data as exact input, ignoring the effect of these unknown errors. In this research, a novel framework that integrates the Ensemble Empirical Mode Decomposition (EEMD) method with the Bidirectional Gated Recurrent Unit (BiGRU) model is proposed to eliminate noise and improve short-term prediction. The proposed model is divided into three stages. First, in the preprocessing stage, the EEMD algorithm adaptively decomposes the nonlinear and non-stationary passenger flow signal into several sub-signals, which exhibit simpler fluctuation trends and higher correlation coefficients. Second, in the feature recognition and extraction stage, transportation domain knowledge and statistical theory are applied to analyze the decomposed components and extract the critical ones. Finally, in the prediction stage, the stacked BiGRU learns and extracts information from the input features in both directions and uses multi-step prediction to output the final result. Experiments are conducted on a real-world dataset from the Chengdu metro system. The experimental results show that the proposed EEMD-BiGRU model outperforms all benchmark models. The Root Mean Square Error (RMSE) of the proposed model is reduced by up to 28.29% compared to a single GRU model without EEMD preprocessing. The experiments also demonstrate the effectiveness and robustness of the proposed method for predicting short-term passenger flow in metro systems.
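As a reading aid for the reported metric, the following minimal sketch shows how RMSE and a relative RMSE reduction between a baseline and an improved model can be computed. It assumes the reduction is measured relative to the baseline's RMSE, which the text above does not state explicitly. The function names and the toy values are hypothetical and are not taken from the Chengdu dataset or from the reported results.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error between two equal-length, non-empty sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_reduction(baseline_rmse, model_rmse):
    """Percentage by which model_rmse is lower than baseline_rmse.

    Assumption: the reduction is expressed relative to the baseline value.
    """
    if baseline_rmse <= 0:
        raise ValueError("baseline RMSE must be positive")
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse


# Toy passenger counts, purely illustrative (not real data).
actual = [100.0, 120.0, 90.0, 110.0]
baseline_pred = [110.0, 110.0, 100.0, 100.0]  # hypothetical baseline forecast
model_pred = [105.0, 115.0, 95.0, 105.0]      # hypothetical improved forecast

baseline_rmse = rmse(actual, baseline_pred)
model_rmse = rmse(actual, model_pred)
reduction = relative_reduction(baseline_rmse, model_rmse)
print(f"baseline RMSE={baseline_rmse:.2f}, model RMSE={model_rmse:.2f}, "
      f"reduction={reduction:.2f}%")
```

Under this convention, a reduction of 28.29% would mean the proposed model's RMSE is about 0.7171 times that of the baseline.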