Accurate short-term passenger flow prediction in metro systems is essential for effective operation planning and passenger guidance. However, existing studies use origin-destination (OD) information only as a reference for the topological connections between stations and neglect the real-time distribution trend of passenger flow, which limits their ability to account for the impact of emergencies. Previous research has also paid insufficient attention to the bidirectional nature of passenger flow, so its features are captured only partially. To address these gaps, we propose a Hybrid Spatiotemporal Extraction and Multi-Feature Fusion (HSTE_MFF) model that considers both the bidirectional flow characteristics of metro passengers and their real-time distribution trends. First, stations are treated as the time units (sequence steps) of a bidirectional long short-term memory (Bi-LSTM) neural network, which aggregates passenger flow information along the line in both directions. Then, a hybrid network framework comprising Bi-LSTM, an improved residual structure (ResGAC), and LSTM is designed to extract the spatiotemporal features of passenger flow. In addition, a multi-feature fusion algorithm is introduced to exploit the passenger flow distribution characteristics implicit in OD data. Unlike previous algorithms, our approach uses historical inflow and outflow data to assess station passenger flow levels and uses the distribution trend of passenger flow at each origin as a weight for predicting the destinations of future outbound passenger flow. An empirical test on real data from the Qingdao metro system validated the model’s effectiveness: compared with various baseline models, HSTE_MFF significantly reduces prediction errors across different time intervals. Furthermore, a portability test on data from the Hangzhou metro system confirmed that the model remains effective in metro networks with different structural characteristics.
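The idea of using the origin-side distribution trend as a weight for outbound flow can be illustrated with a minimal sketch. This is not the HSTE_MFF fusion algorithm itself, whose details are not given here. It only assumes that each origin's recent OD counts are normalised into destination shares and that current inflow is spread over destinations according to those shares, ignoring travel time. The function names `od_distribution_weights` and `expected_outflow` and all numbers are hypothetical.

```python
def od_distribution_weights(od_counts):
    """Row-normalise OD counts: entry [i][j] becomes the share of
    passengers entering at station i who exit at station j."""
    weights = []
    for row in od_counts:
        total = sum(row)
        if total == 0:
            # No observed trips from this origin: assign zero weight.
            weights.append([0.0] * len(row))
        else:
            weights.append([count / total for count in row])
    return weights


def expected_outflow(inflow, weights):
    """Spread each origin's inflow over destinations by its weights
    (assumption: travel time between stations is ignored)."""
    outflow = [0.0] * len(weights[0])
    for origin_inflow, row in zip(inflow, weights):
        for dest, share in enumerate(row):
            outflow[dest] += origin_inflow * share
    return outflow


# Hypothetical 3-station example: recent OD counts and current inflow.
od_counts = [
    [0, 30, 10],
    [20, 0, 20],
    [5, 15, 0],
]
inflow = [100, 50, 40]

weights = od_distribution_weights(od_counts)
outflow = expected_outflow(inflow, weights)
print(outflow)
```

Because each non-empty row of weights sums to one, the estimated outflow conserves the total inflow. A learned model would replace this fixed proportional rule with weights that adapt to the real-time distribution trend.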