Bird's Eye View (BEV) perception has become a widely adopted approach in 3D object detection due to its spatial and dimensional consistency. However, the increasing complexity of neural network architectures has driven up training memory consumption, limiting the scalability of model training. To address these challenges, we propose RevFB-BEV, a novel model based on a Reversible Swin Transformer (RevSwin) with Forward-Backward View Transformation (FBVT) and LiDAR-Guided Back Projection (LGBP). The RevSwin backbone employs a reversible architecture that reduces training memory by recomputing intermediate activations instead of storing them. The FBVT module refines the BEV features obtained by forward projection, yielding denser and more precise camera BEV representations, and the LGBP module uses LiDAR BEV features to guide back projection toward more accurate camera BEV features. Extensive experiments on the nuScenes dataset demonstrate notable performance improvements, with our model achieving over a
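The memory saving comes from the standard reversible-coupling idea: a block's inputs can be reconstructed exactly from its outputs, so intermediate activations need not be cached for the backward pass. A minimal sketch of this mechanism is below; `f` and `g` are illustrative stand-ins for the attention and MLP sub-blocks (not the paper's actual RevSwin implementation):

```python
import math

def f(x):
    # Stand-in for one sub-block (e.g. attention); any function works,
    # since reversibility comes from the coupling, not from f itself.
    return math.tanh(x)

def g(x):
    # Stand-in for the other sub-block (e.g. MLP).
    return 0.5 * x

def forward(x1, x2):
    # Additive coupling: the outputs (y1, y2) fully determine (x1, x2).
    y1 = x1 + f(x2)
    y2 = x2 + g(y1)
    return y1, y2

def invert(y1, y2):
    # Recompute the inputs from the outputs during backpropagation,
    # trading extra compute for lower activation memory.
    x2 = y2 - g(y1)
    x1 = y1 - f(x2)
    return x1, x2

x1, x2 = 0.7, -1.3
y1, y2 = forward(x1, x2)
rx1, rx2 = invert(y1, y2)
assert abs(rx1 - x1) < 1e-12 and abs(rx2 - x2) < 1e-12
```

Because inversion is exact (subtraction undoes addition), recomputation introduces no approximation error, only additional forward evaluations of the sub-blocks.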