Wind farm wake interactions are critical determinants of overall power generation efficiency. To address these challenges, coordinated yaw control of turbines has emerged as a highly effective strategy. While conventional approaches have been widely adopted, the application of contemporary machine learning techniques, specifically reinforcement learning (RL), holds great promise for improving wind farm control performance. Given the scarcity of comparative analyses of yaw control approaches, this study implements and evaluates classical greedy, optimization-based, and RL policies for multiple in-line wind turbines under various wind scenarios using an experimentally validated analytical wake model. The results show that RL clearly outperforms greedy control, particularly below rated wind speeds, as RL optimizes yaw trajectories to maximize total power capture. Furthermore, the RL policy is not hampered by the accumulation of iterative modeling errors, yielding higher cumulative power generation than the optimization-based scheme over the control process. At lower wind speeds (5 m/s), it achieves a 32.63% improvement over the optimized strategy; as the wind speed increases, the advantage of RL control gradually diminishes. Consequently, the model-free adaptability of RL substantially improves robustness across changing wind scenarios, facilitating seamless transitions between wake steering and alignment in response to evolving wake physics. This analysis underscores the advantages of data-driven RL for wind farm yaw control compared with traditional methods. Its adaptive nature enables optimization of total power production across diverse operating regimes without requiring an explicit model representation.
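To make the compared baselines concrete, the sketch below is a minimal toy illustration, not the study's implementation: it assumes a Gaussian wake deficit, a crude linear yaw-to-deflection relation, and a cos^p yaw-power loss for two in-line turbines. All parameter values are illustrative placeholders. A grid search over the upstream yaw angle stands in for the optimization-based baseline, zero yaw represents greedy control, and an RL agent would instead learn the yaw trajectory by interacting with such a wake model as its environment.

```python
"""Toy wake-steering illustration for two in-line turbines (illustrative only)."""
import numpy as np

# --- assumed toy parameters (placeholders, not values from the study) ---
U_INF = 8.0          # free-stream wind speed [m/s]
D = 126.0            # rotor diameter [m]
SPACING = 5.0 * D    # downstream spacing of the second turbine [m]
SIGMA = 0.5 * D      # Gaussian wake width at the downstream rotor [m]
DEFICIT0 = 0.45      # peak velocity-deficit fraction at that distance
K_DEFLECT = 0.3      # crude linear deflection gain [lateral m per downstream m per rad]
P_EXP = 3.0          # yaw-power loss exponent cos^p (commonly quoted between ~1.8 and 3)


def turbine_power(u):
    """Power proportional to the cube of the rotor-averaged wind speed (arbitrary units)."""
    return u ** 3


def farm_power(yaw_rad):
    """Total power of two in-line turbines for a given upstream yaw angle."""
    # The upstream turbine loses power when yawed away from the wind.
    p_up = turbine_power(U_INF) * np.cos(yaw_rad) ** P_EXP
    # The wake centre deflects laterally, reducing the deficit seen downstream.
    offset = K_DEFLECT * yaw_rad * SPACING
    deficit = DEFICIT0 * np.exp(-0.5 * (offset / SIGMA) ** 2)
    p_down = turbine_power(U_INF * (1.0 - deficit))
    return p_up + p_down


if __name__ == "__main__":
    greedy = farm_power(0.0)                        # greedy: both turbines face the wind
    yaws = np.deg2rad(np.linspace(0.0, 30.0, 301))  # grid search over upstream yaw angles
    powers = np.array([farm_power(y) for y in yaws])
    best = yaws[np.argmax(powers)]
    print(f"greedy farm power  : {greedy:10.1f}")
    print(f"steered farm power : {powers.max():10.1f} at yaw = {np.degrees(best):.1f} deg")
    print(f"relative gain      : {100 * (powers.max() / greedy - 1):.1f} %")
```

Under these assumed parameters the script reports the power gain of wake steering over greedy alignment; the actual magnitude of the gain depends entirely on the wake model and wind conditions, as the abstract's wind-speed-dependent results indicate.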