Leadership scholars increasingly acknowledge the shortcomings of questionnaires. Consequently, there is a trend towards more behavior-based research, with interaction coding as one promising method. By precisely analyzing recordings of leader–follower interactions, interaction coding helps quantify the verbal and non-verbal behavioral patterns that unfold between leaders and their followers, thereby providing access to the behavioral dynamics at the core of leadership. Yet, analyzing leader–follower interactions is much less straightforward than it might appear. Bold claims such as “objective data” and “actual behavior,” which are frequently made in such studies, tend to paint a somewhat distorted picture of the opportunities and challenges associated with interaction coding. To synthesize the existing empirical knowledge concerning the use of interaction coding in leadership research, we present the findings of a critical review of the current research landscape. This review highlights that questions related to observer inference, standards for observer agreement, and the validity of interaction coding are often not sufficiently addressed in empirical work. Drawing on these findings, we identify questionable research practices and juxtapose them with best-practice recommendations. Finally, we discuss how behavior-based methods can move the leadership field forward by facilitating theoretical advancements and yielding actionable guidance for practitioners.