Although unmanned aerial vehicle (UAV) remote sensing is widely used for high-throughput crop monitoring, few attempts have been made to assess nitrogen content (NC) at the organ level and its association with nitrogen use efficiency (NUE). In addition, little is known about how UAV-based image texture features from different spectral bands perform in monitoring crop nitrogen and NUE. In this study, multispectral images were collected at different growth stages of winter wheat in two independent field trials: a single-variety trial and a multi-variety trial, conducted in 2021 and 2022 in China and Germany, respectively. Forty-three multispectral vegetation indices (VIs) and forty texture features (TFs) were calculated from the images and used as inputs to partial least squares regression (PLSR) and random forest (RF) regression models to predict nitrogen-related indicators. Our main objectives were to (1) assess the potential of UAV-based multispectral imagery for predicting NC in different organs of winter wheat, (2) explore the transferability of different image features (VIs and TFs) and trained machine learning models in predicting NC, and (3) propose a technical workflow for mapping NUE using UAV imagery. The results showed that the correlations between the image features (VIs and TFs) and NC in different organs varied between the pre-anthesis and post-anthesis stages. PLSR latent variables extracted from the VIs and TFs could serve as effective predictors of nitrogen agronomic efficiency (NAE). Adding TFs to VI-based models improved NC prediction, but results were inconsistent when TF-based models trained on one dataset were applied to the other, independent dataset, which involved different varieties, UAVs, and cameras. Unsurprisingly, models trained with the multi-variety dataset showed better transferability than those trained with the single-variety dataset.
This study not only demonstrates the promise of applying UAV-based imaging to estimate NC in different organs and map NUE in winter wheat but also highlights the importance of conducting model evaluations based on independent datasets.
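
As a rough illustration of the two feature families named above, the following minimal Python sketch computes one vegetation index and one texture feature from toy data. It is not the study's pipeline: the abstract does not list which forty-three VIs or forty TFs were used, so NDVI and gray-level co-occurrence matrix (GLCM) contrast are assumed here only as familiar examples, and the function names `ndvi` and `glcm_contrast` are hypothetical.

```python
from collections import Counter


def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance.

    NDVI is an assumed example of a VI; the source does not name its indices.
    """
    total = nir + red
    return (nir - red) / total if total else 0.0


def glcm_contrast(gray, dx=1, dy=0):
    """Contrast of a gray-level co-occurrence matrix for one pixel offset.

    `gray` is a 2D list of quantized gray levels from a single spectral band.
    Contrast is the co-occurrence-weighted mean of squared level differences.
    """
    rows, cols = len(gray), len(gray[0])
    pairs = Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                # Count how often level gray[r][c] neighbors level gray[r2][c2].
                pairs[(gray[r][c], gray[r2][c2])] += 1
    n = sum(pairs.values())
    if n == 0:
        return 0.0
    return sum(count * (i - j) ** 2 for (i, j), count in pairs.items()) / n


# Toy plot-level reflectance values and a toy 2x2 band patch (made-up numbers).
print(ndvi(0.5, 0.1))
print(glcm_contrast([[0, 1], [0, 1]]))
```

In a workflow of the kind described, features like these would be computed per plot and per band, then passed to a regression model; evaluating that model on the second, independent trial is what tests transferability.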