Tool wear monitoring (TWM) is essential for improving the machining accuracy of intelligent manufacturing systems and for ensuring the consistency and reliability of products. Complex and dynamic processing environments place high demands on the real-time performance and generalization ability of TWM. Traditional data-driven models are trained without guidance from the underlying physical process and are limited by the number of samples with wear labels. To guide the model toward the underlying physical mechanism and improve its compliance with the law of tool wear, this paper proposes a dual-knowledge-embedded hybrid model for TWM based on augmented data and an improved loss function. A second source of training data is obtained by constructing the mapping relationship between cutting force and tool wear, which supplements the physical information carried by the data and addresses the shortage of labeled data in practical network training. Subsequently, a structure integrating serial convolution, parallel convolution, a bidirectional gated recurrent unit (BiGRU), and an attention mechanism is developed to extract the spatial and temporal features of the time series data. Moreover, based on the physical law of tool wear, an improved loss function with physical constraints is proposed to improve the physical consistency of the model. The experimental results show that data augmentation reduces the prediction RMSE by 12.67% compared with a single data source, and that the improved loss function reduces the prediction RMSE by up to 25.16%. The model achieves high prediction accuracy within a small number of training epochs and has good real-time performance. The proposed approach provides a modeling strategy with low computational resource requirements based on the fusion of physical and data information.
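To illustrate the idea of a loss function with physical constraints, the following is a minimal sketch rather than the loss used in this work. It assumes, purely for illustration, that the physical law is the irreversibility of wear, so that predicted wear should not decrease between consecutive cuts. The function name `physics_constrained_loss`, the penalty form, and the `weight` parameter are hypothetical and do not come from the abstract.

```python
def physics_constrained_loss(pred, target, weight=0.1):
    """Mean squared error plus a penalty on decreasing wear predictions.

    Assumption (not from the paper): tool wear is non-decreasing over
    successive cuts, so any drop between consecutive predictions is
    treated as a violation of the physical law.
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and equal in length")

    # Data term: ordinary mean squared error against the wear labels.
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

    # Physics term: size of each drop between consecutive predictions
    # (zero wherever the predicted wear stays flat or increases).
    drops = [max(0.0, prev - cur) for prev, cur in zip(pred, pred[1:])]
    penalty = sum(d ** 2 for d in drops) / len(drops) if drops else 0.0

    # 'weight' is a hypothetical hyperparameter balancing the two terms.
    return mse + weight * penalty


# Usage: the second sequence fits its labels exactly but contains a drop
# from 2.0 to 1.5, so only the physics term contributes to its loss.
consistent = physics_constrained_loss([0.1, 0.2, 0.3], [0.1, 0.2, 0.3])
violating = physics_constrained_loss([1.0, 2.0, 1.5, 3.0], [1.0, 2.0, 1.5, 3.0])
```

In this sketch, the penalty vanishes for any monotonically non-decreasing prediction sequence, so the constraint only influences training when the model output contradicts the assumed wear law.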