This paper presents an intelligent lithology identification method based on the feature fusion of single polarized and orthogonal polarized rock images. Traditional thin-section identification relies heavily on manual expertise, which makes the results subjective and demands significant time and labor. To overcome these limitations, we establish a microscopic feature fusion model using a convolutional neural network (CNN) that exploits the complementary information in single polarized and orthogonal polarized features. The model extracts features from microscopic rock images with convolutional kernels and integrates multi-feature information at both the input level and the feature level, which improves classification accuracy and provides a more efficient and objective solution for lithology identification. Identification performance is evaluated using accuracy (Acc), precision (P), recall (R), F1-score, and a confusion matrix. The fusion model achieved a maximum accuracy of 98.66% on the testing set, a 4.91% improvement over using single polarized images alone and a 1.55% improvement over using orthogonal polarized images alone. Combining deep learning models with microscopic image analysis enables researchers and non-geologists to automate the identification and classification of large rock sample datasets efficiently. The proposed method is particularly useful for samples with complex mineral compositions and similar structures, where it provides more reliable and accurate analytical results.
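As an illustration of the evaluation metrics named above, the following minimal sketch derives accuracy, precision, recall, and F1-score from a multi-class confusion matrix using their standard definitions. The function name `per_class_metrics`, the three-class example matrix, and the row/column convention (rows are true classes, columns are predicted classes) are assumptions made for illustration only; they are not taken from the paper, which does not state its averaging convention or report these counts here.

```python
def per_class_metrics(confusion):
    """Compute overall accuracy and per-class precision, recall, and F1.

    Assumed convention: confusion[i][j] is the number of samples whose
    true class is i and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    accuracy = correct / total if total else 0.0

    metrics = []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    return accuracy, metrics


if __name__ == "__main__":
    # Hypothetical counts for three rock classes; not data from the paper.
    example = [
        [48, 2, 0],
        [1, 47, 2],
        [0, 3, 47],
    ]
    acc, per_class = per_class_metrics(example)
    print(f"accuracy = {acc:.4f}")
    for label, m in enumerate(per_class):
        print(label, {name: round(value, 4) for name, value in m.items()})
```

Per-class values are returned separately so that a reader can apply whichever averaging scheme (for example, macro or weighted averaging) matches the study being reproduced.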