Anomaly detection from machine sounds plays a crucial role in industrial automation thanks to its flexibility and real-time response capabilities. In real-world scenarios, however, machine anomalies occur infrequently, making it difficult to collect anomalous sound data under diverse operating conditions. Moreover, variations in operating conditions and environmental noise introduce distribution differences into the collected sound data, leading to domain shift. To address these problems, we propose an unsupervised multi-scale dual-decoder autoencoder (MS-D2AE) network for anomalous sound detection. The MS-D2AE model consists of residual layers, an encoder, and two decoders. It fuses fine-grained information in the sound features through the multi-scale feature fusion module (MTSFFM), enabling the model to learn feature representations effectively at multiple scales. A residual layer composed of a single MTSFFM connects the encoder's input directly to the intermediate results, further enhancing information transmission. In addition to the reconstruction error, the dual-decoder structure computes a similarity error between the outputs of the two decoders, encouraging the model to reconstruct the feature data more accurately during training and thus to learn a more comprehensive representation of normal data. Furthermore, to mitigate the impact of domain shift on model performance, we design a feature-domain mixing method that blends sound features from the source and target domains, enhancing the diversity and generalization of the sound features. Finally, we verify the effectiveness of the proposed method on the DCASE 2023 Challenge Task 2 and DCASE 2022 Challenge Task 2 datasets.
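The abstract describes the training objective only at a high level. As a minimal NumPy sketch, the combined loss and the feature-domain mixing might look as follows; the use of mean squared error for both the reconstruction and similarity terms, the weighting factor `alpha`, and the mixup-style linear blend `lam` are all assumptions not stated in the text:

```python
import numpy as np

def reconstruction_error(x, x_hat):
    # Mean squared error between input features and one decoder's reconstruction
    # (MSE is an assumption; the paper may use a different distance).
    return np.mean((x - x_hat) ** 2)

def similarity_error(y1, y2):
    # Discrepancy between the two decoders' outputs; penalizing disagreement
    # encourages both decoders to converge on the same accurate reconstruction.
    return np.mean((y1 - y2) ** 2)

def ms_d2ae_loss(x, y1, y2, alpha=1.0):
    # Combined objective: both decoders reconstruct x, plus the inter-decoder
    # similarity term. The weight `alpha` is a hypothetical hyperparameter.
    return (reconstruction_error(x, y1)
            + reconstruction_error(x, y2)
            + alpha * similarity_error(y1, y2))

def mix_domains(x_src, x_tgt, lam=0.5):
    # Feature-domain mixing: a linear blend of source- and target-domain
    # features (mixup-style interpolation is an assumed implementation).
    return lam * x_src + (1.0 - lam) * x_tgt
```

At inference time, the anomaly score for a clip would typically be its reconstruction error: normal sounds, seen during training, reconstruct well, while anomalies do not.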