Personalized image generation, a key application of diffusion models, is central to advances in computer vision, artistic creation, and content-generation technologies. However, existing diffusion models fine-tuned with Low-Rank Adaptation (LoRA) face multiple challenges when learning novel concepts: language drift degrades the quality of new concepts generated in novel contexts; entanglement between object features and other elements of the reference images causes misalignment between the learning target and its unique identifier; and traditional LoRA approaches can learn only one concept at a time. To address these issues, this study proposes a novel hierarchical learning strategy and an enhanced LoRA module. Specifically, we incorporate the GeLU activation function into the LoRA architecture as a nonlinear transformation to effectively mitigate language drift. Furthermore, a gated hierarchical learning mechanism is designed to achieve inter-concept disentanglement, enabling a single LoRA module to learn multiple concepts concurrently. Experimental results across multiple random seeds demonstrate that our approach achieves a 4%–6% improvement in memory-retention metrics and outperforms state-of-the-art methods in object fidelity and style similarity by approximately 12.5% and 10%, respectively. Beyond superior generation quality, our method is computationally efficient, requiring significantly fewer trainable parameters (45M) than existing baselines. While preserving the critical features of target objects and the model's original capabilities, our method enables image generation across diverse scenes and new styles.
For scenarios that require learning multiple concepts simultaneously, this study not only offers a novel solution to the multi-concept learning problem in personalized diffusion model training but also lays a technical foundation for high-quality customized AI image generation and diverse visual content creation. The source code is publicly available at https://github.com/ydniuyongjie/HierLoRA/tree/main.
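To make the enhanced LoRA module concrete, the following is a minimal NumPy sketch of a low-rank adapter with a GeLU nonlinearity inserted between the down- and up-projections, as the abstract describes. The matrix names (`W`, `A`, `B`), shapes, and zero-initialization of the up-projection are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

def gelu(x):
    # tanh approximation of the GeLU activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def lora_forward(x, W, A, B, scale=1.0):
    """Frozen base weight W plus a low-rank update with a GeLU
    nonlinearity between the down-projection A and up-projection B."""
    base = x @ W.T                   # frozen pretrained path
    low_rank = gelu(x @ A.T) @ B.T   # nonlinear low-rank path
    return base + scale * low_rank

rng = np.random.default_rng(0)
d_in, d_out, rank = 64, 64, 4
x = rng.standard_normal((2, d_in))
W = rng.standard_normal((d_out, d_in))      # frozen base weight
A = 0.01 * rng.standard_normal((rank, d_in))  # trainable down-projection
B = np.zeros((d_out, rank))                   # trainable up-projection, zero-init

y = lora_forward(x, W, A, B)
# with B zero-initialized, the adapter initially leaves the base model unchanged
assert np.allclose(y, x @ W.T)
```

With `B` initialized to zero, training starts from the unmodified pretrained model, which is the standard LoRA convention; the nonlinearity only shapes the update learned on top of it.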
