Parameter-efficient fine-tuning (PEFT) reduces the compute and memory demands of adapting large language models, yet standard low-rank adapters (e.g., LoRA) can lag full fine-tuning in performance and stability because they restrict updates to a fixed rank-r subspace. We propose Matrix-Transformation based Low-Rank Adaptation (MTLoRA), a brain-inspired extension that inserts a learnable r × r transformation T into the low-rank update (ΔW = BTA). By endowing the subspace with data-adapted geometry (e.g., rotations, scalings, and shears), MTLoRA reparameterizes the rank-r hypothesis class, improving its conditioning and inductive bias at negligible O(r²) overhead, and recovers LoRA when T = I. We instantiate four structures for T (SHIM, ICFM, CTCM, and DTSM), which provide complementary inductive biases (change of basis, PSD metric, staged mixing, dual superposition). An optimization analysis shows that T acts as a learned preconditioner within the subspace, yielding spectral-norm step-size bounds and operator-norm variance contraction that stabilize training. Empirically, MTLoRA delivers consistent gains while preserving PEFT efficiency: on GLUE (General Language Understanding Evaluation) with DeBERTaV3-base, MTLoRA improves the average over LoRA by +2.0 points (86.9 → 88.9) and matches AdaLoRA (88.9) without any pruning schedule; on natural language generation with GPT-2 Medium, it raises BLEU on DART by +0.95 and on WebNLG by +0.56; and in multimodal instruction tuning with LLaVA-1.5-7B, DTSM attains the best average (69.91) with ∼4.7% trainable parameters, outperforming full fine-tuning and strong PEFT baselines. These results indicate that learning geometry inside the low-rank subspace improves both effectiveness and stability, making MTLoRA a practical, plug-compatible alternative to LoRA for large-model fine-tuning.
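As a rough illustration of the idea, the sketch below assumes the update takes the form ΔW = B T A, with B of size d_out × r, A of size r × d_in, and an unconstrained r × r matrix T. The names `mtlora_delta`, `B`, `A`, `d_out`, and `d_in` are hypothetical and chosen for this example only; the four structured variants of T are not reproduced here. Under these assumptions it checks that T = I gives back the plain LoRA update, that the update stays within rank r, and that T adds only r² parameters.

```python
import numpy as np

def mtlora_delta(B, T, A):
    """Assumed form of the update: delta_W = B @ T @ A, with T of shape (r, r)."""
    return B @ T @ A

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 48, 4  # illustrative sizes, not taken from the paper

B = rng.standard_normal((d_out, r))  # LoRA-style up-projection factor
A = rng.standard_normal((r, d_in))   # LoRA-style down-projection factor

# With T = I the update reduces to the ordinary LoRA product B @ A.
lora_delta = B @ A
identity_delta = mtlora_delta(B, np.eye(r), A)
recovers_lora = np.allclose(identity_delta, lora_delta)

# A general dense T reshapes the geometry inside the subspace,
# but the update still has rank at most r.
T = rng.standard_normal((r, r))
delta = mtlora_delta(B, T, A)
rank = np.linalg.matrix_rank(delta, tol=1e-8)

# Parameter accounting: LoRA trains r * (d_in + d_out) entries,
# and a dense T adds r * r more.
lora_params = r * (d_in + d_out)
extra_params = r * r

print(recovers_lora, delta.shape, rank, lora_params, extra_params)
```

In this toy setting the dense T contributes 16 parameters on top of LoRA's 448, which is the O(r²) overhead the abstract refers to.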
