Why LoRA works
The paper Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning [8] by Aghajanyan et al. found that the intrinsic dimension of pre-trained representations is much lower than expected.
The intrinsic dimension of a matrix is a concept used to determine the effective number of dimensions required to represent the important information contained within that matrix.
Let’s suppose we have a matrix, M, with five rows and three columns, like this:

\[
M = \begin{bmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9 \\
10 & 11 & 12 \\
13 & 14 & 15
\end{bmatrix}
\]
Each row of this matrix...
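To make the idea concrete, here is a minimal sketch that takes the rank of M as a stand-in for its intrinsic dimension. That choice is an assumption made for illustration; the cited paper defines intrinsic dimension in terms of the parameter subspace needed for fine-tuning, not matrix rank. The sketch uses NumPy, and the variable names are illustrative only.

```python
import numpy as np

# The 5x3 example matrix M from the text.
M = np.arange(1, 16, dtype=float).reshape(5, 3)

# Rank = number of linearly independent rows (or columns).
# Assumption: rank is used here as a proxy for "intrinsic dimension".
rank = np.linalg.matrix_rank(M)

# The singular values tell the same story: only `rank` of them
# are meaningfully larger than zero.
U, S, Vt = np.linalg.svd(M, full_matrices=False)

# Rebuild M from its top-`rank` singular directions only.
M_low_rank = (U[:, :rank] * S[:rank]) @ Vt[:rank, :]

print("rank:", rank)
print("singular values:", S)
print("max reconstruction error:", np.abs(M - M_low_rank).max())
```

Although M has three columns, each row is the previous row plus (3, 3, 3), so the rows span only two independent directions and the reported rank is 2. The low-rank reconstruction recovers M up to floating-point error, which is the sense in which fewer dimensions can carry all of the matrix's information.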