This might need some revision. The FFN is fully connected and holds the largest share of a transformer layer's parameters. Although MHA has O(n^2) inference cost in sequence length, its parameter count may be lower than the FFN's. Ideally, LoRA could also benefit the FFN. Reference: https://orenleung.com/transformer-parameter-counting
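As a rough sketch of the claim above (ignoring biases, embeddings, and layer norms, and assuming the conventional 4x FFN expansion, both of which are assumptions on my part), the per-layer counts work out like this:

```python
# Rough per-layer parameter counts for a standard transformer layer.
# Illustrative sketch only: bias terms omitted, FFN hidden size assumed 4*d_model.

def mha_params(d_model: int) -> int:
    # Q, K, V, and output projections: 4 weight matrices of shape (d_model, d_model)
    return 4 * d_model * d_model

def ffn_params(d_model: int, expansion: int = 4) -> int:
    # Two fully connected layers: d_model -> expansion*d_model -> d_model
    return 2 * d_model * (expansion * d_model)

d = 768  # e.g. GPT-2 small hidden size
print(mha_params(d))  # 2359296
print(ffn_params(d))  # 4718592 -- twice the attention parameters
```

So under these assumptions the FFN carries 2x the weights of attention per layer, even though attention dominates inference compute at long sequence lengths.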
P.S.: I haven't read the full LoRA paper yet :) so my assumption might be incorrect.