The LoRA family: QLoRA, DoRA, and LoRA+

LoRA is the base method, but it’s not the only member of the family. Since it was introduced, several improved variants have appeared, each targeting a particular constraint: QLoRA targets memory, DoRA targets quality, and LoRA+ targets training speed. Knowing these variants helps you make the right choice for each job.

QLoRA: when memory is tight

QLoRA is built to save memory. It stores the frozen base model’s weights at four-bit precision, using a format called NF4, and restores them to higher precision only at the moment they are needed for computation; the LoRA adapters themselves are trained at higher precision. The result is dramatic: fine-tuning a seven-billion-parameter model fits in roughly six gigabytes of memory instead of about sixteen. Importantly, quality stays very close to that of ordinary LoRA. If your GPU is small, or you want to tune a larger model on limited hardware, QLoRA is the default choice.
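To see where the savings come from, here is a back-of-the-envelope sketch of weight storage alone. It assumes a 16-bit baseline and counts a gigabyte as 10^9 bytes; the function name is illustrative. It ignores adapters, activations, optimizer state, and quantization constants, which is why the totals quoted above are higher than these figures.

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate storage for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


n_params = 7e9  # a seven-billion-parameter model

fp16_gb = weight_memory_gb(n_params, 16)  # uncompressed 16-bit weights (assumed baseline)
nf4_gb = weight_memory_gb(n_params, 4)    # four-bit weights, as in QLoRA

print(f"16-bit weights: {fp16_gb:.1f} GB")
print(f"4-bit weights:  {nf4_gb:.1f} GB")
print(f"reduction:      {fp16_gb / nf4_gb:.0f}x")
```

The fourfold reduction applies to the weights only; everything else needed for training still has to fit next to them.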

DoRA: when quality comes first

DoRA targets a different problem: closing the quality gap between LoRA and full fine-tuning. Its idea is to decompose each weight matrix into two parts, a magnitude and a direction, and to handle them separately: the direction is updated with LoRA, while the magnitude is learned independently as its own small set of parameters. This separation raises quality by a few percent, especially on complex tasks such as reasoning and code, while adding under five percent to the number of trainable parameters. If quality is critical for you and ordinary LoRA leaves a gap, DoRA is worth trying.
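The magnitude and direction split can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not a reference implementation: the helper names are hypothetical, the norm is taken per column of the weight matrix, and the magnitude vector m is initialized to the column norms of the pretrained weight so that the starting point reproduces the original weights.

```python
import math


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def column_norms(W):
    """Euclidean norm of each column of W."""
    return [math.sqrt(sum(v * v for v in col)) for col in zip(*W)]


def dora_weight(W0, B, A, m):
    """Effective weight: magnitude m times the unit-norm direction of (W0 + B @ A).

    Assumes no column of (W0 + B @ A) is all zeros.
    """
    delta = matmul(B, A)  # the usual LoRA low-rank update
    V = [[w + d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W0, delta)]
    norms = column_norms(V)
    return [[m[j] * v / norms[j] for j, v in enumerate(row)] for row in V]


W0 = [[3.0, 0.0, 1.0],
      [4.0, 2.0, 1.0]]        # frozen pretrained weight (2 x 3)
A = [[0.1, 0.2, 0.3]]         # LoRA A, rank 1 (1 x 3)
B = [[0.0], [0.0]]            # LoRA B starts at zero (2 x 1)
m = column_norms(W0)          # trainable magnitude, one value per column

W_start = dora_weight(W0, B, A, m)                # equals W0 up to rounding
W_moved = dora_weight(W0, [[0.5], [-0.5]], A, m)  # direction changed, magnitude still m
```

Because the direction is renormalized, changing A and B only rotates each column; its length changes only when m itself is updated.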

LoRA+: when speed matters

LoRA+ is a subtle but effective improvement. Recall that in LoRA, matrix B is initialized to zero while A starts from small random values, so B has further to travel to reach its desired value. LoRA+ compensates for this by giving B a larger learning rate than A. The result is faster convergence: the target quality is reached roughly one and a half to two times faster. If your training budget is limited or you need fast iteration, LoRA+ saves time with a small change to the optimizer configuration.
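In practice, the change amounts to putting the A and B matrices into separate optimizer groups. The sketch below is a minimal illustration with assumptions: the function name is hypothetical, parameters are identified by the substrings "lora_A" and "lora_B" in their names (the naming used by Hugging Face PEFT), and the ratio of 16 is only an illustrative default, not a value taken from the text above. The returned list has the per-group dictionary shape that PyTorch optimizers accept.

```python
def loraplus_param_groups(named_params, base_lr, lr_ratio=16.0):
    """Split LoRA parameters into two groups so that B gets a larger learning rate.

    named_params: iterable of (name, parameter) pairs.
    lr_ratio: how many times larger B's learning rate is (illustrative default).
    Parameters whose names contain neither marker are ignored in this sketch.
    """
    group_a, group_b = [], []
    for name, param in named_params:
        if "lora_B" in name:
            group_b.append(param)
        elif "lora_A" in name:
            group_a.append(param)
    return [
        {"params": group_a, "lr": base_lr},
        {"params": group_b, "lr": base_lr * lr_ratio},
    ]


# Stand-in parameters: strings take the place of real tensors here.
named_params = [
    ("layers.0.q_proj.lora_A.weight", "A0"),
    ("layers.0.q_proj.lora_B.weight", "B0"),
    ("layers.0.v_proj.lora_A.weight", "A1"),
    ("layers.0.v_proj.lora_B.weight", "B1"),
]
groups = loraplus_param_groups(named_params, base_lr=1e-4)
for group in groups:
    print(group["params"], group["lr"])
```

With real tensors, the same list could be passed to an optimizer constructor in place of a flat parameter list.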

A simple decision tree

The choice among these can be reduced to a few simple questions:

Is the main constraint memory? → QLoRA.
Is quality critical and LoRA shows a gap? → DoRA.
Is training speed the most important thing? → LoRA+.
None in particular? → standard LoRA, which still works well for most jobs.
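The same questions can be written as a small function. This is only a sketch: the function and argument names are made up for illustration, and checking the questions in the order listed above is an assumption about priority when more than one constraint applies.

```python
def choose_variant(memory_limited: bool, quality_gap: bool, speed_critical: bool) -> str:
    """Return the variant suggested by the questions above, asked in order."""
    if memory_limited:
        return "QLoRA"
    if quality_gap:
        return "DoRA"
    if speed_critical:
        return "LoRA+"
    return "LoRA"  # standard LoRA remains the default


print(choose_variant(memory_limited=False, quality_gap=True, speed_critical=False))
```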

Putting it together

These variants aren’t rivals; they’re tools for different constraints. Nor are they mutually exclusive: their ideas can be combined, for instance by pairing a four-bit base model in the QLoRA style with the separate learning rates of LoRA+. But before complicating things, start with the simple version: try standard LoRA, and only when you hit the limit of a specific constraint, reach for the variant that targets exactly that constraint.
