Accurate GGUF Memory Calculator

Estimate memory usage for GGUF models based on GPU layers, context length, and cache type.

The formula was discovered through symbolic regression using TuringBot, evaluating over a billion candidate formulas against 19,517 real measurements. For details, see this blog post.

GPU layers: 0 to 256
Context length: 512 to 262144
Cache Type

Cache quantization.
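To illustrate why cache type and context length matter, here is a minimal back-of-the-envelope sketch of the KV cache size alone. It is not the regressed formula this calculator uses, and it ignores model weights, compute buffers, and GPU layer offloading. The function name `kv_cache_mib` and the model dimensions in the usage lines are hypothetical. The bytes-per-element figures assume the standard ggml block layouts (q8_0 stores 32 values in 34 bytes, q4_0 stores 32 values in 18 bytes).

```python
# Approximate bytes per cached element for common cache types.
# Assumption: standard ggml block layouts (32 values per block).
BYTES_PER_ELEMENT = {
    "fp16": 2.0,
    "q8_0": 34 / 32,  # 32 int8 values + one fp16 scale
    "q4_0": 18 / 32,  # 32 4-bit values + one fp16 scale
}


def kv_cache_mib(n_layers, n_kv_heads, head_dim, ctx_len, cache_type="fp16"):
    """Rough KV cache size in MiB (keys + values, all layers)."""
    # Factor of 2 accounts for storing both keys and values.
    elements = 2 * n_layers * n_kv_heads * head_dim * ctx_len
    return elements * BYTES_PER_ELEMENT[cache_type] / (1024 ** 2)


# Hypothetical model: 32 layers, 8 KV heads, head dimension 128, 8192-token context.
for cache_type in ("fp16", "q8_0", "q4_0"):
    print(cache_type, kv_cache_mib(32, 8, 128, 8192, cache_type), "MiB")
```

For this hypothetical model, the cache shrinks from 1024 MiB at fp16 to 544 MiB at q8_0 and 288 MiB at q4_0, and it scales linearly with context length. Real usage will differ, which is why the calculator relies on a formula fitted to measurements rather than a closed-form estimate like this one.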

Estimated memory usage: