Accurate GGUF VRAM Calculator

Calculate VRAM for GGUF models from GPU layers and context length using an accurate formula.

For an explanation of how this works, see this blog post: https://oobabooga.github.io/blog/posts/gguf-vram-formula/

GPU layers: 0 to 256
Context length: 512 to 131072
Cache Type

Cache quantization.
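
To give a rough sense of why the cache type matters, here is a minimal first-principles sketch of KV cache size. It is not the formula used by this calculator (see the blog post for that); it only counts the key and value tensors. The function name, the model dimensions in the usage example, and the set of cache type names (fp16, q8_0, q4_0) are assumptions made for illustration.

```python
# Approximate bytes per cached element for each cache type (illustrative).
# q8_0 packs 32 int8 values plus one fp16 scale: 34 bytes per 32 elements.
# q4_0 packs 32 4-bit values plus one fp16 scale: 18 bytes per 32 elements.
BYTES_PER_ELEMENT = {
    "fp16": 2.0,
    "q8_0": 34 / 32,
    "q4_0": 18 / 32,
}


def kv_cache_mib(n_layers, n_ctx, n_kv_heads, head_dim, cache_type="fp16"):
    """Naive KV cache size in MiB (hypothetical helper, not the calculator's formula)."""
    # One key tensor and one value tensor per layer, hence the factor of 2.
    elements = 2 * n_layers * n_ctx * n_kv_heads * head_dim
    return elements * BYTES_PER_ELEMENT[cache_type] / (1024 ** 2)


# Hypothetical model: 32 layers, 8 KV heads, head dimension 128, 8192-token context.
for cache_type in ("fp16", "q8_0", "q4_0"):
    print(cache_type, kv_cache_mib(32, 8192, 8, 128, cache_type), "MiB")
```

Under these assumptions, moving from fp16 to q8_0 roughly halves the cache, and q4_0 roughly halves it again. Actual VRAM use also includes the model weights for the offloaded layers and compute buffers, which this sketch ignores.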

Estimated VRAM to load the model: