LLMs

This page lists all LLMs and their quantizations used in the benchmarks.

Sampling settings are based on each model's official recommendation. T = temperature, K = top-k, P = top-p, M = min-p. The first four columns describe the model; the remaining columns describe each quantization.

| Name | Params (B) | Active (B) | Sampling | Type | Format | Size (GB) | Repository | Download command |
|---|---|---|---|---|---|---|---|---|
| qwen3-4b-instruct-2507 | 4.0 | 4.0 | T=0.7 K=20 P=0.8 M=0 | UD-Q4-K-XL | GGUF | 2.55 | unsloth/Qwen3-4B-Instruct-2507-GGUF | `hf download unsloth/Qwen3-4B-Instruct-2507-GGUF --include Qwen3-4B-Instruct-2507-UD-Q4_K_XL.gguf` |
| | | | | UD-Q8-K-XL | GGUF | 5.06 | unsloth/Qwen3-4B-Instruct-2507-GGUF | `hf download unsloth/Qwen3-4B-Instruct-2507-GGUF --include Qwen3-4B-Instruct-2507-UD-Q8_K_XL.gguf` |
| | | | | F16 | GGUF | 8.05 | unsloth/Qwen3-4B-Instruct-2507-GGUF | `hf download unsloth/Qwen3-4B-Instruct-2507-GGUF --include Qwen3-4B-Instruct-2507-F16.gguf` |
| | | | | FP8 | Safetensors | 5.20 | Qwen/Qwen3-4B-Instruct-2507-FP8 | `hf download Qwen/Qwen3-4B-Instruct-2507-FP8` |
| | | | | BF16 | Safetensors | 8.05 | Qwen/Qwen3-4B-Instruct-2507 | `hf download Qwen/Qwen3-4B-Instruct-2507` |
| gpt-oss-20b | 20 | 3.6 | T=1 K=0 P=1 M=0 | MXFP4 | GGUF | 12.1 | ggml-org/gpt-oss-20b-GGUF | `hf download ggml-org/gpt-oss-20b-GGUF` |
| | | | | MXFP4 | Safetensors | 13.8 | openai/gpt-oss-20b | `hf download openai/gpt-oss-20b --exclude "metal/*" "original/*"` |
| devstral-small-2-24b-instruct-2512 | 24 | 24 | T=0.15 K=0 P=1 M=0.01 | UD-Q4-K-XL | GGUF | 14.5 | unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF | `hf download unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF --include Devstral-Small-2-24B-Instruct-2512-UD-Q4_K_XL.gguf` |
| | | | | Q8_0 | GGUF | 25.1 | unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF | `hf download unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF --include Devstral-Small-2-24B-Instruct-2512-Q8_0.gguf` |
| qwen3-30b-a3b-instruct-2507 | 31 | 3.3 | T=0.7 K=20 P=0.8 M=0 | Q4-0 | GGUF | 17.4 | unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF | `hf download unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF --include Qwen3-30B-A3B-Instruct-2507-Q4_0.gguf` |
| | | | | UD-Q4-K-XL | GGUF | 17.7 | unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF | `hf download unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF --include Qwen3-30B-A3B-Instruct-2507-UD-Q4_K_XL.gguf` |
| | | | | UD-Q8-K-XL | GGUF | 36.0 | unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF | `hf download unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF --include Qwen3-30B-A3B-Instruct-2507-UD-Q8_K_XL.gguf` |
| granite-4.0-h-small | 32 | 9.0 | T=0 K=0 P=1 M=0 | UD-Q4-K-XL | GGUF | 18.8 | unsloth/granite-4.0-h-small-GGUF | `hf download unsloth/granite-4.0-h-small-GGUF --include granite-4.0-h-small-UD-Q4_K_XL.gguf` |
| | | | | UD-Q8-K-XL | GGUF | 38.1 | unsloth/granite-4.0-h-small-GGUF | `hf download unsloth/granite-4.0-h-small-GGUF --include granite-4.0-h-small-UD-Q8_K_XL.gguf` |
| | | | | BF16 | GGUF | 64.4 | unsloth/granite-4.0-h-small-GGUF | `hf download unsloth/granite-4.0-h-small-GGUF --include "BF16/*"` |
| gpt-oss-120b | 117 | 5.1 | T=1 K=0 P=1 M=0 | MXFP4 | GGUF | 63.4 | ggml-org/gpt-oss-120b-GGUF | `hf download ggml-org/gpt-oss-120b-GGUF` |
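
The sampling shorthand used on this page (T, K, P, M) may need expanding into named settings before it can be passed to an inference server. The following is a minimal sketch of one way to do that. The helper name `parse_sampling` and the key names `temperature`, `top_k`, `top_p`, and `min_p` are assumptions made for illustration and are not taken from the benchmark code, so adjust them to whatever your runtime expects.

```python
# Hypothetical helper: expands the page's sampling shorthand into named settings.
# The key names are an assumption; rename them to match your inference server.
SHORTHAND_KEYS = {
    "T": "temperature",
    "K": "top_k",
    "P": "top_p",
    "M": "min_p",
}


def parse_sampling(shorthand: str) -> dict:
    """Turn a string like 'T=0.7 K=20 P=0.8 M=0' into a dict of named values."""
    settings = {}
    for token in shorthand.split():
        letter, _, raw = token.partition("=")
        if letter not in SHORTHAND_KEYS or raw == "":
            raise ValueError(f"unrecognized sampling token: {token!r}")
        # Top-k is a count, so keep it an integer; the other values are floats.
        value = int(raw) if letter == "K" else float(raw)
        settings[SHORTHAND_KEYS[letter]] = value
    return settings


if __name__ == "__main__":
    # Sampling string copied from the qwen3-4b-instruct-2507 entry.
    print(parse_sampling("T=0.7 K=20 P=0.8 M=0"))
```

Whether a value of 0 for K or M means "disabled" depends on the runtime, so the sketch passes the numbers through unchanged rather than interpreting them.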