LLMs

This page lists all LLMs and their quantizations used in the benchmarks.

Sampling parameters follow each model's official recommendation (T = temperature, K = top-k, P = top-p, M = min-p).

Model				Quantization
Name	Params (B)	Active (B)	Sampling	Type	Format	Size (GB)	Download
qwen3-4b-instruct-2507	4.0	4.0	T=0.7 K=20 P=0.8 M=0	UD-Q4_K_XL	GGUF	2.55	unsloth/Qwen3-4B-Instruct-2507-GGUF `hf download unsloth/Qwen3-4B-Instruct-2507-GGUF --include Qwen3-4B-Instruct-2507-UD-Q4_K_XL.gguf`
				UD-Q8_K_XL	GGUF	5.06	unsloth/Qwen3-4B-Instruct-2507-GGUF `hf download unsloth/Qwen3-4B-Instruct-2507-GGUF --include Qwen3-4B-Instruct-2507-UD-Q8_K_XL.gguf`
				F16	GGUF	8.05	unsloth/Qwen3-4B-Instruct-2507-GGUF `hf download unsloth/Qwen3-4B-Instruct-2507-GGUF --include Qwen3-4B-Instruct-2507-F16.gguf`
				FP8	Safetensors	5.20	Qwen/Qwen3-4B-Instruct-2507-FP8 `hf download Qwen/Qwen3-4B-Instruct-2507-FP8`
				BF16	Safetensors	8.05	Qwen/Qwen3-4B-Instruct-2507 `hf download Qwen/Qwen3-4B-Instruct-2507`
gpt-oss-20b	20	3.6	T=1 K=0 P=1 M=0	MXFP4	GGUF	12.1	ggml-org/gpt-oss-20b-GGUF `hf download ggml-org/gpt-oss-20b-GGUF`
gpt-oss-20b	20	3.6	T=1 K=0 P=1 M=0	MXFP4	Safetensors	13.8	openai/gpt-oss-20b `hf download openai/gpt-oss-20b --exclude "metal/" "original/"`
devstral-small-2-24b-instruct-2512	24	24	T=0.15 K=0 P=1 M=0.01	UD-Q4_K_XL	GGUF	14.5	unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF `hf download unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF --include Devstral-Small-2-24B-Instruct-2512-UD-Q4_K_XL.gguf`
devstral-small-2-24b-instruct-2512	24	24	T=0.15 K=0 P=1 M=0.01	Q8_0	GGUF	25.1	unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF `hf download unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF --include Devstral-Small-2-24B-Instruct-2512-Q8_0.gguf`
qwen3-30b-a3b-instruct-2507	31	3.3	T=0.7 K=20 P=0.8 M=0	Q4_0	GGUF	17.4	unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF `hf download unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF --include Qwen3-30B-A3B-Instruct-2507-Q4_0.gguf`
				UD-Q4_K_XL	GGUF	17.7	unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF `hf download unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF --include Qwen3-30B-A3B-Instruct-2507-UD-Q4_K_XL.gguf`
				UD-Q8_K_XL	GGUF	36.0	unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF `hf download unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF --include Qwen3-30B-A3B-Instruct-2507-UD-Q8_K_XL.gguf`
granite-4.0-h-small	32	9.0	T=0 K=0 P=1 M=0	UD-Q4_K_XL	GGUF	18.8	unsloth/granite-4.0-h-small-GGUF `hf download unsloth/granite-4.0-h-small-GGUF --include granite-4.0-h-small-UD-Q4_K_XL.gguf`
				UD-Q8_K_XL	GGUF	38.1	unsloth/granite-4.0-h-small-GGUF `hf download unsloth/granite-4.0-h-small-GGUF --include granite-4.0-h-small-UD-Q8_K_XL.gguf`
				BF16	GGUF	64.4	unsloth/granite-4.0-h-small-GGUF `hf download unsloth/granite-4.0-h-small-GGUF --include "BF16/*"`
gpt-oss-120b	117	5.1	T=1 K=0 P=1 M=0	MXFP4	GGUF	63.4	ggml-org/gpt-oss-120b-GGUF `hf download ggml-org/gpt-oss-120b-GGUF`
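
The compact sampling notation in the table (for example `T=0.7 K=20 P=0.8 M=0`) can be expanded into command-line flags when launching a model. The following is a minimal sketch, not part of the benchmark setup itself: the helper name `sampling_to_flags` is hypothetical, and the flag names `--temp`, `--top-k`, `--top-p` and `--min-p` assume a llama.cpp-style runtime such as `llama-cli` or `llama-server`. Other engines may use different option names.

```shell
#!/usr/bin/env bash
# Hypothetical helper: expand the table's sampling notation into CLI flags.
# Assumption: the target runtime accepts llama.cpp-style flag names.
sampling_to_flags() {
  local out="" pair key val flag
  for pair in $1; do              # split "T=0.7 K=20 ..." on whitespace
    key=${pair%%=*}               # text before the first '='
    val=${pair#*=}                # text after the first '='
    case "$key" in
      T) flag="--temp" ;;
      K) flag="--top-k" ;;
      P) flag="--top-p" ;;
      M) flag="--min-p" ;;
      *) echo "unknown sampling key: $key" >&2; return 1 ;;
    esac
    out="$out${out:+ }$flag $val" # append, space-separated
  done
  printf '%s\n' "$out"
}

# Example: the qwen3-4b-instruct-2507 row of the table.
sampling_to_flags "T=0.7 K=20 P=0.8 M=0"
```

The printed flags can then be appended to the launch command of the chosen runtime. An unrecognized key makes the function return a non-zero status, so a mistyped entry is not silently dropped.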