Updates

3/10/2026

I've uploaded new quants using the new fused Up + Gate conversion, which in my testing gives up to a 10% boost in prompt processing speed.
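To illustrate what the fusion does, here's a minimal numpy sketch: the gate and up projections of the SwiGLU FFN are stored as one concatenated weight, so both projections come out of a single larger matmul instead of two. The names below are illustrative, not the actual GGUF tensor names or llama.cpp internals.

```python
import numpy as np

def ffn_separate(x, w_gate, w_up, w_down):
    """SwiGLU FFN with separate gate/up projections (two matmuls)."""
    gate = x @ w_gate                        # [n_tok, n_ff]
    up = x @ w_up                            # [n_tok, n_ff]
    silu = gate / (1.0 + np.exp(-gate))      # SiLU(gate) = gate * sigmoid(gate)
    return (silu * up) @ w_down              # [n_tok, n_embd]

def ffn_fused(x, w_gate_up, w_down, n_ff):
    """Same math, but gate and up are concatenated column-wise,
    so both projections fall out of one larger matmul."""
    gate_up = x @ w_gate_up                  # [n_tok, 2 * n_ff], one GEMM
    gate, up = gate_up[:, :n_ff], gate_up[:, n_ff:]
    silu = gate / (1.0 + np.exp(-gate))
    return (silu * up) @ w_down

rng = np.random.default_rng(0)
n_embd, n_ff, n_tok = 64, 256, 8
x = rng.standard_normal((n_tok, n_embd))
w_gate = rng.standard_normal((n_embd, n_ff))
w_up = rng.standard_normal((n_embd, n_ff))
w_down = rng.standard_normal((n_ff, n_embd))

# Fusing is just column-wise concatenation of the two weights.
fused = np.concatenate([w_gate, w_up], axis=1)
assert np.allclose(ffn_separate(x, w_gate, w_up, w_down),
                   ffn_fused(x, fused, w_down, n_ff))
```

One bigger GEMM tends to utilize the hardware better than two smaller ones when the batch dimension is large, which is presumably why the gain shows up in prompt processing (prefill) rather than token generation.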

Description

This repo contains specialized MoE quants of Qwen3.5-397B-A17B. The idea is that, because the FFN tensors dwarf the rest of the tensors in the model, it should be possible to achieve better quality at a smaller overall model size than a comparable naive quantization. To that end, the default quantization type is kept at high quality, while the FFN up and FFN gate tensors, along with the FFN down tensors, are quantized more aggressively. A back-of-envelope sketch of why this trade works is below.
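The parameter fraction and per-type BPW figures in this sketch are rough assumptions (typical llama.cpp block sizes), not measurements of this repo:

```python
def overall_bpw(frac_ffn, bpw_ffn, bpw_rest):
    """Size-weighted average bits per weight across the two tensor groups."""
    return frac_ffn * bpw_ffn + (1.0 - frac_ffn) * bpw_rest

FRAC_FFN = 0.95   # assumption: ~95% of params sit in the expert FFN tensors
BPW_Q8_0 = 8.5    # llama.cpp Q8_0 block size
BPW_Q4_K = 4.5    # llama.cpp Q4_K block size

naive = overall_bpw(FRAC_FFN, BPW_Q4_K, BPW_Q4_K)   # Q4_K everywhere
mixed = overall_bpw(FRAC_FFN, BPW_Q4_K, BPW_Q8_0)   # Q8_0 default, Q4_K FFN

print(f"Q4_K everywhere:         {naive:.2f} BPW")   # 4.50
print(f"Q8_0 default + Q4_K FFN: {mixed:.2f} BPW")   # 4.70
```

Because the non-FFN tensors are such a small slice of the total, keeping them at Q8_0 adds only ~0.2 BPW in this toy example, which is roughly in line with the Q4_K_M mix in the table below landing at 4.93 BPW.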

| Quant | Size | Mixture (default / ffn_up / ffn_gate / ffn_down) | Mean PPL | PPL(Q)/PPL(base) - 1 | KLD |
|-------|------|--------------------------------------------------|----------|----------------------|-----|
| Q5_K_M | 273.55 GiB (5.93 BPW) | Q8_0 / Q5_K / Q5_K / Q6_K | 3.487363 ± 0.018840 | +0.0612% | 0.004294 ± 0.000037 |
| Q4_K_M | 227.61 GiB (4.93 BPW) | Q8_0 / Q4_K / Q4_K / Q5_K | 3.495358 ± 0.018894 | +0.2905% | 0.008455 ± 0.000072 |
| IQ4_XS | 176.99 GiB (3.84 BPW) | Q8_0 / IQ3_S / IQ3_S / IQ4_XS | 3.542012 ± 0.019134 | +1.6292% | 0.022699 ± 0.000189 |
| IQ3_S | 136.38 GiB (2.96 BPW) | Q6_K / IQ2_S / IQ2_S / IQ3_S | 3.670508 ± 0.020012 | +5.3160% | 0.064515 ± 0.000505 |
| IQ2_XS | 123.22 GiB (2.67 BPW) | Q6_K / IQ2_XS / IQ2_XS / IQ3_XXS | 3.777378 ± 0.020737 | +8.3824% | 0.093718 ± 0.000714 |
| IQ2_XXS | 113.95 GiB (2.47 BPW) | Q4_K / IQ2_XXS / IQ2_XXS / IQ3_XXS | 3.879226 ± 0.021468 | +11.3047% | 0.126000 ± 0.000893 |
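The relative-PPL column is simply the quantized model's perplexity over the unquantized baseline, minus one. A small sanity check using the Q4_K_M row; the implied baseline PPL of ~3.4852 is back-computed from that row, not measured separately here:

```python
ppl_q = 3.495358          # Q4_K_M mean PPL from the table
delta = 0.002905          # +0.2905% from the same row

ppl_base = ppl_q / (1.0 + delta)        # implied unquantized baseline
rel_increase = ppl_q / ppl_base - 1.0   # the table's relative-PPL column

print(f"implied baseline PPL:  {ppl_base:.6f}")       # ~3.485234
print(f"relative PPL increase: {rel_increase:+.4%}")  # +0.2905%
```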

(Graphs: KLD and perplexity across the quants above.)
