GGUF quants?

#17
by fsaudm - opened

@unsloth @shimmyshimmer is DSA what's slowing down the release? I'd love to help!

https://github.com/ggml-org/llama.cpp/issues/16331

Until support for DeepSeek 3.2 is implemented in llama.cpp, there will be no GGUFs.

@fsaudm @TPH441

There is now a working GGUF for the Thinking version (not the Speciale extra-thinking one so far) here: https://huggingface.co/sszymczyk/DeepSeek-V3.2-nolight-GGUF

It does not use the new sparse attention, but otherwise runs the same as the earlier version.
